MariaDB Server / MDEV-8069

DROP or rebuild of a large table may lock up InnoDB

Details

    • 10.3.1-1

    Description

      DROP DATABASE IF EXISTS executed on a database with more than several thousand tables and several TB of data in it is unreasonably slow.
      A huge amount of writing to disk has been observed too.

      Crash description:
      executed:

      SET unique_checks = 0; SET foreign_key_checks = 0; SET GLOBAL innodb_stats_on_metadata = 0; DROP DATABASE IF EXISTS HUGE_TABLE;

      InnoDB: Error: semaphore wait has lasted > 600 seconds
      InnoDB: We intentionally crash the server, because it appears to be hung.
      2015-04-27 12:21:26 7f07de3ff700  InnoDB: Assertion failure in thread 139671770232576 in file srv0srv.cc line 2196
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mysqld startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
      InnoDB: about forcing recovery.
      150427 12:21:26 [ERROR] mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see http://kb.askmonty.org/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, 
      something is definitely wrong and this may fail.
       
      Server version: 10.0.16-MariaDB-log
      key_buffer_size=53687091200
      read_buffer_size=131072
      max_used_connections=42
      max_threads=402
      thread_count=21
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 52900205 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x48000
      /usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xb73d3b]
      /usr/sbin/mysqld(handle_fatal_signal+0x398)[0x726518]
      /lib64/libpthread.so.0(+0xf710)[0x7f5096c43710]
      /lib64/libc.so.6(gsignal+0x35)[0x7f509529f625]
      /lib64/libc.so.6(abort+0x175)[0x7f50952a0e05]
      /usr/sbin/mysqld[0x936f9c]
      /lib64/libpthread.so.0(+0x79d1)[0x7f5096c3b9d1]
      /lib64/libc.so.6(clone+0x6d)[0x7f50953558fd]
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      150427 12:48:54 mysqld_safe Number of processes running now: 0
      150427 12:48:54 mysqld_safe mysqld restarted
      150427 12:49:16 [Note] InnoDB: Using mutexes to ref count buffer pool pages
      150427 12:49:16 [Note] InnoDB: The InnoDB memory heap is disabled
      150427 12:49:16 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      150427 12:49:16 [Note] InnoDB: Memory barrier is not used
      150427 12:49:16 [Note] InnoDB: Compressed tables use zlib 1.2.3
      150427 12:49:16 [Note] InnoDB: Using Linux native AIO
      150427 12:49:16 [Note] InnoDB: Using CPU crc32 instructions
      150427 12:49:16 [Note] InnoDB: Initializing buffer pool, size = 225.0G
      150427 12:49:27 [Note] InnoDB: Completed initialization of buffer pool
      150427 12:49:29 [Note] InnoDB: Highest supported file format is Barracuda.
      150427 12:49:29 [Note] InnoDB: Log scan progressed past the checkpoint lsn 277389716102041
      150427 12:49:29 [Note] InnoDB: Database was not shutdown normally!
      150427 12:49:29 [Note] InnoDB: Starting crash recovery.
      150427 12:49:29 [Note] InnoDB: Reading tablespace information from the .ibd files...
       

      Please do not ignore the feature request for optimizing the DROP DATABASE operation, even though this ticket is marked as a bug.

      Activity

            MySQL Bug #91977 could report the same problem.

            marko Marko Mäkelä added a comment

            There are two issues affecting DROP or rebuild operations of large partitions, tables or databases. (Tables or partitions are internally dropped as part of operations that rebuild the table or partition: TRUNCATE and some forms of OPTIMIZE or ALTER TABLE, sometimes even CREATE INDEX or DROP INDEX.)

            The problem reported in this ticket is that the InnoDB data dictionary cache (dict_sys) is being locked while the data file is being deleted, and deleting a large data file may take a lot of time, especially for a fragmented data file.

            A related problem affects not only dropping or rebuilding entire tables or partitions, but also DROP INDEX operations that are executed in-place:
            MDEV-22456 Dropping the adaptive hash index may cause DDL to lock up InnoDB

            marko Marko Mäkelä added a comment

            MDEV-22456 recently removed one bottleneck. Two remain:

            InnoDB is invoking unlink() to delete the data file while holding some mutexes. This must be fixed as part of this ticket.

            On some file systems (most notably, on Linux), unlink() of a large file can block any concurrent usage of the entire file system. A workaround for this may be implemented in MDEV-18613.

            marko Marko Mäkelä added a comment

            I think that we should try to rely on delete-on-close semantics. That is, copy the handle to the to-be-deleted file, hold it open across the deletion, and close the handle after releasing the dict_sys latches. As far as I understand, file system recovery should guarantee that the file be deleted.

            marko Marko Mäkelä added a comment

            Results of RQG testing
            -----------------------------------
            origin/bb-10.5-kevgs 3b08527d8c0907b06dad4179b009ca76efdd4aad 2020-06-09T04:34:10+03:00 containing MDEV-8069, built with ASAN
            versus
            current 10.5, built with ASAN
            1. Test battery for broad-range coverage:
                  The tree containing MDEV-8069 performed neither better nor worse than current 10.5.
            2. A new test in which concurrent connections fiddle (CREATE OR REPLACE/ALTER/DROP)
                 with one 500000-row table and several 1000-row tables per connection.
                 innodb_fatal_semaphore_wait_threshold values tried were 2 and 200.
                 100 RQG test runs per tree:
                  current 10.5: long semaphore wait hit twice (threshold was 2s)
                 MDEV-8069: long semaphore wait hit four times (threshold was 2s)
                 IMHO this number of test runs does not support the conclusion that MDEV-8069 is clearly better.

            mleich Matthias Leich added a comment

            People

              kevg Eugene Kosov (Inactive)
              ivan.stoykov@skysql.com Stoykov (Inactive)
              Votes: 5
              Watchers: 18

