Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13779

InnoDB fails to shut down purge workers, causing hang

Details

    Description

      innodb_fts.innodb_fts_plugin 'innodb' w2 [ fail ] Found warnings/errors in server log file!
      http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest/builds/10121/steps/test_2/logs/mysqld.1.err.2

      CURRENT_TEST: innodb_fts.innodb_fts_plugin
      2017-09-08 14:13:13 0 [Note] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld (root[root] @ localhost []): Normal shutdown
       
      2017-09-08 14:13:13 0 [Note] InnoDB: FTS optimize thread exiting.
      2017-09-08 14:13:13 0 [Note] Event Scheduler: Purging the queue. 0 events
      2017-09-08 14:13:13 0 [Note] InnoDB: Starting shutdown...
      2017-09-08 14:13:13 0 [Note] InnoDB: Dumping buffer pool(s) to /dev/shm/var_auto_Jb6A/2/mysqld.1/data/ib_buffer_pool
      2017-09-08 14:13:13 0 [Note] InnoDB: Instance 0, restricted to 127 pages due to innodb_buf_pool_dump_pct=25
      2017-09-08 14:13:13 0 [Note] InnoDB: Buffer pool(s) dump completed at 170908 14:13:13
      2017-09-08 14:13:14 0 [Note] InnoDB: Shutdown completed; log sequence number 167888018
      2017-09-08 14:13:14 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
      2017-09-08 14:13:14 0 [Note] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld: Shutdown complete
       
      "2017-09-08 14:13:16 0 [Note] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld (mysqld 10.3.2-MariaDB-log) starting as process 5693 ...
      2017-09-08 14:13:16 0 [Warning] Could not increase number of max_open_files to more than 1024 (request: 4162)
      2017-09-08 14:13:16 0 [Note] Plugin 'partition' is disabled.
      2017-09-08 14:13:16 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      2017-09-08 14:13:16 0 [Note] InnoDB: Uses event mutexes
      2017-09-08 14:13:16 0 [Note] InnoDB: Compressed tables use zlib 1.2.3
      2017-09-08 14:13:16 0 [Note] InnoDB: Using Linux native AIO
      2017-09-08 14:13:16 0 [Note] InnoDB: Number of pools: 1
      2017-09-08 14:13:16 0 [Note] InnoDB: Using generic crc32 instructions
      2017-09-08 14:13:16 0 [Note] InnoDB: Initializing buffer pool, total size = 8M, instances = 1, chunk size = 8M
      2017-09-08 14:13:16 0 [Note] InnoDB: Completed initialization of buffer pool
      2017-09-08 14:13:16 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
      2017-09-08 14:13:16 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
      2017-09-08 14:13:16 0 [Note] InnoDB: Creating shared tablespace for temporary tables
      2017-09-08 14:13:16 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
      2017-09-08 14:13:16 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
      2017-09-08 14:13:16 0 [Note] InnoDB: Waiting for purge to start
      2017-09-08 14:13:16 0 [Note] InnoDB: 5.7.19 started; log sequence number 167888018
      2017-09-08 14:13:16 0 [Note] Plugin 'SEQUENCE' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_SYS_DATAFILES' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_SYS_TABLESTATS' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_TABLESPACES_SCRUBBING' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_CMP' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_FT_DELETED' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_CMP_RESET' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_LOCK_WAITS' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_TABLESPACES_ENCRYPTION' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_CMPMEM_RESET' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'FEEDBACK' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_FT_INDEX_TABLE' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_CMP_PER_INDEX_RESET' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'user_variables' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_FT_INDEX_CACHE' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_FT_BEING_DELETED' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_FT_DEFAULT_STOPWORD' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_FT_CONFIG' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_SYS_TABLESPACES' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_MUTEXES' is disabled.
      2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_SYS_SEMAPHORE_WAITS' is disabled.
      2017-09-08 14:13:16 0 [Warning] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld: unknown variable 'loose-feedback-debug-startup-interval=20'
      2017-09-08 14:13:16 0 [Warning] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld: unknown variable 'loose-feedback-debug-first-interval=60'
      2017-09-08 14:13:16 0 [Warning] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld: unknown variable 'loose-feedback-debug-interval=60'
      2017-09-08 14:13:16 0 [Warning] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld: unknown option '--loose-pam-debug'
      2017-09-08 14:13:16 0 [Warning] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld: unknown variable 'loose-debug-sync-timeout=300'
      2017-09-08 14:13:16 0 [Note] InnoDB: Loading buffer pool(s) from /dev/shm/var_auto_Jb6A/2/mysqld.1/data/ib_buffer_pool
      2017-09-08 14:13:16 0 [Note] InnoDB: Buffer pool(s) load completed at 170908 14:13:16
      2017-09-08 14:13:16 0 [Note] Server socket created on IP: '127.0.0.1'.
      2017-09-08 14:13:16 0 [Note] Reading of all Master_info entries succeded
      2017-09-08 14:13:16 0 [Note] Added new Master_info '' to hash table
      2017-09-08 14:13:16 0 [Note] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld: ready for connections.
      Version: '10.3.2-MariaDB-log'  socket: '/mnt/buildbot/build/mariadb-10.3.2/mysql-test/var/tmp/2/mysqld.1.sock'  port: 16020  MariaDB Server
      2017-09-08 14:13:16 0 [Note] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld (root[root] @ localhost []): Normal shutdown
       
      2017-09-08 14:13:16 0 [Note] Event Scheduler: Purging the queue. 0 events
      2017-09-08 14:13:16 0 [Note] InnoDB: FTS optimize thread exiting.
      170908 14:14:16 [ERROR] mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, 
      something is definitely wrong and this may fail.
       
      Server version: 10.3.2-MariaDB-log
      key_buffer_size=1048576
      read_buffer_size=131072
      max_used_connections=1
      max_threads=153
      thread_count=3
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63038 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x49000
      /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld(my_print_stacktrace+0x2e)[0x55d16586fcee]
      /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld(handle_fatal_signal+0x307)[0x55d165308f67]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f123a020390]
      /lib/x86_64-linux-gnu/libpthread.so.0(pthread_cond_wait+0xc0)[0x7f123a01c360]
      /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld(_Z11mysqld_mainiPPc+0x1232)[0x55d1650b0372]
      /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f12395db830]
      /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld(_start+0x29)[0x55d1650a2b89]
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      

      Attachments

        Issue Links

          Activity

            alice Alice Sherepa added a comment -

            the same with innodb.innodb_zip_innochecksum test http://buildbot.askmonty.org/buildbot/builders/kvm-deb-zesty-x86/builds/949/steps/mtr/logs/stdio
            http://buildbot.askmonty.org/buildbot/builders/kvm-deb-zesty-x86/builds/949/steps/mtr/logs/mysqld.1.err.1

            innodb.innodb_zip_innochecksum '8k,innodb' w1 [ fail ]  Found warnings/errors in server log file!
                    Test ended at 2017-11-18 22:39:26
            line
            Attempting backtrace. You can use the following information to find out
            ^ Found warnings in /dev/shm/var/1/log/mysqld.1.err
            

            2017-11-18 22:38:20 0 [Note] /usr/sbin/mysqld (root[root] @ localhost []): Normal shutdown
             
            2017-11-18 22:38:20 0 [Note] Event Scheduler: Purging the queue. 0 events
            2017-11-18 22:38:20 0 [Note] InnoDB: FTS optimize thread exiting.
            171118 22:39:20 [ERROR] mysqld got signal 6 ;
             
            Server version: 10.3.3-MariaDB-10.3.3+maria~zesty-log
            key_buffer_size=1048576
            read_buffer_size=131072
            max_used_connections=1
            max_threads=153
            thread_count=1
            It is possible that mysqld could use up to 
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 61868 K  bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.
             
            Thread pointer: 0x0
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x0 thread_stack 0x49000
            /usr/sbin/mysqld(my_print_stacktrace+0x28)[0x80b9c8c8]
            /usr/sbin/mysqld(handle_fatal_signal+0x3c9)[0x805f23a9]
            [0xb7704d04]
            [0xb7704cf9]
            /lib/i386-linux-gnu/libpthread.so.0(pthread_cond_wait+0xdc)[0xb7381c5c]
            /usr/sbin/mysqld(_Z11mysqld_mainiPPc+0x139d)[0x8036e50d]
            /usr/sbin/mysqld(main+0x27)[0x8034c7d7]
            /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf6)[0xb6ffd276]
            /usr/sbin/mysqld(+0x26906c)[0x8036006c]
            The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            information that should help you find out what is causing the crash.
            

            alice Alice Sherepa added a comment - the same with innodb.innodb_zip_innochecksum test http://buildbot.askmonty.org/buildbot/builders/kvm-deb-zesty-x86/builds/949/steps/mtr/logs/stdio http://buildbot.askmonty.org/buildbot/builders/kvm-deb-zesty-x86/builds/949/steps/mtr/logs/mysqld.1.err.1 innodb.innodb_zip_innochecksum '8k,innodb' w1 [ fail ] Found warnings/errors in server log file! Test ended at 2017-11-18 22:39:26 line Attempting backtrace. You can use the following information to find out ^ Found warnings in /dev/shm/var/1/log/mysqld.1.err 2017-11-18 22:38:20 0 [Note] /usr/sbin/mysqld (root[root] @ localhost []): Normal shutdown   2017-11-18 22:38:20 0 [Note] Event Scheduler: Purging the queue. 0 events 2017-11-18 22:38:20 0 [Note] InnoDB: FTS optimize thread exiting. 171118 22:39:20 [ERROR] mysqld got signal 6 ;   Server version: 10.3.3-MariaDB-10.3.3+maria~zesty-log key_buffer_size=1048576 read_buffer_size=131072 max_used_connections=1 max_threads=153 thread_count=1 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 61868 K bytes of memory Hope that's ok; if not, decrease some variables in the equation.   Thread pointer: 0x0 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x0 thread_stack 0x49000 /usr/sbin/mysqld(my_print_stacktrace+0x28)[0x80b9c8c8] /usr/sbin/mysqld(handle_fatal_signal+0x3c9)[0x805f23a9] [0xb7704d04] [0xb7704cf9] /lib/i386-linux-gnu/libpthread.so.0(pthread_cond_wait+0xdc)[0xb7381c5c] /usr/sbin/mysqld(_Z11mysqld_mainiPPc+0x139d)[0x8036e50d] /usr/sbin/mysqld(main+0x27)[0x8034c7d7] /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf6)[0xb6ffd276] /usr/sbin/mysqld(+0x26906c)[0x8036006c] The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains information that should help you find out what is causing the crash.

            Maybe this is related to the THD object destruction problems at shutdown, reported in MDEV-13644?

            marko Marko Mäkelä added a comment - Maybe this is related to the THD object destruction problems at shutdown, reported in MDEV-13644 ?

            Based on the timestamps, the server hangs on shutdown. The deadly signal is caught exactly one minute after the previous message. In 10.3, mysqltest.cc kills the process with SIGABRT, giving it a chance to dump (one) stack trace.

            The shutdown hang could have been a duplicate of MDEV-15554, which I fixed in 10.2.14 and 10.3.6 on 2018-03-13.

            marko Marko Mäkelä added a comment - Based on the timestamps, the server hangs on shutdown. The deadly signal is caught exactly one minute after the previous message. In 10.3, mysqltest.cc kills the process with SIGABRT, giving it a chance to dump (one) stack trace. The shutdown hang could have been a duplicate of MDEV-15554 , which I fixed in 10.2.14 and 10.3.6 on 2018-03-13.

            We still have a shutdown hang, this time in the test encryption.innochecksum:

            bb-10.3-marko b82c811907c802a166e436d88c57e5b82cf8236f (based on 10.3 6c5e60f1b198ff846ee5595927ff5a74c6b08e91)

            encryption.innochecksum '4k,cbc,innodb'  w2 [ fail ]  Found warnings/errors in server log file!
                    Test ended at 2018-04-27 23:33:14
            line
            Attempting backtrace. You can use the following information to find out
            ^ Found warnings in /usr/local/mariadb-10.3.7-linux-i686/mysql-test/var/2/log/mysqld.1.err
            

            The server error log shows the following:

            Version: '10.3.7-MariaDB-log'  socket: '/usr/local/mariadb-10.3.7-linux-i686/mysql-test/var/tmp/2/mysqld.1.sock'  port: 16020  MariaDB Server
            2018-04-27 23:32:10 0 [Note] /usr/local/mariadb-10.3.7-linux-i686/bin/mysqld (initiated by: root[root] @ localhost []): Normal shutdown
            2018-04-27 23:32:10 0 [Note] InnoDB: FTS optimize thread exiting.
            2018-04-27 23:32:10 0 [Note] Event Scheduler: Purging the queue. 0 events
            180427 23:33:11 [ERROR] mysqld got signal 6 ;
            

            This seems to be the first (and only) shutdown in that test. I do not understand why that particular test should restart the server at all. But it shows that we still have a sporadic server shutdown hang.

            marko Marko Mäkelä added a comment - We still have a shutdown hang, this time in the test encryption.innochecksum : bb-10.3-marko b82c811907c802a166e436d88c57e5b82cf8236f (based on 10.3 6c5e60f1b198ff846ee5595927ff5a74c6b08e91) encryption.innochecksum '4k,cbc,innodb' w2 [ fail ] Found warnings/errors in server log file! Test ended at 2018-04-27 23:33:14 line Attempting backtrace. You can use the following information to find out ^ Found warnings in /usr/local/mariadb-10.3.7-linux-i686/mysql-test/var/2/log/mysqld.1.err The server error log shows the following: Version: '10.3.7-MariaDB-log' socket: '/usr/local/mariadb-10.3.7-linux-i686/mysql-test/var/tmp/2/mysqld.1.sock' port: 16020 MariaDB Server 2018-04-27 23:32:10 0 [Note] /usr/local/mariadb-10.3.7-linux-i686/bin/mysqld (initiated by: root[root] @ localhost []): Normal shutdown 2018-04-27 23:32:10 0 [Note] InnoDB: FTS optimize thread exiting. 2018-04-27 23:32:10 0 [Note] Event Scheduler: Purging the queue. 0 events 180427 23:33:11 [ERROR] mysqld got signal 6 ; This seems to be the first (and only) shutdown in that test. I do not understand why that particular test should restart the server at all. But it shows that we still have a sporadic server shutdown hang.

            Luckily, I got a shutdown hang locally. One srv_worker_thread failed to exit, even though srv_purge_coordinator_thread should ensure it. I added one more check to the thread to ensure that all workers have actually deregistered.

            MDEV-5800 or one of its follow-up fixes caused the zombie purge worker threads to matter more, because the shutdown would wait for all threads to execute innobase_destroy_background_thd(thd). And the zombie purge worker threads would never destroy them.

            marko Marko Mäkelä added a comment - Luckily, I got a shutdown hang locally. One srv_worker_thread failed to exit, even though srv_purge_coordinator_thread should ensure it. I added one more check to the thread to ensure that all workers have actually deregistered. MDEV-5800 or one of its follow-up fixes caused the zombie purge worker threads to matter more, because the shutdown would wait for all threads to execute innobase_destroy_background_thd(thd) . And the zombie purge worker threads would never destroy them.

            There was an assertion failure on MariaDB 10.2 shutdown on Windows, because thd_destructor_proxy() failed to shut down the purge threads:

            10.2 52df8040264cc6f3226e78d371e4c1dd88dd22b6

            Assertion failed: !"purge was not shut down", file D:\winx64-debug\build\src\storage\innobase\log\log0log.cc, line 2001
            

            To fix this, I introduced a function srv_purge_shutdown() that will actually wait for the purge threads to exit.

            marko Marko Mäkelä added a comment - There was an assertion failure on MariaDB 10.2 shutdown on Windows , because thd_destructor_proxy()  failed to shut down the purge threads: 10.2 52df8040264cc6f3226e78d371e4c1dd88dd22b6 Assertion failed: !"purge was not shut down", file D:\winx64-debug\build\src\storage\innobase\log\log0log.cc, line 2001 To fix this, I introduced a function srv_purge_shutdown() that will actually wait for the purge threads to exit .

            There were 2 commits to fix the issue, over a week apart. The first one went to MariaDB 10.3.7, but the second one slipped to 10.3.8. MariaDB 10.2.16 will contain both commits.

            marko Marko Mäkelä added a comment - There were 2 commits to fix the issue, over a week apart. The first one went to MariaDB 10.3.7, but the second one slipped to 10.3.8. MariaDB 10.2.16 will contain both commits.

            People

              marko Marko Mäkelä
              alice Alice Sherepa
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.