[MDEV-13779] InnoDB fails to shut down purge workers, causing hang Created: 2017-09-11  Updated: 2018-05-25  Resolved: 2018-05-16

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.2.3, 10.3.0
Fix Version/s: 10.2.16, 10.3.8

Type: Bug Priority: Major
Reporter: Alice Sherepa Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: hang, shutdown

Issue Links:
Relates
relates to MDEV-15554 InnoDB page_cleaner shutdown sometime... Closed
relates to MDEV-11802 innodb.innodb_bug14676111 fails in bu... Closed
relates to MDEV-14379 encryption.innodb_encrypt_log_corrup... Closed
relates to MDEV-15817 encryption tests fail futex(COND_thr... Open

 Description   

innodb_fts.innodb_fts_plugin 'innodb' w2 [ fail ] Found warnings/errors in server log file!
http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest/builds/10121/steps/test_2/logs/mysqld.1.err.2

CURRENT_TEST: innodb_fts.innodb_fts_plugin
2017-09-08 14:13:13 0 [Note] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld (root[root] @ localhost []): Normal shutdown
 
2017-09-08 14:13:13 0 [Note] InnoDB: FTS optimize thread exiting.
2017-09-08 14:13:13 0 [Note] Event Scheduler: Purging the queue. 0 events
2017-09-08 14:13:13 0 [Note] InnoDB: Starting shutdown...
2017-09-08 14:13:13 0 [Note] InnoDB: Dumping buffer pool(s) to /dev/shm/var_auto_Jb6A/2/mysqld.1/data/ib_buffer_pool
2017-09-08 14:13:13 0 [Note] InnoDB: Instance 0, restricted to 127 pages due to innodb_buf_pool_dump_pct=25
2017-09-08 14:13:13 0 [Note] InnoDB: Buffer pool(s) dump completed at 170908 14:13:13
2017-09-08 14:13:14 0 [Note] InnoDB: Shutdown completed; log sequence number 167888018
2017-09-08 14:13:14 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2017-09-08 14:13:14 0 [Note] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld: Shutdown complete
 
"2017-09-08 14:13:16 0 [Note] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld (mysqld 10.3.2-MariaDB-log) starting as process 5693 ...
2017-09-08 14:13:16 0 [Warning] Could not increase number of max_open_files to more than 1024 (request: 4162)
2017-09-08 14:13:16 0 [Note] Plugin 'partition' is disabled.
2017-09-08 14:13:16 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-09-08 14:13:16 0 [Note] InnoDB: Uses event mutexes
2017-09-08 14:13:16 0 [Note] InnoDB: Compressed tables use zlib 1.2.3
2017-09-08 14:13:16 0 [Note] InnoDB: Using Linux native AIO
2017-09-08 14:13:16 0 [Note] InnoDB: Number of pools: 1
2017-09-08 14:13:16 0 [Note] InnoDB: Using generic crc32 instructions
2017-09-08 14:13:16 0 [Note] InnoDB: Initializing buffer pool, total size = 8M, instances = 1, chunk size = 8M
2017-09-08 14:13:16 0 [Note] InnoDB: Completed initialization of buffer pool
2017-09-08 14:13:16 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2017-09-08 14:13:16 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
2017-09-08 14:13:16 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2017-09-08 14:13:16 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2017-09-08 14:13:16 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2017-09-08 14:13:16 0 [Note] InnoDB: Waiting for purge to start
2017-09-08 14:13:16 0 [Note] InnoDB: 5.7.19 started; log sequence number 167888018
2017-09-08 14:13:16 0 [Note] Plugin 'SEQUENCE' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_SYS_DATAFILES' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_SYS_TABLESTATS' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_TABLESPACES_SCRUBBING' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_CMP' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_FT_DELETED' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_CMP_RESET' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_LOCK_WAITS' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_TABLESPACES_ENCRYPTION' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_CMPMEM_RESET' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'FEEDBACK' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_FT_INDEX_TABLE' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_CMP_PER_INDEX_RESET' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'user_variables' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_FT_INDEX_CACHE' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_FT_BEING_DELETED' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_FT_DEFAULT_STOPWORD' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_FT_CONFIG' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_SYS_TABLESPACES' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_MUTEXES' is disabled.
2017-09-08 14:13:16 0 [Note] Plugin 'INNODB_SYS_SEMAPHORE_WAITS' is disabled.
2017-09-08 14:13:16 0 [Warning] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld: unknown variable 'loose-feedback-debug-startup-interval=20'
2017-09-08 14:13:16 0 [Warning] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld: unknown variable 'loose-feedback-debug-first-interval=60'
2017-09-08 14:13:16 0 [Warning] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld: unknown variable 'loose-feedback-debug-interval=60'
2017-09-08 14:13:16 0 [Warning] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld: unknown option '--loose-pam-debug'
2017-09-08 14:13:16 0 [Warning] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld: unknown variable 'loose-debug-sync-timeout=300'
2017-09-08 14:13:16 0 [Note] InnoDB: Loading buffer pool(s) from /dev/shm/var_auto_Jb6A/2/mysqld.1/data/ib_buffer_pool
2017-09-08 14:13:16 0 [Note] InnoDB: Buffer pool(s) load completed at 170908 14:13:16
2017-09-08 14:13:16 0 [Note] Server socket created on IP: '127.0.0.1'.
2017-09-08 14:13:16 0 [Note] Reading of all Master_info entries succeded
2017-09-08 14:13:16 0 [Note] Added new Master_info '' to hash table
2017-09-08 14:13:16 0 [Note] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld: ready for connections.
Version: '10.3.2-MariaDB-log'  socket: '/mnt/buildbot/build/mariadb-10.3.2/mysql-test/var/tmp/2/mysqld.1.sock'  port: 16020  MariaDB Server
2017-09-08 14:13:16 0 [Note] /mnt/buildbot/build/mariadb-10.3.2/sql/mysqld (root[root] @ localhost []): Normal shutdown
 
2017-09-08 14:13:16 0 [Note] Event Scheduler: Purging the queue. 0 events
2017-09-08 14:13:16 0 [Note] InnoDB: FTS optimize thread exiting.
170908 14:14:16 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.3.2-MariaDB-log
key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=1
max_threads=153
thread_count=3
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63038 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
/mnt/buildbot/build/mariadb-10.3.2/sql/mysqld(my_print_stacktrace+0x2e)[0x55d16586fcee]
/mnt/buildbot/build/mariadb-10.3.2/sql/mysqld(handle_fatal_signal+0x307)[0x55d165308f67]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f123a020390]
/lib/x86_64-linux-gnu/libpthread.so.0(pthread_cond_wait+0xc0)[0x7f123a01c360]
/mnt/buildbot/build/mariadb-10.3.2/sql/mysqld(_Z11mysqld_mainiPPc+0x1232)[0x55d1650b0372]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f12395db830]
/mnt/buildbot/build/mariadb-10.3.2/sql/mysqld(_start+0x29)[0x55d1650a2b89]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.



 Comments   
Comment by Alice Sherepa [ 2017-11-21 ]

the same with innodb.innodb_zip_innochecksum test http://buildbot.askmonty.org/buildbot/builders/kvm-deb-zesty-x86/builds/949/steps/mtr/logs/stdio
http://buildbot.askmonty.org/buildbot/builders/kvm-deb-zesty-x86/builds/949/steps/mtr/logs/mysqld.1.err.1

innodb.innodb_zip_innochecksum '8k,innodb' w1 [ fail ]  Found warnings/errors in server log file!
        Test ended at 2017-11-18 22:39:26
line
Attempting backtrace. You can use the following information to find out
^ Found warnings in /dev/shm/var/1/log/mysqld.1.err

2017-11-18 22:38:20 0 [Note] /usr/sbin/mysqld (root[root] @ localhost []): Normal shutdown
 
2017-11-18 22:38:20 0 [Note] Event Scheduler: Purging the queue. 0 events
2017-11-18 22:38:20 0 [Note] InnoDB: FTS optimize thread exiting.
171118 22:39:20 [ERROR] mysqld got signal 6 ;
 
Server version: 10.3.3-MariaDB-10.3.3+maria~zesty-log
key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=1
max_threads=153
thread_count=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 61868 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
/usr/sbin/mysqld(my_print_stacktrace+0x28)[0x80b9c8c8]
/usr/sbin/mysqld(handle_fatal_signal+0x3c9)[0x805f23a9]
[0xb7704d04]
[0xb7704cf9]
/lib/i386-linux-gnu/libpthread.so.0(pthread_cond_wait+0xdc)[0xb7381c5c]
/usr/sbin/mysqld(_Z11mysqld_mainiPPc+0x139d)[0x8036e50d]
/usr/sbin/mysqld(main+0x27)[0x8034c7d7]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf6)[0xb6ffd276]
/usr/sbin/mysqld(+0x26906c)[0x8036006c]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

Comment by Marko Mäkelä [ 2018-04-23 ]

Maybe this is related to the THD object destruction problems at shutdown, reported in MDEV-13644?

Comment by Marko Mäkelä [ 2018-04-27 ]

Based on the timestamps, the server hangs on shutdown. The deadly signal is caught exactly one minute after the previous message. In 10.3, mysqltest.cc kills the process with SIGABRT, giving it a chance to dump (one) stack trace.

The shutdown hang could have been a duplicate of MDEV-15554, which I fixed in 10.2.14 and 10.3.6 on 2018-03-13.

Comment by Marko Mäkelä [ 2018-04-28 ]

We still have a shutdown hang, this time in the test encryption.innochecksum:

bb-10.3-marko b82c811907c802a166e436d88c57e5b82cf8236f (based on 10.3 6c5e60f1b198ff846ee5595927ff5a74c6b08e91)

encryption.innochecksum '4k,cbc,innodb'  w2 [ fail ]  Found warnings/errors in server log file!
        Test ended at 2018-04-27 23:33:14
line
Attempting backtrace. You can use the following information to find out
^ Found warnings in /usr/local/mariadb-10.3.7-linux-i686/mysql-test/var/2/log/mysqld.1.err

The server error log shows the following:

Version: '10.3.7-MariaDB-log'  socket: '/usr/local/mariadb-10.3.7-linux-i686/mysql-test/var/tmp/2/mysqld.1.sock'  port: 16020  MariaDB Server
2018-04-27 23:32:10 0 [Note] /usr/local/mariadb-10.3.7-linux-i686/bin/mysqld (initiated by: root[root] @ localhost []): Normal shutdown
2018-04-27 23:32:10 0 [Note] InnoDB: FTS optimize thread exiting.
2018-04-27 23:32:10 0 [Note] Event Scheduler: Purging the queue. 0 events
180427 23:33:11 [ERROR] mysqld got signal 6 ;

This seems to be the first (and only) shutdown in that test. I do not understand why that particular test should restart the server at all. But it shows that we still have a sporadic server shutdown hang.

Comment by Marko Mäkelä [ 2018-05-16 ]

Luckily, I got a shutdown hang locally. One srv_worker_thread failed to exit, even though srv_purge_coordinator_thread should ensure it. I added one more check to the thread to ensure that all workers have actually deregistered.

MDEV-5800 or one of its follow-up fixes caused the zombie purge worker threads to matter more, because the shutdown would wait for all threads to execute innobase_destroy_background_thd(thd). And the zombie purge worker threads would never destroy them.

Comment by Marko Mäkelä [ 2018-05-24 ]

There was an assertion failure on MariaDB 10.2 shutdown on Windows, because thd_destructor_proxy() failed to shut down the purge threads:

10.2 52df8040264cc6f3226e78d371e4c1dd88dd22b6

Assertion failed: !"purge was not shut down", file D:\winx64-debug\build\src\storage\innobase\log\log0log.cc, line 2001

To fix this, I introduced a function srv_purge_shutdown() that will actually wait for the purge threads to exit.

Comment by Marko Mäkelä [ 2018-05-25 ]

There were 2 commits to fix the issue, over a week apart. The first one went to MariaDB 10.3.7, but the second one slipped to 10.3.8. MariaDB 10.2.16 will contain both commits.

Generated at Thu Feb 08 08:08:17 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.