Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.5, 10.6, 10.11, 11.1(EOL), 11.2(EOL), 11.4, 11.5(EOL), 11.6(EOL)
Description
We got a hung server:
10.6 2ceadb390309f9c29786bb5ec2a6459dfe88344e |
versioning.rpl 'innodb,row,trx_id' w3 [ fail ] timeout after 900 seconds
|
Test ended at 2021-05-06 15:23:30
|
|
Test case timeout after 900 seconds
|
|
== D:/winx64-debug/build/mysql-test/var/3/log/rpl.log ==
|
include/master-slave.inc
|
|
== D:/winx64-debug/build/mysql-test/var/3/tmp/analyze-timeout-mysqld.2.err ==
|
SHOW PROCESSLIST;
|
Id User Host db Command Time State Info Progress
|
16 root localhost:38509 NULL Query 0 starting SHOW PROCESSLIST 0.000
|
|
== D:/winx64-debug/build/mysql-test/var/3/tmp/analyze-timeout-mysqld.1.err ==
|
SHOW PROCESSLIST;
|
Id User Host db Command Time State Info Progress
|
15 root localhost:37368 test Query 960 starting RESET MASTER 0.000
|
17 root localhost:38510 NULL Query 0 starting SHOW PROCESSLIST 0.000
|
|
mysqltest failed but provided no output
|
Among the stack traces, no InnoDB code is executing (only buf_flush_page_cleaner() is in an untimed wait). The culprit for the hang seems to be a thread that is executing RESET MASTER:
10.6 2ceadb390309f9c29786bb5ec2a6459dfe88344e |
server!pthread_cond_wait(struct _RTL_CONDITION_VARIABLE * cond = 0x00007fff`622137a8, struct _RTL_CRITICAL_SECTION * mutex = 0x00007fff`62213778) [D:\winx64-debug\build\src\mysys\my_wincond.c @ 91]
|
server!psi_cond_wait(struct st_mysql_cond * that = 0x00007fff`622137a8, struct st_mysql_mutex * mutex = 0x00007fff`62213778, char * file = 0x00007fff`60671dc8 "D:\winx64-debug\build\src\sql\log.cc", unsigned int line = 0x10f8) [D:\winx64-debug\build\src\mysys\my_thr_init.c @ 596]
|
server!inline_mysql_cond_wait(struct st_mysql_cond * that = 0x00007fff`622137a8, struct st_mysql_mutex * mutex = 0x00007fff`62213778, char * src_file = 0x00007fff`60671dc8 "D:\winx64-debug\build\src\sql\log.cc", unsigned int src_line = 0x10f8) [D:\winx64-debug\build\src\include\mysql\psi\mysql_thread.h @ 1070]
|
server!MYSQL_BIN_LOG::reset_logs(class THD * thd = 0x00000227`de152788, bool create_new_log = true, struct rpl_gtid * init_state = 0x00000000`00000000, unsigned int init_state_len = 0, unsigned long next_log_number = 0) [D:\winx64-debug\build\src\sql\log.cc @ 4345]
|
server!reset_master(class THD * thd = 0x00000227`de152788, struct rpl_gtid * init_state = 0x00000000`00000000, unsigned int init_state_len = 0, unsigned long next_log_number = 0) [D:\winx64-debug\build\src\sql\sql_repl.cc @ 3966]
|
server!reload_acl_and_cache(class THD * thd = 0x00000227`de152788, unsigned int64 options = 0x80, struct TABLE_LIST * tables = 0x00000000`00000000, int * write_to_binlog = 0x00000063`731edb74) [D:\winx64-debug\build\src\sql\sql_reload.cc @ 362]
|
server!mysql_execute_command(class THD * thd = 0x00000227`de152788) [D:\winx64-debug\build\src\sql\sql_parse.cc @ 5452]
|
server!mysql_parse(class THD * thd = 0x00000227`de152788, char * rawbuf = 0x00000227`de1ca690 "--- memory read error at address 0x00000227`de1ca690 ---", unsigned int length = 0xc, class Parser_state * parser_state = 0x00000063`731ee950) [D:\winx64-debug\build\src\sql\sql_parse.cc @ 8018]
|
server!dispatch_command(enum_server_command command = COM_QUERY (0n3), class THD * thd = 0x00000227`de152788, char * packet = 0x00000227`ddfceec9 "--- memory read error at address 0x00000227`ddfceec9 ---", unsigned int packet_length = 0xc, bool blocking = false) [D:\winx64-debug\build\src\sql\sql_parse.cc @ 1899]
|
server!do_command(class THD * thd = 0x00000227`de152788, bool blocking = false) [D:\winx64-debug\build\src\sql\sql_parse.cc @ 1406]
|
server!threadpool_process_request(class THD * thd = 0x00000227`de152788) [D:\winx64-debug\build\src\sql\threadpool_common.cc @ 394]
|
server!tp_callback(struct TP_connection * c = 0x00000227`ddf80e30) [D:\winx64-debug\build\src\sql\threadpool_common.cc @ 203]
|
server!tp_callback(struct _TP_CALLBACK_INSTANCE * instance = 0x00000063`731ef968, void * context = 0x00000227`ddf80e30) [D:\winx64-debug\build\src\sql\threadpool_win.cc @ 279]
|
server!io_completion_callback(struct _TP_CALLBACK_INSTANCE * instance = 0x00000063`731ef968, void * context = 0x00000227`ddf80e30, void * overlapped = 0x00000227`ddf80ea0, unsigned long io_result = 0, unsigned int64 nbytes = 0x11, struct _TP_IO * io = 0x00000227`de05a5e0) [D:\winx64-debug\build\src\sql\threadpool_win.cc @ 299]
|
I think that also 10.5 must be affected by this. On older server versions, it is probably harder to reproduce this, because we never observed MDEV-24302 on older versions than 10.5. This could correlate with scheduling improvements in 10.5 (MDEV-23855) and 10.6 (MDEV-20612, MDEV-25404); earlier, there could have been more scattered calls to write InnoDB log that would wake up RESET MASTER from its wait.
Attachments
Issue Links
- causes
-
MDEV-35197 Wrong locking order of LOCK_log and LOCK_global_system_variables, potential deadlock of server
- Closed
- is blocked by
-
MDEV-24302 RESET MASTER hangs as Innodb does not report on binlog checkpoint
- Closed
- is duplicated by
-
MDEV-34827 Various issues when using RESET MASTER and innodb_flush_log_at_timeout
- Closed
- relates to
-
MDEV-14462 Confusing error message: ib_logfiles are too small for innodb_thread_concurrency=0
- Closed
-
MDEV-25948 Remove log_flush_task
- Closed
-
MDEV-18959 Engine transaction recovery through persistent binlog
- Stalled
-
MDEV-24660 MYSQL_BIN_LOG::cleanup(): Assertion `b->xid_count == 0' failed in MYSQL_BIN_LOG::cleanup on SHUTDOWN
- Closed