Details
Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
10.5, 10.6, 10.2(EOL), 10.3(EOL), 10.4(EOL)
None
Description
The bug was originally observed as the binlog background thread hanging at shutdown, similar to one of the cases in MDEV-21120:
#14 MYSQL_BIN_LOG::stop_background_thread (this=0x55660e6b9ba0 <mysql_bin_log>) at /data/Server/10.6D/sql/log.cc:3411
#15 0x000055660af0ff8e in close_connections () at /data/Server/10.6D/sql/mysqld.cc:1720
#16 0x000055660af215bc in mysqld_main (argc=44, argv=<optimized out>) at /data/Server/10.6D/sql/mysqld.cc:5839
The hang suggested either a missed unlogging of an xid or a lost signal notification to the thread.
It turns out the former is the case.
The MDEV-21117 commit reveals two inborn defects in the loop of MYSQL_BIN_LOG::write_transaction_to_binlog that marks event groups
as needing explicit xid unlogging:
(1) the loop never expected to start from an already reset ha_info (the one-phase commit case, which does not need the unlogging), and
(2) the loop had a logical flaw in its continuation condition: it broke out after the first iteration, ignoring any
further ha_info in the list even if they might represent engines incapable of commit_checkpoint_request. Such an engine would mean the group has to be marked, which may not have happened on the first iteration.
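The two defects can be sketched with a simplified model. This is not the server code: EngineInfo, need_unlog_flawed and need_unlog_fixed are hypothetical names invented for illustration, and the three flags are assumptions standing in for "ha_info is started", "participant is the binlog itself" and "engine supports commit_checkpoint_request".

```cpp
#include <vector>

// Hypothetical stand-in for one participant in the transaction's ha_info list.
struct EngineInfo {
  bool started;                        // false models an already reset ha_info
  bool is_binlog;                      // the binlog itself never needs unlogging
  bool has_commit_checkpoint_request;  // engine can confirm durability on request
};

// Flawed shape, as described in (1) and (2): it does not expect a reset
// entry, and it decides from the first list entry only.
bool need_unlog_flawed(const std::vector<EngineInfo> &ha_list) {
  bool need_unlog = false;
  for (const EngineInfo &e : ha_list) {
    // models defect (1): no check for a reset (not started) entry
    if (!e.is_binlog && !e.has_commit_checkpoint_request)
      need_unlog = true;
    break;  // models defect (2): later entries are never examined
  }
  return need_unlog;
}

// Corrected shape: skip reset entries and examine the whole list.
bool need_unlog_fixed(const std::vector<EngineInfo> &ha_list) {
  for (const EngineInfo &e : ha_list) {
    if (e.started && !e.is_binlog && !e.has_commit_checkpoint_request)
      return true;  // one such engine is enough to mark the group
  }
  return false;
}
```

In this model, a list holding the binlog first and a checkpoint-incapable engine second is left unmarked by the flawed loop, while a single reset entry gets marked although it should not be.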
I set out to fix this starting from 10.2, though 10.6 is the most vulnerable due to (1): the loop marks groups that should not be marked.
Thanks to elenst, alice and marko, who helped to identify it!