Details
Bug
Status: Closed
Major
Resolution: Fixed
10.1.11
None
CentOS Linux release 7.2.1511 (Core), 64-bit
Description
We had a mysqld crash which appears to have occurred during binlog rotation:
160224 9:51:59 [ERROR] mysqld got signal 11 ;
...
stack_bottom = 0x7fc43aa2d8d0 thread_stack 0x48400
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7fc541c0ccce]
/usr/sbin/mysqld(handle_fatal_signal+0x38d)[0x7fc54173a49d]
/lib64/libpthread.so.0(+0xf100)[0x7fc540d5a100]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG21do_checkpoint_requestEm+0x9d)[0x7fc5417fb90d]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG20checkpoint_and_purgeEm+0x11)[0x7fc5417fb9a1]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG23trx_group_commit_leaderEPNS_18group_commit_entryE+0x50a)[0x7fc5417fe95a]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG34write_transaction_to_binlog_eventsEPNS_18group_commit_entryE+0x93)[0x7fc5417fec63]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG27write_transaction_to_binlogEP3THDP17binlog_cache_mngrP9Log_eventbbb+0xcf)[0x7fc5417fef7f]
/usr/sbin/mysqld(+0x6740f8)[0x7fc5417ff0f8]
/usr/sbin/mysqld(+0x67465c)[0x7fc5417ff65c]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13log_and_orderEP3THDybbb+0x44)[0x7fc5417ff6f4]
/usr/sbin/mysqld(_Z15ha_commit_transP3THDb+0x482)[0x7fc54173d782]
/usr/sbin/mysqld(_Z12trans_commitP3THD+0x5b)[0x7fc54169753b]
/usr/sbin/mysqld(_ZN13Xid_log_event14do_apply_eventEP14rpl_group_info+0xba)[0x7fc541807f2a]
/usr/sbin/mysqld(_Z26apply_event_and_update_posP9Log_eventP3THDP14rpl_group_infoP19rpl_parallel_thread+0x1e1)[0x7fc541545441]
/usr/sbin/mysqld(handle_slave_sql+0x261b)[0x7fc541548e7b]
/lib64/libpthread.so.0(+0x7dc5)[0x7fc540d52dc5]
/lib64/libc.so.6(clone+0x6d)[0x7fc53f17628d]
(The full log is attached as crashlog.txt, as it appeared in /var/log/messages.)
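For anyone triaging the trace above, a small script can split each frame into its binary, symbol, offset and address so the mangled C++ names are easier to work with. This is only a sketch: the function names `parse_frame` and `parse_trace` are made up for illustration, and the pattern assumes frames are formatted exactly as in the excerpt above.

```python
import re

# Matches frames such as:
#   /usr/sbin/mysqld(_Z12trans_commitP3THD+0x5b)[0x7fc54169753b]
#   /usr/sbin/mysqld(+0x6740f8)[0x7fc5417ff0f8]   (no symbol available)
FRAME_RE = re.compile(
    r"^(?P<binary>[^\s(]+)"
    r"\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line):
    """Return a dict for one stack frame line, or None if it is not a frame."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    return {
        "binary": match.group("binary"),
        # An empty symbol means the frame only carries an offset.
        "symbol": match.group("symbol") or None,
        "offset": int(match.group("offset"), 16),
        "address": int(match.group("address"), 16),
    }


def parse_trace(text):
    """Parse every recognisable frame in a block of log text, in order."""
    frames = []
    for line in text.splitlines():
        frame = parse_frame(line)
        if frame is not None:
            frames.append(frame)
    return frames
```

The extracted `symbol` values can then be passed to a demangler such as `c++filt`. Frames with no symbol (for example `+0x6740f8`) would need `addr2line` or similar, run against the exact same mysqld binary with debug symbols available.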
From a cursory investigation, these observations may be relevant:
- The crashing mysqld instance was part of a 2-node galera cluster. (The cluster was under construction, and a 3rd node was due to join it later).
- A new file with a timestamp matching the crash appeared in /var/lib/mysql: 0x009fe611_data.000000. The contents appear to be binary of some sort, with what look like schema/table names near the beginning. I suspect the contents may aid diagnostics.
- The node was running inbound replication - no other activity on the node yet (or elsewhere in the cluster)
- Replication is "traditional" - i.e. not using GTID
- Parallel replication is not configured - i.e. defaults to single thread
- It appears to have happened during (or immediately after) binlog rotation: a new binlog file had been created, 405 bytes long and with no transactions yet according to mysqlbinlog. Its timestamp is in the same second as the crash.
- Other nodes (elsewhere in our estate) replicating off the same master did not encounter any problems
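The observation that the freshly rotated binlog holds no transactions could also be cross-checked without mysqlbinlog by walking the event headers directly. The sketch below assumes the v4 binlog layout (a 4-byte magic followed by events that each start with a 19-byte little-endian header). The helper names are hypothetical and the event-type table is deliberately partial. It illustrates the idea only and is not a replacement for mysqlbinlog.

```python
import struct

BINLOG_MAGIC = b"\xfebin"
# timestamp(4) type(1) server_id(4) event_length(4) next_position(4) flags(2)
EVENT_HEADER = struct.Struct("<IBIIIH")

# Partial table: only the event types relevant to this check.
EVENT_NAMES = {
    2: "QUERY_EVENT",
    4: "ROTATE_EVENT",
    15: "FORMAT_DESCRIPTION_EVENT",
    16: "XID_EVENT",
    161: "BINLOG_CHECKPOINT_EVENT",  # MariaDB-specific
    162: "GTID_EVENT",               # MariaDB-specific
    163: "GTID_LIST_EVENT",          # MariaDB-specific
}


def list_binlog_events(data):
    """Return (position, event name, length) for each event header found."""
    if data[:4] != BINLOG_MAGIC:
        raise ValueError("not a binlog file: bad magic")
    events = []
    pos = len(BINLOG_MAGIC)
    while pos + EVENT_HEADER.size <= len(data):
        _ts, type_code, _server_id, length, _next_pos, _flags = (
            EVENT_HEADER.unpack_from(data, pos)
        )
        if length < EVENT_HEADER.size:
            break  # corrupt or truncated header; stop rather than loop forever
        name = EVENT_NAMES.get(type_code, "type %d" % type_code)
        events.append((pos, name, length))
        pos += length
    return events


def has_transactions(events):
    """Heuristic: any query, XID or GTID event suggests logged work."""
    markers = ("QUERY_EVENT", "XID_EVENT", "GTID_EVENT")
    return any(name in markers for _pos, name, _length in events)
```

Reading the new binlog file in binary mode and passing its bytes to `list_binlog_events` should, if the observation above holds, show only housekeeping events (format description and the like), with `has_transactions` returning `False`.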