Details
- Bug
- Status: Closed
- Blocker
- Resolution: Fixed
- 10.3.28, 10.2, 10.4, 10.5
- CentOS 7.9.2009
mysqld would have been started with the following arguments:
--datadir=/data/mariadb --socket=/var/lib/mysql/mysql.sock --user=mysql --symbolic-links=0 --max_allowed_packet=16M --thread_cache_size=8 --max_connections=1500 --slow_query_log=1 --log_error=/data/mariadb/mariadb.log --innodb_buffer_pool_size=49152M --innodb_log_file_size=2048M --innodb_log_buffer_size=16M --innodb_print_all_deadlocks=on --log-warnings=2 --plugin_load_add=auth_socket --default_storage_engine=InnoDB --innodb_file_per_table=1 --innodb_flush_log_at_trx_commit=0 --innodb_doublewrite=1 --log_slave_updates=1 --log_bin=bin-log --server_id=6666 --binlog_format=ROW --innodb_autoinc_lock_mode=2 --expire_logs_days=1 --wsrep_provider=/usr/lib64/galera/libgalera_smm.so --wsrep_cluster_address=gcomm://[redacted] --wsrep_node_address=[redacted] --wsrep_sst_method=mariabackup --wsrep_sst_auth=[redacted] --wsrep_provider_options=evs.keepalive_period = PT1S; evs.inactive_check_period = PT1S; evs.suspect_timeout = PT5S; evs.inactive_timeout = PT15S; evs.install_timeout = PT15S; gcache.size=2G --wsrep_on=ON --wsrep_log_conflicts=1
Description
About 29 hours after updating a previously very stable three-server MariaDB Galera cluster from 10.3.27 to 10.3.28, one of the nodes crashed with the following message:
2021-03-10 18:22:48 0 [ERROR] WSREP: invalid state ROLLED_BACK (FATAL)
	 at /home/buildbot/buildbot/build/galera/src/replicator_smm.cpp:abort_trx():735
2021-03-10 18:22:48 0 [ERROR] WSREP: cancel commit bad exit: 7 33792346039
210310 18:22:48 [ERROR] mysqld got signal 6 ;
Attached is a bit more of the log (we happened to be running with conflict logging enabled), but I don't think there is much more I can provide right now. I'm still filing this because the cluster had been running very stably for months before the update to 10.3.28, and this could be related to MDEV-25111, which we also first encountered right after updating to 10.3.28.
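For anyone checking whether their own nodes hit the same abort, here is a minimal sketch of scanning error-log lines for the signature quoted above. The function name `find_crash_signature` and the two regular expressions are illustrative assumptions based only on the lines in this report; other server or Galera versions may word these messages differently.

```python
import re

# Patterns derived from the log excerpt in this report (assumed, not an official format).
FATAL_STATE = re.compile(r"\[ERROR\] WSREP: invalid state (\w+) \(FATAL\)")
SIGNAL = re.compile(r"\[ERROR\] mysqld got signal (\d+)")


def find_crash_signature(lines):
    """Return (state, signal) when a fatal WSREP state line is later
    followed by a 'mysqld got signal' line; otherwise return None."""
    state = None
    for line in lines:
        match = FATAL_STATE.search(line)
        if match:
            state = match.group(1)
            continue
        match = SIGNAL.search(line)
        if match and state is not None:
            return state, int(match.group(1))
    return None


sample = [
    "2021-03-10 18:22:48 0 [ERROR] WSREP: invalid state ROLLED_BACK (FATAL)",
    "at /home/buildbot/buildbot/build/galera/src/replicator_smm.cpp:abort_trx():735",
    "2021-03-10 18:22:48 0 [ERROR] WSREP: cancel commit bad exit: 7 33792346039",
    "210310 18:22:48 [ERROR] mysqld got signal 6 ;",
]
print(find_crash_signature(sample))  # ('ROLLED_BACK', 6)
```

To run it against a real log, pass an open file handle instead of `sample`, for example the file configured via `--log_error` in the arguments above.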
Issue Links
- blocks
  - MDEV-25368 Galera cluster hangs on Freeing items (Closed)
- causes
  - MDEV-24294 MariaDB - Cluster freezes if node hangs (Closed)
  - MDEV-25992 Galera 3 Server crash with signal 6 after RBR event apply failed (Closed)
  - MDEV-26099 MariaDB 10.5.10/10.5.11 Galera assertion crash (Closed)
- is caused by
  - MDEV-23328 Server hang due to Galera lock conflict resolution (Closed)
- is duplicated by
  - MDEV-28604 MariaDB crashing very often (Closed)
- relates to
  - MDEV-24915 Galera conflict resolution is unnecessarily complex (Closed)
  - MDEV-25410 Assertion `state_ == s_exec' failed - mysqld got signal 6 (Closed)
  - MDEV-25609 Signal 11 on wsrep_mysqld.cc:2620 (Closed)
  - MDEV-24397 MariaDB Galera Server Crashes on Large DELETE (Closed)
  - MDEV-25111 Long semaphore wait (> 800 secs), server stops responding (Closed)