Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Incomplete
-
10.3.28
-
CentOS 7.9.2009
Defaults: mysqld would have been started with the following arguments:
--datadir=/data/mariadb --socket=/var/lib/mysql/mysql.sock --user=mysql --symbolic-links=0 --max_allowed_packet=16M --thread_cache_size=8 --max_connections=1550 --slow_query_log=1 --innodb_buffer_pool_size=49152M --innodb_log_file_size=2048M --innodb_log_buffer_size=16M --innodb_print_all_deadlocks=on --log-warnings=2 --plugin_load_add=auth_socket --default_storage_engine=InnoDB --innodb_file_per_table=1 --innodb_flush_log_at_trx_commit=0 --innodb_doublewrite=1 --log_slave_updates=1 --log_bin=bin-log --server_id=5555 --binlog_format=ROW --innodb_autoinc_lock_mode=2 --expire_logs_days=1 --wsrep_provider=/usr/lib64/galera/libgalera_smm.so --wsrep_cluster_address=gcomm://[redacted] --wsrep_node_address=[redacted] --wsrep_sst_method=mariabackup --wsrep_sst_auth=[redacted] --wsrep_provider_options=evs.keepalive_period = PT1S; evs.inactive_check_period = PT1S; evs.suspect_timeout = PT5S; evs.inactive_timeout = PT15S; evs.install_timeout = PT15S; gcache.size=2G --wsrep_on=ON
CentOS 7.9.2009 Defaults: mysqld would have been started with the following arguments: --datadir=/data/mariadb --socket=/var/lib/mysql/mysql.sock --user=mysql --symbolic-links=0 --max_allowed_packet=16M --thread_cache_size=8 --max_connections=1550 --slow_query_log=1 --innodb_buffer_pool_size=49152M --innodb_log_file_size=2048M --innodb_log_buffer_size=16M --innodb_print_all_deadlocks=on --log-warnings=2 --plugin_load_add=auth_socket --default_storage_engine=InnoDB --innodb_file_per_table=1 --innodb_flush_log_at_trx_commit=0 --innodb_doublewrite=1 --log_slave_updates=1 --log_bin=bin-log --server_id=5555 --binlog_format=ROW --innodb_autoinc_lock_mode=2 --expire_logs_days=1 --wsrep_provider=/usr/lib64/galera/libgalera_smm.so --wsrep_cluster_address=gcomm://[redacted] --wsrep_node_address=[redacted] --wsrep_sst_method=mariabackup --wsrep_sst_auth=[redacted] --wsrep_provider_options=evs.keepalive_period = PT1S; evs.inactive_check_period = PT1S; evs.suspect_timeout = PT5S; evs.inactive_timeout = PT15S; evs.install_timeout = PT15S; gcache.size=2G --wsrep_on=ON
Description
We had been running MariaDB 10.3.27 with Galera cluster in production without any issues.
Less than 26 hours after updating to 10.3.28 one of three servers stopped responding.
The following errors were logged for more than 800 seconds until I restarted MariaDB:
2021-03-10 15:10:45 0 [Warning] InnoDB: A long semaphore wait:
|
--Thread 140231647176448 has waited at lock0lock.cc line 3882 for 241.00 seconds the semaphore:
|
Mutex at 0x5562730bb380, Mutex LOCK_SYS created lock0lock.cc:461, lock var 2
|
2021-03-10 15:10:45 0 [Warning] InnoDB: A long semaphore wait:
|
--Thread 140286587229952 has waited at lock0lock.cc line 3882 for 241.00 seconds the semaphore:
|
Mutex at 0x5562730bb380, Mutex LOCK_SYS created lock0lock.cc:461, lock var 2
|
[clip]
|
2021-03-10 15:10:45 0 [Note] InnoDB: A semaphore wait:
|
--Thread 140232309577472 has waited at row0row.cc line 1133 for 239.00 seconds the semaphore:
|
X-lock on RW-latch at 0x7f8f0086a270 created in file buf0buf.cc line 1563
|
a writer (thread id 140232091498240) has reserved it in mode exclusive
|
number of readers 0, waiters flag 1, lock_word: 0
|
Last time write locked in file row0row.cc line 1133
|
[clip]
|
2021-03-10 15:10:45 0 [Note] InnoDB: A semaphore wait:
|
--Thread 140231088736000 has waited at lock0lock.cc line 3882 for 238.00 seconds the semaphore:
|
Mutex at 0x5562730bb380, Mutex LOCK_SYS created lock0lock.cc:461, lock var 2
|
2021-03-10 15:10:45 0 [Note] InnoDB: A semaphore wait:
|
--Thread 140232326362880 has waited at srv0srv.cc line 2026 for 238.00 seconds the semaphore:
|
X-lock (wait_ex) on RW-latch at 0x55627310f3e0 created in file dict0dict.cc line 920
|
a writer (thread id 140232326362880) has reserved it in mode wait exclusive
|
number of readers 4, waiters flag 1, lock_word: fffffffc
|
Last time write locked in file srv0srv.cc line 2026
|
2021-03-10 15:10:45 0 [Note] InnoDB: A semaphore wait:
|
--Thread 140231088121600 has waited at lock0lock.cc line 3882 for 238.00 seconds the semaphore:
|
Mutex at 0x5562730bb380, Mutex LOCK_SYS created lock0lock.cc:461, lock var 2
|
2021-03-10 15:10:45 0 [Note] InnoDB: A semaphore wait:
|
--Thread 140231087814400 has waited at lock0lock.cc line 3882 for 238.00 seconds the semaphore:
|
Mutex at 0x5562730bb380, Mutex LOCK_SYS created lock0lock.cc:461, lock var 2
|
I'll attach the full log from when the problem started. The log also contains InnoDB Monitor output.
This sounds similar to MDEV-24375, but seems to be a different issue since MDEV-24275 should have fixed that in 10.3.28 if I got things right.
Attachments
Issue Links
- is duplicated by
-
MDEV-25190 Semaphore wait has lasted > 600 seconds; stuck on bg_wsrep_kill_trx
- Closed
- relates to
-
MDEV-24275 InnoDB persistent stats analyze forces full scan forcing lock crash
- Closed
-
MDEV-24606 InnoDB: Semaphore wait has lasted > 600 second
- Closed
-
MDEV-24831 Galera test failure because of safe_mutex error
- Closed
-
MDEV-25114 Crash: WSREP: invalid state ROLLED_BACK (FATAL)
- Closed