Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version: 10.5.22
- Environment: Debian Bullseye 11.8
Description
Hi,
Context: I operate a 4-node Galera cluster: one node has pc.weight=0, three nodes are write-write (behind haproxy for TCP load balancing), and one node acts as an asynchronous replication source for a fifth MariaDB server used for long-running reporting queries that do not require real-time data.
The cluster handles around one hundred requests per second.
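For context, the relevant part of our haproxy setup looks roughly like this (hostnames, addresses, and the balance policy are placeholders, not our exact config):

```
# Hypothetical haproxy TCP load-balancing stanza for the three write nodes
listen galera_writes
    bind *:3306
    mode tcp
    balance leastconn
    option tcpka
    server node1 10.0.0.1:3306 check
    server node2 10.0.0.2:3306 check
    server node3 10.0.0.3:3306 check
```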
The cluster was recently upgraded to 10.5.22 from mixed 10.5.17 and 10.5.19 versions.
The node with pc.weight=0 seems to always trigger cluster-wide deadlocks (not often, but twice since yesterday). While running 10.5.19, that node would hit the innodb_fatal_semaphore_wait_threshold and kill itself, allowing the cluster to recover and handle queries again while the affected node restarted and resynced itself.
But since 10.5.22, this watchdog is no longer triggered: the node keeps waiting and logs the following every one or two seconds (the query has been redacted). Also note the garbage characters at the end of the log line:
2023-11-21 16:12:41 0 [Note] InnoDB: WSREP: BF lock wait long for trx:9211680232 query: INSERT INTO sometable (col1, col2, col3)
VALUES (0, 0, 1)<B6><C8>\e^S^V
This table has partitions and its primary key contains an auto_increment field along with the partition field.
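To illustrate, the schema is shaped roughly like this (all names and the partition count here are made up; only the structure matches the real table):

```sql
-- Hypothetical schema: auto_increment id plus the partitioning
-- column together form the primary key (real names differ).
CREATE TABLE sometable (
  id   BIGINT NOT NULL AUTO_INCREMENT,
  col1 INT NOT NULL,
  col2 INT NOT NULL,
  col3 INT NOT NULL,
  PRIMARY KEY (id, col1)
) ENGINE=InnoDB
PARTITION BY HASH (col1) PARTITIONS 4;
```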
The "SHOW ENGINE INNODB STATUS" command gives this info about that transaction:
---TRANSACTION 9211680232, ACTIVE 1291 sec inserting
mysql tables in use 50, locked 50
LOCK WAIT 8 lock struct(s), heap size 1128, 4 row lock(s)
MySQL thread id 2, OS thread handle 139698846578432, query id 70655269 Write_rows_log_event::write_row(1320157274)
INSERT INTO sometable (col1, col2, col3)
VALUES (0, 0, 1)
------- TRX HAS BEEN WAITING 1291 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 15926 page no 277573 n bits 112 index PRIMARY of table `somedb`.`sometable` /* Partition `p0` */ trx id 9211680232 lock_mode X locks gap before rec insert intention waiting
Sending a SIGTERM to the mariadbd process does not stop it, and the log line keeps repeating, but it does disconnect the node from the Galera cluster, so the other nodes take over the writes.
I would expect the "TRX HAS BEEN WAITING 1291 SEC" condition to trigger some watchdog, but I can't find which one, and I'm not able to tell whether this is some kind of regression from 10.5.19, where innodb_fatal_semaphore_wait_threshold was triggered.
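For reference, the watchdog threshold we were relying on in 10.5.19 can be inspected like this (the value shown is just the documented default, not necessarily ours):

```sql
-- Check the fatal semaphore wait watchdog (default 600 seconds)
SHOW GLOBAL VARIABLES LIKE 'innodb_fatal_semaphore_wait_threshold';
```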
I have found MDEV-30161, which is not implemented yet; I guess this could solve the current issue?
I also saw MDEV-28180, but wsrep_slave_threads is already set to 1 on our nodes, and people there seem to say that setting solved the issue for them.
Reading the 10.5.23 release notes (released last week), I have not found any clue related to this behavior, but I might have missed it.
Another item on our list is to use ProxySQL instead of haproxy, so that writes to the same tables are routed to the same node, reducing cluster-wide conflicts.
I can also try to get stack traces if the issue happens again, once the failing node is re-added to the cluster and receives traffic.
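If it does reoccur, I plan to capture the stacks of all server threads with something along these lines (assuming gdb is available on the node and the daemon is named mariadbd):

```
# Attach to the running server, dump all thread backtraces, then detach
gdb -p "$(pidof mariadbd)" --batch -ex "thread apply all bt full" > mariadbd_stacks.txt
```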
Thanks