[MDEV-28180] BF lock wait long for trx Created: 2022-03-28  Updated: 2024-01-04

Status: Open
Project: MariaDB Server
Component/s: Galera, Storage Engine - InnoDB
Affects Version/s: 10.6.5, 10.3.34, 10.8.5
Fix Version/s: 10.6

Type: Bug Priority: Major
Reporter: Khai Ping Assignee: Seppo Jaakola
Resolution: Unresolved Votes: 4
Labels: None
Environment:

3 node Galera multi-master cluster.

MariaDB 10.6.5 and Galera 26.4.9



 Description   

We have been facing a random, intermittent issue with our 3-node Galera multi-master cluster since 10.2.

MDEV-24915 seems to have resolved many of the locking issues; however, the problem still happens in 10.6.

When this BF lock issue happens, the affected node simply blocks/locks out the whole cluster and none of our clients can read or write.

The only way to get out of it is to kill the affected node and let it rejoin via IST.

2022-03-17 20:08:27 6 [Note] InnoDB: WSREP: BF lock wait long for trx:0x1a62643 query: UPDATE queue
        SET status='working',
            host_uid='node-5',
            working_timestamp=UNIX_TIMESTAMP(),
            id=LAST_INSERT_ID(id)
        WHERE status='queued' AND (consumer_type=0 OR (7 & consumer_type = consumer_type)) AND (job_type_id IN (9, 10, 12, 15, 16, 18, 19, 20, 32, 24, 25, 28, 29, 30, 31, 33)) AND ((job_type_id IN (13) AND volume IN (4)) OR (job_type_id NOT IN (13))) AND ((job_type_id IN (14, 15, 26, 31) AND volume IN (4)) OR (job_type_id NOT IN (14, 15, 26, 31))) ORDER BY priority, id ASC LIMIT 1
2022-03-17 20:09:17 2 [Note] InnoDB: WSREP: BF lock wait long for trx:0x1a62644 query: UPDATE queue
        SET status='working',
            host_uid='node-1',
            working_timestamp=UNIX_TIMESTAMP(),
            id=LAST_INSERT_ID(id)
        WHERE status='queued' AND (consumer_type=0 OR (7 & consumer_type = consumer_type)) AND (job_type_id IN (9, 10, 12, 15, 16, 18, 19, 20, 32, 24, 25, 28, 29, 30, 31, 33)) AND ((job_type_id IN (13) AND volume IN (1)) OR (job_type_id NOT IN (13))) AND ((job_type_id IN (14, 15, 26, 31) AND volume IN (1)) OR (job_type_id NOT IN (14, 15, 26, 31))) ORDER BY priority, id ASC LIMIT 1

These errors keep repeating in a loop.

Is this a known issue?
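When the cluster hangs like this, a few diagnostic statements can help show whether a node is stalled in flow control or holding an old transaction that the BF (brute-force) applier is waiting on. This is a generic sketch, not something from our setup; run it on each node while the hang is in progress:

```sql
-- Check the node's replication state ('Synced' is healthy)
SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';

-- Fraction of time replication was paused by flow control since the last FLUSH STATUS
SHOW GLOBAL STATUS LIKE 'wsrep_flow_control_paused';

-- Look for long-running transactions that may be blocking the applier
SELECT trx_id, trx_state, trx_started, trx_query
FROM information_schema.INNODB_TRX
ORDER BY trx_started
LIMIT 10;
```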



 Comments   
Comment by Karl Dane [ 2022-05-04 ]

We're encountering the same issue with almost the same setup: a 3-node Galera multi-master cluster running MariaDB 10.3.34, Galera 25.3.35.

Everything will run fine for hours or days, and then one node will get into a bad state, dragging down the rest of the cluster. The logs fill up with hundreds of:

[Note] InnoDB: WSREP: BF lock wait long for trx:2107613546 query: INSERT INTO ...', `mech_cat_id` = '2', `mech_id` = '2', `name` = 'step-normal-reveal' ...

The cluster doesn't recover until the affected node is killed.

Comment by Khai Ping [ 2022-05-09 ]

We resolved it by reverting wsrep_slave_threads from 2 to 1.
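For anyone wanting to try the same workaround: wsrep_slave_threads is a dynamic system variable, so it can be changed at runtime and then persisted in the server config. A minimal sketch (exact config file layout depends on your installation):

```sql
-- Apply immediately on the running node (dynamic variable, no restart needed)
SET GLOBAL wsrep_slave_threads = 1;

-- Verify the new value took effect
SHOW GLOBAL VARIABLES LIKE 'wsrep_slave_threads';
```

To survive restarts, also set `wsrep_slave_threads=1` under the `[mysqld]` section of the server's my.cnf.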

Comment by Luke Cousins [ 2022-10-20 ]

We're getting this issue almost weekly on our 3 node cluster running 10.8.5.

There's nothing in the logs before a barrage of `InnoDB: WSREP: BF lock wait long for trx:0xdc38a55 query:...` messages, and to clear the issue the node needs to be force-killed, have its data dir cleared, and do a full SST; IST fails. Until we kill the node, the entire cluster is locked up.

What can we do to help get this fixed? How can we get information to help you fix it? Thanks.

Comment by Kin [ 2023-04-19 ]

Thanks @Khai Ping, it appears that wsrep_slave_threads was set to its default value of 4 in our Bitnami MariaDB Galera Helm chart, while the pods had a CPU request/limit of only 500 mCPU, causing the nodes to hit wsrep-related errors.
To my understanding, this value should match the number of available CPU cores, so I gave the nodes a CPU request/limit of 4 CPU and now the cluster runs without issues.

Comment by Khai Ping [ 2023-04-19 ]

@kin, do you mean you also faced the same issue as us, and resolved it by setting wsrep_slave_threads to 1 too?

Comment by Kin [ 2023-04-19 ]

@Khai Ping, yes. My wsrep_slave_threads was set to 4 while my pod had a CPU request/limit of 500 mCPU. This caused one or more nodes to hit wsrep issues like "BF lock wait long" and "WSREP: BF applier failed to open_and_lock_tables:", which led to the nodes dropping out of quorum.

I first tested with wsrep_slave_threads=1, which ran stably despite the 500 mCPU limit. Then I set my pod spec's CPU request/limit to 3 CPU and set wsrep_slave_threads=3. I have been running and testing this config for two days and it runs stably.

All this is based on the official documentation for wsrep_slave_threads, which states it should match the number of CPU cores.
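To confirm how many applier threads are actually running after changing the setting, the replication threads can be counted from the processlist (a generic sketch; in MariaDB the Galera applier threads appear under USER = 'system user', though the count may include a couple of non-applier system threads as well):

```sql
-- Count Galera applier/system threads currently registered
SELECT COUNT(*) AS system_threads
FROM information_schema.PROCESSLIST
WHERE USER = 'system user';
```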

Comment by Khai Ping [ 2023-04-19 ]

@kin, I see, that's great news; glad that resolved it for you.

Comment by Kin [ 2023-05-17 ]

@Khai Ping, unfortunately a write conflict occurred after running it for two weeks without issues. Going to set it back to 1 and see if this will run stably for a longer period.

Generated at Thu Feb 08 09:58:42 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.