MariaDB Server / MDEV-27130

Node of Galera Cluster with 5 members freezes after directing traffic to it

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Cannot Reproduce
    • Affects Version/s: 10.5.13
    • Fix Version/s: N/A
    • None
    • Environment: MariaDB 10.5.13, Galera provider 26.4.10, FreeBSD 12 and 13, ZFS storage, 1500 GB RAM, 64 or 96 cores

    Description

      Hello all,

      We updated our MariaDB Galera Cluster to 10.5.13 last week. Since then we have been facing the following issue every time we try to switch the "master" node.

      When we switch the traffic from one node to another while the service is under medium load (15,000-20,000 q/s), the new node freezes in a brutal way.

      The wsrep status still shows it as a normal member of the cluster (Synced, with 5 IPs listed), but the other members exclude it from the quorum.
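
      For reference, a minimal sketch of how a node's own view of the cluster can be checked from the shell (this assumes a local mysql client with sufficient privileges; the status variable names are the standard Galera ones):

          # Ask the node how it sees itself and the rest of the cluster
          mysql -e "SHOW GLOBAL STATUS WHERE Variable_name IN
                    ('wsrep_ready','wsrep_local_state_comment','wsrep_cluster_status',
                     'wsrep_cluster_size','wsrep_incoming_addresses')"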

      The log fills up, in an infinite loop, with messages like the following:

      InnoDB: WSREP: BF lock wait long for trx:11255701331 query: INSERT INTO

      The same 7-10 unique INSERTs are repeated over and over in the log.
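
      A hedged sketch (not from the report itself) of how the transactions behind these repeated "BF lock wait long" messages could be inspected while the node is stuck, assuming the server still answers on its local socket:

          # List the running InnoDB transactions and what they are executing
          mysql -e "SELECT trx_id, trx_state, trx_started, LEFT(trx_query, 80) AS query_head
                    FROM information_schema.INNODB_TRX ORDER BY trx_started"

          # Full lock and semaphore picture straight from InnoDB
          mysql -e "SHOW ENGINE INNODB STATUS\G"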

      The whole cluster freezes until we shut down mysqld on the "bad" node with a regular service shutdown: /usr/local/etc/rc.d/mysql-server onestop.

      When we try to stop the node with the regular shutdown procedure, the node is excluded from the cluster and service operations continue as normal. However, mysqld keeps printing the queries noted above in an infinite loop. The only way to stop the mysqld process on the "bad" node is kill -9.
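
      A rough sketch of that shutdown sequence (the rc.d path is taken from this report; the pgrep/kill part only spells out the forced kill mentioned above, and the process name is assumed to be mysqld):

          # Regular service stop on the "bad" node
          /usr/local/etc/rc.d/mysql-server onestop

          # If mysqld survives and only keeps looping on the same INSERTs,
          # the only thing left is a hard kill
          pgrep mysqld && kill -9 $(pgrep mysqld)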

      The specific thing here is that the query (an INSERT) and the target table are always the same. We have a massive INSERT load on this table and one daemon that processes the data in the background. If we stop the daemon before switching nodes, there are no issues.

      Cheers
      Rumen

      Attachments

        Activity

          Hello there,

          We have the same issue on a 3-node MariaDB Galera Cluster running 10.4.24 on Debian 10.

          The SQL workload is fully load balanced between the 3 nodes; the average metrics during a workday are, on each node:

          • selects: ~4000/s
          • updates: ~200/s
          • inserts: ~80/s
          • deletes: < 1/s

          The issue mostly happens with 2 or 3 INSERT transactions on the same table (with different values). The table has about 43M rows, with 31 GB of data and 20 GB of indexes.

          When it happens, a graceful stop of the "bad" node does unblock the whole cluster, but it always takes a SIGKILL before the service can be restarted. Usually the restarted node syncs with IST; sometimes it needs an SST (even though the configured 1 GB gcache should cover about 3 hours of downtime...).
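
          As a side note (not part of the original comment), the gcache size that decides between IST and SST can be double-checked on a running node roughly like this:

              # gcache.size is embedded in the provider options string
              mysql -e "SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options'\G" | tr ';' '\n' | grep gcache.size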

          We keep track of each incident with a full processlist in JSON format plus the mysqld log (with the "WSREP: BF lock wait long for" messages). How can we privately send you this information, which might be useful for troubleshooting the issue?
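
          A minimal sketch of how such an incident snapshot could be captured (plain \G output rather than JSON here; the file names and error-log path are illustrative, not from the comment):

              # Full processlist at the moment of the hang
              mysql -e "SELECT * FROM information_schema.PROCESSLIST\G" > processlist_$(date +%s).txt

              # Matching server-side messages (adjust the log path to the local setup)
              grep 'BF lock wait long' /var/log/mysql/error.log > bf_lock_wait_$(date +%s).txt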

          mathieucausero Mathieu CAUSERO added a comment

          Hi Mathieu CAUSERO,

          A backtrace of all threads, taken with gdb while the node hangs, would be a good start. I think it can be shared in public, as it does not contain any really sensitive information (please review it and check before attaching if in doubt).

          valerii Valerii Kravchuk added a comment
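
          A hedged sketch of how such a backtrace is typically collected (this assumes gdb and debug symbols are installed, and that the server process is named mysqld as in this report; on newer packages it may be mariadbd):

              # Attach to the hanging server and dump a backtrace of every thread
              gdb -p "$(pgrep -n mysqld)" -batch -ex "thread apply all bt full" > all_threads_backtrace.txt 2>&1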

          mathieucausero I have a few questions. Do you have a primary key on all of the affected tables? Do you have foreign keys between the affected tables? Do you use an auto-increment column as the primary key, and if you do, do you set a different offset for every node?

          jplindst Jan Lindström (Inactive) added a comment
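
          For the auto-increment part of the question, the relevant settings could be compared across nodes roughly like this (when wsrep_auto_increment_control is ON, the increment and offset are managed automatically):

              # Run on every node and compare the values
              mysql -e "SHOW GLOBAL VARIABLES WHERE Variable_name IN
                        ('auto_increment_increment','auto_increment_offset','wsrep_auto_increment_control')"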
          mathieucausero Mathieu CAUSERO added a comment - edited

          @Jan Lindström, to answer your questions:

          • primary key: the table mainly involved in these freezes has a composite primary key (for an unknown reason) -> PRIMARY KEY (`IDappels`,`identifiantProprietaire`), roughly as sketched below
          • no foreign keys on these tables
          • auto-increment is set on the primary key (in the example above, on IDappels)

          From what I see in our customer's database, every impacted table (i.e. every table that has had queries involved in a cluster deadlock) is built like this.

          Would that composite PK be an issue?

          To add a bit more context, the most impacted table holds 68M rows with 36 GB of data. It takes dozens of INSERTs per second from the 3 Galera nodes (all of them are used as writers). The servers are all bare-metal with 128 GB of RAM (the InnoDB buffer pool is larger than data + indexes), NVMe storage, and a 2 Gbit/s private network with MTU 9000.
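
          Purely for illustration, a schematic of a table with that key layout (the table name, column types, and database name are hypothetical; only the auto-increment first column, the composite primary key, and the absence of foreign keys come from the comment):

              # 'test' is a placeholder database name; only the key layout mirrors the comment
              mysql -D test -e "
                CREATE TABLE appels_sketch (
                  IDappels BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
                  identifiantProprietaire INT UNSIGNED NOT NULL,
                  payload VARCHAR(255),
                  PRIMARY KEY (IDappels, identifiantProprietaire)
                ) ENGINE=InnoDB"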


          There have been significant fixes to Galera conflict resolution and multi-master execution. I suggest trying a more recent version of MariaDB and of the Galera library.

          janlindstrom Jan Lindström added a comment
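
          If useful, the server and provider versions actually running on a node can be confirmed with something like:

              # MariaDB server version and the loaded Galera provider version
              mysql -e "SELECT VERSION(); SHOW GLOBAL STATUS LIKE 'wsrep_provider_version'"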

          People

            Assignee: Seppo Jaakola (seppo)
            Reporter: Rumen Palov (skarida)
            Votes: 5
            Watchers: 10

