[MDEV-31798] [ERROR] WSREP: FSM: no such a transition APPLYING -> COMMITTED Created: 2023-07-29  Updated: 2023-08-06

Status: Open
Project: MariaDB Server
Component/s: Galera, Server
Affects Version/s: 10.6.10, 10.6.14
Fix Version/s: None

Type: Bug Priority: Major
Reporter: William Wong Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Red Hat 7 on VMware


Attachments: File node1.mariadb-error.log-20230727     File node1.mariadb.cnf     File node2.mariadb-error.log-20230727     File node2.mariadb.cnf    

 Description   

One of our databases runs on a cluster of 2 DB nodes + 1 arbitrator, with HAProxy on top. Both DB nodes originally ran MariaDB 10.6.10 with Galera 26.4.12.

We upgraded one DB node to MariaDB 10.6.14 (Galera 26.4.14) and switched new HAProxy traffic to the upgraded node. At this point, one DB node ran 10.6.14 (Galera 26.4.14) and the other ran 10.6.10 (Galera 26.4.12), and both nodes had connections from the application. After 3 minutes, the 10.6.14 node hit an FSM error and crashed. The crashed node restarted automatically and rejoined the cluster, but about 3 minutes later it crashed again. This cycle kept repeating.

[ERROR] WSREP: FSM: no such a transition APPLYING -> COMMITTED
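
To check the roughly three-minute crash interval against the attached error logs, a small script can pull out the timestamp of each FSM error and compute the gaps between them. This is a minimal sketch. It assumes each log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp, and the sample lines below are invented for illustration, not taken from the attachments.

```python
import re
from datetime import datetime

# Error text as quoted in this report.
FSM_ERROR = "WSREP: FSM: no such a transition APPLYING -> COMMITTED"

# Assumption: each log line begins with "YYYY-MM-DD HH:MM:SS"
# (the hour may be space-padded to two characters).
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+(\d{1,2}:\d{2}:\d{2})")


def fsm_error_times(lines):
    """Return the timestamps of all lines containing the FSM error."""
    times = []
    for line in lines:
        if FSM_ERROR not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            stamp = f"{match.group(1)} {match.group(2)}"
            times.append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"))
    return times


def crash_intervals_seconds(times):
    """Return the gaps, in seconds, between consecutive FSM errors."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# Invented sample lines, for illustration only.
SAMPLE_LINES = [
    "2023-07-27 10:03:10 2 [ERROR] WSREP: FSM: no such a transition APPLYING -> COMMITTED",
    "2023-07-27 10:03:15 0 [Note] unrelated message",
    "2023-07-27 10:06:12 2 [ERROR] WSREP: FSM: no such a transition APPLYING -> COMMITTED",
]

print(crash_intervals_seconds(fsm_error_times(SAMPLE_LINES)))
```

To run it on a real log, pass the open file object in place of SAMPLE_LINES; consistent gaps close to 180 seconds would match the behavior described above.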

We temporarily switched ALL DB traffic to the 10.6.10 node, and the 10.6.14 node has NOT crashed since.

We are worried that if we upgrade the remaining DB node to MariaDB 10.6.14, both DB nodes will crash and cannot be started.

Kindly advise how to troubleshoot this case.

The DB parameter files and error logs of both nodes are attached.



 Comments   
Comment by William Wong [ 2023-08-06 ]

After further checking, the problem appears to be related to a sequence object created with the "nocache" setting. Its increment parameter is already set to 0.
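
To find which sequences are affected, one option is to scan a schema dump (or collected SHOW CREATE SEQUENCE output) for definitions that use NOCACHE. This is a rough sketch only. It assumes the definitions appear as plain CREATE SEQUENCE statements terminated by ";", and the sequence names in the sample are hypothetical.

```python
import re

# Matches a CREATE SEQUENCE statement and captures its name and option text.
CREATE_SEQUENCE = re.compile(
    r"CREATE\s+(?:OR\s+REPLACE\s+)?SEQUENCE\s+(?:IF\s+NOT\s+EXISTS\s+)?"
    r"(?P<name>`[^`]+`(?:\.`[^`]+`)?|[\w.]+)(?P<options>[^;]*)",
    re.IGNORECASE,
)
NOCACHE = re.compile(r"\bNOCACHE\b", re.IGNORECASE)
INCREMENT = re.compile(r"\bINCREMENT\s*(?:BY\s+|=\s*)?(-?\d+)", re.IGNORECASE)


def nocache_sequences(sql_text):
    """Return (name, increment) for every sequence defined with NOCACHE.

    increment is None when the statement does not state one explicitly.
    """
    found = []
    for match in CREATE_SEQUENCE.finditer(sql_text):
        options = match.group("options")
        if not NOCACHE.search(options):
            continue
        inc = INCREMENT.search(options)
        name = match.group("name").replace("`", "")
        found.append((name, int(inc.group(1)) if inc else None))
    return found


# Hypothetical definitions, for illustration only.
SAMPLE_SQL = """
CREATE SEQUENCE `order_seq` start with 1 minvalue 1 increment by 0 nocache nocycle ENGINE=InnoDB;
CREATE SEQUENCE invoice_seq INCREMENT BY 0 CACHE 1000;
"""

print(nocache_sequences(SAMPLE_SQL))
```

Each reported sequence is a candidate for reproducing the crash in a test cluster; the script only lists definitions and does not confirm that NOCACHE is the cause.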

We have another problem when both nodes run 10.6.14. Since that test case is not the same (this case has one node on 10.6.10 and one node on 10.6.14), I will open another case for it.
