Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.6.7
Environment: redhat x86-64 on vmware
Description
Our Galera cluster has a 3-node configuration (2 DB nodes + 1 arbitrator). Two days ago, one DB node went down due to a hardware issue. The remaining DB node and the arbitrator then went into split brain, and the DB service went down.
According to the logs, the remaining nodes have no messages about each other until the dead node is confirmed down, a window of around 10 seconds. We don't know why the healthy nodes did not declare each other stable during those 10 seconds.
Kindly advise on the direction to take in troubleshooting the problem.
Only 2 Galera timeouts are set; all other timeout settings are still at their default values.
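For reference, the expectation behind this report can be written down as a small quorum calculation. This is only a simplified sketch of a weighted majority rule, not Galera's actual implementation: the node names (db1, db2, garbd) are hypothetical, every node is assumed to have the default weight of 1, and the function has_quorum is made up for illustration.

```python
def has_quorum(prev_primary_weights, surviving, left_gracefully=()):
    """Simplified weighted-majority check (illustrative, not Galera code).

    The surviving component keeps quorum if its total weight is strictly
    more than half the weight of the previous primary component, not
    counting nodes that left gracefully.
    """
    graceful = sum(prev_primary_weights[n] for n in left_gracefully)
    total = sum(prev_primary_weights.values()) - graceful
    alive = sum(prev_primary_weights[n] for n in surviving)
    return alive > total / 2


# Hypothetical node names; weight 1 per node is an assumption.
weights = {"db1": 1, "db2": 1, "garbd": 1}

# db1 crashes; db2 and the arbitrator still see each other: 2 > 1.5
print(has_quorum(weights, ["db2", "garbd"]))  # True

# db2 also loses contact with the arbitrator: 1 > 1.5 is false
print(has_quorum(weights, ["db2"]))  # False
```

Under these assumptions, losing one DB node alone should leave the other DB node and the arbitrator with a majority. The second case is what the split brain looks like from the surviving DB node's side.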
gmcast.peer_timeout=PT10S;
evs.suspect_timeout=PT12S;
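To make the two values above easier to compare against the roughly 10-second gap in the logs, here is a small sketch that turns them into seconds. The helper names parse_provider_options and duration_seconds are made up for this example. The parser handles only the simple PT<n>S form used here, not the full ISO 8601 duration syntax. The input string is just the two settings quoted above, written in the semicolon-separated form normally passed through wsrep_provider_options.

```python
import re


def parse_provider_options(options):
    """Split a 'key=value;key=value;' string into a dict (illustrative helper)."""
    result = {}
    for item in options.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after a trailing ';'
        key, _, value = item.partition("=")
        result[key.strip()] = value.strip()
    return result


def duration_seconds(value):
    """Convert a simple 'PT<n>S' duration to seconds; other forms are rejected."""
    match = re.fullmatch(r"PT(\d+(?:\.\d+)?)S", value)
    if match is None:
        raise ValueError(f"unsupported duration format: {value!r}")
    return float(match.group(1))


opts = parse_provider_options("gmcast.peer_timeout=PT10S; evs.suspect_timeout=PT12S;")
timeouts = {name: duration_seconds(value) for name, value in opts.items()}
print(timeouts)
# {'gmcast.peer_timeout': 10.0, 'evs.suspect_timeout': 12.0}
```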
The DB configuration file and the error log of each node are attached.