Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.5.12
Fix Version/s: None
Environment: Redhat 7 on VMware
Description
Hi,
In our production environment, Galera transaction fragments are used to run batch jobs. During some incidents (tmp directory full, VM reboot caused by a hardware memory issue), we encountered the three problems below.
Problem #1: SST is triggered to recover the failed node, although IST is expected
Problem #2: in some tests, the failed node crashes repeatedly with signal 11 until node 1 commits
Problem #3: the local state of the donor node unexpectedly changes to "Donor/Desynced" after the failed node has recovered
The workaround is to restart the node manually. However, Galera should recover automatically after a hardware issue, using IST in most cases.
A repeatable test case (galera-donor-desync.txt) is attached.
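For anyone trying to confirm Problem #3 on their own cluster, the following is a minimal sketch of how a stuck donor could be flagged. It assumes the rows have already been fetched with `SHOW GLOBAL STATUS LIKE 'wsrep_local_state%'` on the donor node; the function name `is_stuck_donor` and the `transfer_in_progress` flag are hypothetical and not part of Galera or of the attached test case.

```python
from typing import Iterable, Tuple


def is_stuck_donor(status_rows: Iterable[Tuple[str, str]],
                   transfer_in_progress: bool) -> bool:
    """Return True if the node reports Donor/Desynced with no transfer running.

    status_rows: (Variable_name, Value) pairs, as returned by
        SHOW GLOBAL STATUS LIKE 'wsrep_local_state%'.
    transfer_in_progress: hypothetical flag supplied by the caller, True while
        an SST or IST to a joiner is known to be running.
    """
    status = dict(status_rows)
    comment = status.get("wsrep_local_state_comment", "")
    # A donor is expected to return to "Synced" once the state transfer ends.
    return comment == "Donor/Desynced" and not transfer_in_progress


# Example rows as they might appear on the donor after the joiner recovered.
rows_after_recovery = [
    ("wsrep_local_state", "2"),
    ("wsrep_local_state_comment", "Donor/Desynced"),
]
rows_healthy = [
    ("wsrep_local_state", "4"),
    ("wsrep_local_state_comment", "Synced"),
]

stuck = is_stuck_donor(rows_after_recovery, transfer_in_progress=False)
healthy = is_stuck_donor(rows_healthy, transfer_in_progress=False)
```

A check like this only reports the symptom; it does not explain why the donor failed to return to "Synced".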