[MDEV-27308] 3 problems encountered when node failure during Galera fragmented transaction running Created: 2021-12-19 Updated: 2022-06-15 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Galera, Galera SST |
| Affects Version/s: | 10.5.12 |
| Fix Version/s: | 10.5 |
| Type: | Bug | Priority: | Major |
| Reporter: | William Wong | Assignee: | Alexey |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Redhat 7 on VMware |
||
| Attachments: |
|
| Description |
|
Hi,

In our production environment, Galera transaction fragmentation is used for running batch jobs. In some incidents (tmp directory full, VM reboot during a hardware memory issue), we encountered the 3 problems below.

Problem #1: SST is triggered to recover the failed node, but IST is expected

The workaround is to restart the node manually. However, Galera should recover automatically after a hardware issue, using IST in most cases. A repeatable test case (galera-donor-desync.txt) is attached. |
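
For readers unfamiliar with the IST/SST distinction behind Problem #1, the usual eligibility rule can be sketched as below. This is a simplified illustration, not Galera's actual implementation and not taken from the attached test case: the names `NodeState` and `expected_transfer` are hypothetical, and the inputs are assumed to come from the joiner's saved state (UUID and seqno) and from the donor's `wsrep_local_cached_downto` status value (the lowest seqno still held in its gcache).

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Saved replication position of the rejoining node (hypothetical type)."""
    uuid: str   # cluster state UUID the node last belonged to
    seqno: int  # last committed seqno; -1 means the position is unknown


def expected_transfer(joiner: NodeState, cluster_uuid: str,
                      donor_cached_downto: int) -> str:
    """Return "IST" or "SST" under the simplified rule described above.

    Assumption: IST is possible only when the joiner has a known position in
    the same cluster history AND the donor's gcache still holds every write
    set after that position.
    """
    # Different history or unknown position: only a full snapshot can help.
    if joiner.uuid != cluster_uuid or joiner.seqno < 0:
        return "SST"
    # The first write set the joiner needs is seqno + 1; it must be cached.
    if joiner.seqno + 1 >= donor_cached_downto:
        return "IST"
    return "SST"


if __name__ == "__main__":
    # Illustrative values only: the donor caches write sets from seqno 100.
    print(expected_transfer(NodeState("abc", 150), "abc", 100))
    print(expected_transfer(NodeState("abc", 50), "abc", 100))
    print(expected_transfer(NodeState("abc", -1), "abc", 100))
```

Under this rule, a node that was down only briefly and whose saved position is intact would be expected to rejoin through IST, which is the behavior this report expects.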