[MXS-3971] Failover after switchover fails if no transaction is ran between them Created: 2022-01-30 Updated: 2022-09-26 Resolved: 2022-09-26 |
|
| Status: | Closed |
| Project: | MariaDB MaxScale |
| Component/s: | mariadbmon |
| Affects Version/s: | 6.2.1 |
| Fix Version/s: | 6.4.2 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Assen Totin (Inactive) | Assignee: | Esa Korhonen |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Sprint: | MXS-SPRINT-165, MXS-SPRINT-166 |
| Description |
|
I recently stumbled onto an intriguing MaxScale phenomenon. It is perhaps not exactly a bug, but there still seems to be a way to improve MaxScale and avoid what I experienced. My setup and flow were as follows:
Digging into this, I found that if at least one transaction is run through the newly promoted master after the switchover, then all GTID values are updated on all nodes, and turning off this promoted master results in MaxScale successfully choosing one of the remaining slaves and making it the master. If, however, no transaction was run through the newly promoted master between the switchover and the disconnection of this promoted master, MaxScale is unable to perform a failover. The same effect might occur with two failovers or two switchovers in a row; I have not tested this. If the reason for the error is indeed that some of the GTID variables are only updated after at least one transaction has passed through the newly promoted master, then perhaps the easiest mitigation is to make MaxScale run an artificial transaction after each failover or switchover as part of the process (and then run another one to negate the first). |
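
To illustrate the mitigation suggested above, here is a minimal sketch of what such an artificial transaction pair could look like if issued manually against the newly promoted master. This is not an existing MaxScale feature: the helper name `noop_transaction_pair`, the scratch table `maxscale_probe.heartbeat` and its `id` column are all hypothetical, and the table is assumed to already exist. The function only builds the SQL statements; running them through a client connection is left to the operator.

```python
def noop_transaction_pair(table="maxscale_probe.heartbeat", marker_id=1):
    """Build two small transactions: one that writes a marker row and one
    that removes it again, so the data ends up unchanged.

    The table name and its `id` column are hypothetical; adjust to a
    scratch table that actually exists on the cluster.
    """
    # Only accept plain `schema.table` style identifiers, since the name is
    # interpolated directly into the SQL text.
    if not all(part.isidentifier() for part in table.split(".")):
        raise ValueError(f"unexpected table name: {table!r}")

    marker = int(marker_id)

    # First transaction: the write that should advance the GTID position.
    first = [
        "BEGIN",
        f"INSERT INTO {table} (id) VALUES ({marker})",
        "COMMIT",
    ]
    # Second transaction: negates the first one.
    second = [
        "BEGIN",
        f"DELETE FROM {table} WHERE id = {marker}",
        "COMMIT",
    ]
    return first, second
```

Run the first list of statements on the promoted master right after the switchover, then the second. Whether this is enough to avoid the failed failover is an assumption based on the observation above and has not been verified.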