[MXS-2884] MaxScale secondary master failover not working Created: 2020-02-10 Updated: 2020-08-27 Resolved: 2020-08-27 |
|
| Status: | Closed |
| Project: | MariaDB MaxScale |
| Component/s: | binlogrouter |
| Affects Version/s: | 2.4.6 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Assen Totin (Inactive) | Assignee: | Unassigned |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Description |
|
I've seen tickets on this topic before, but the problem remains unresolved, hence a new one. Even if the binlog router is going to be replaced, the replication slave in MaxScale still needs to support secondary master failover.

Setup: a 3-node Galera cluster running MariaDB 10.4.12, with MaxScale 2.4.6 configured as a binlog slave to the first Galera node and the remaining two Galera nodes configured as secondary masters. The setup uses GTID.

Shutting down the primary master causes MaxScale to fail to continue replication from a secondary master with the following error:

Last_Error: Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.

The GTID is, of course, there: all three Galera nodes (or two, after the node used for replication is shut down) report the same set of two GTIDs. Manually stopping the slave, configuring the secondary master as the primary one and starting the slave resumes replication, but the whole purpose of secondary masters is to ensure transparent failover.

Relevant part of the Galera configuration:
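(The original attachment was not preserved in this export. A minimal sketch of the Galera-side settings such a setup typically requires is shown below; all values, server IDs and domain IDs are hypothetical, not the reporter's actual configuration.)

```ini
# my.cnf fragment on each Galera node -- illustrative values only
[mysqld]
log_bin              = mariadb-bin   # binary logging is required to feed a binlog slave
log_slave_updates    = ON            # write replicated Galera transactions to the binlog
gtid_domain_id       = 1             # shared domain so all nodes emit one GTID stream
wsrep_gtid_mode      = ON            # keep GTIDs identical across the cluster
wsrep_gtid_domain_id = 1
server_id            = 101           # unique per node, e.g. 101 / 102 / 103
```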
MaxScale binlog slave configuration:
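(Again, the attached configuration did not survive the export. A hedged sketch of what a 2.4 binlogrouter service and listener might look like follows; the section names, port and credentials are hypothetical, and the parameter set is an assumption based on the documented binlogrouter options.)

```ini
# maxscale.cnf fragment -- illustrative, not the reporter's file
[BinlogServer]
type                    = service
router                  = binlogrouter
user                    = maxuser
password                = maxpwd
server_id               = 999        # MaxScale's own server_id towards its slaves
binlogdir               = /var/lib/maxscale/binlogs
mariadb10-compatibility = 1          # MariaDB 10 binlog events
mariadb10_master_gtid   = 1          # register with the master using GTID

[BinlogListener]
type     = listener
service  = BinlogServer
protocol = MariaDBClient
port     = 8808
```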
Sequence of commands on MaxScale binlog slave prior to turning off the first Galera node:
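(The exact commands were lost from this export. A sketch of the usual sequence is given below, assuming the binlogrouter's connection-name syntax for secondary masters; hostnames and credentials are hypothetical.)

```sql
-- Issued against the MaxScale binlog listener, not a Galera node.
-- The primary master is configured first, then the remaining two Galera
-- nodes as secondary masters via the router's ':N' connection names.
CHANGE MASTER TO
    MASTER_HOST='galera1', MASTER_PORT=3306,
    MASTER_USER='repl', MASTER_PASSWORD='secret',
    MASTER_USE_GTID=Slave_pos;
CHANGE MASTER ':2' TO MASTER_HOST='galera2', MASTER_PORT=3306;
CHANGE MASTER ':3' TO MASTER_HOST='galera3', MASTER_PORT=3306;
START SLAVE;
```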
MaxScale report on Galera cluster status after the primary master was turned off (note that the GTIDs are there and sync'ed between the remaining two Galera nodes):
|
| Comments |
| Comment by markus makela [ 2020-08-27 ] |
|
If I'm reading this issue correctly, the 2.5 binlogrouter should work as long as the Galera GTIDs are handled correctly. This also means that we can't (or won't) fix this in MaxScale. |