Details
- Bug
- Status: Closed
- Major
- Resolution: Duplicate
- 10.1.19, 10.1.20
- None
Description
I've simplified my situation for this bug description:
We have two independent MariaDB clusters, each a regular master-slave setup (let's call the masters A and B). There is also a MariaDB instance acting as a data warehouse, which uses multi-source replication from both masters. All replication channels were created in the pre-GTID era using binlog_file and binlog_pos. Of course, both masters had already generated GTIDs for the default domain id 0.
When we migrated to GTID-based replication, we configured master A with domain id 1 and master B with domain id 2. All slaves in group A now have two GTIDs in gtid_slave_pos: one for domain 1 with an increasing sequence counter, and one with a static sequence counter for the former default domain 0. Master A also keeps track of this domain 0 GTID via gtid_binlog_pos (and gtid_binlog_state).
For master B and its slaves the same applies for domain 2 and 0, respectively. So far this is not a problem.
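For illustration, a position like the one described above can be split into its per-domain entries. This is a minimal Python sketch, not part of the original report: a GTID has the form domain-server_id-sequence, and the server ids and sequence numbers below are made up.

```python
def parse_gtid_pos(pos):
    """Split a gtid_slave_pos-style string into {domain_id: (server_id, seq_no)}."""
    result = {}
    for item in pos.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate an empty position string
        domain, server_id, seq_no = (int(part) for part in item.split("-"))
        result[domain] = (server_id, seq_no)
    return result


# Hypothetical position of a slave in group A after the migration:
# domain 0 is frozen at its last pre-migration GTID, domain 1 keeps advancing.
slave_a = parse_gtid_pos("0-101-48213,1-101-977")
```

Here `slave_a` has one entry for domain 0 and one for domain 1, matching the two GTIDs mentioned in the description.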
However, it's not possible to introduce GTID-based replication on the warehouse. The warehouse's last GTID for the default domain 0, written in the pre-GTID era, originated from master B and has a lower sequence number than master A's GTID for domain 0.
Therefore, when executing
CHANGE MASTER "A" TO master_use_gtid = slave_pos, do_domain_ids = (1), ignore_domain_ids = ();
on the warehouse to allow its replication to use GTID, A attempts to scan the binlog not only for domain 1, but also for domain 0 (despite do_domain_ids). Because the sequence number for domain 0 is lower than the one in A's gtid_binlog_pos, A refuses the connection with
Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 0-XXXX-YYYY, which is not in the master's binlog'
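The refusal can be sketched with a toy model in Python. It is a deliberate simplification and not the server's actual algorithm: it assumes the master checks every domain in the slave's position, skips domains it has never logged, and rejects the connection when a requested GTID is absent from its binlog. The function name and all ids and sequence numbers are hypothetical.

```python
def master_accepts(slave_pos, master_binlog):
    """Toy model of the connect-time check; returns (ok, offending_gtid).

    slave_pos:     {domain_id: (server_id, seq_no)} sent by the connecting slave
    master_binlog: {domain_id: set of (server_id, seq_no)} found in the master's binlog
    """
    for domain, gtid in sorted(slave_pos.items()):
        known = master_binlog.get(domain)
        if known is None:
            continue  # assumption: a domain the master never logged is not checked
        if gtid not in known:
            return False, (domain,) + gtid
    return True, None


# The warehouse's last domain 0 GTID came from master B (server_id 201 here),
# so master A (server_id 101) cannot find it in its own binlog.
warehouse_pos = {0: (201, 300), 1: (101, 10)}
master_a_binlog = {0: {(101, 500)}, 1: {(101, 10), (101, 11)}}
ok, offending = master_accepts(warehouse_pos, master_a_binlog)
```

In this model the connection is refused because of the domain 0 entry, even though the domain 1 entry is perfectly valid.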
There's no way to discard the knowledge about domain 0 on masters A and B except by setting gtid_binlog_state, which requires a RESET MASTER and is therefore not an option in live operation.
I assume this issue is similar to MDEV-9108, which (as far as I understand) basically asks that do_domain_ids also tell the master to ignore all other domains when scanning the binlogs for the starting position. That would solve my issue.
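The behaviour requested here can be sketched in Python as filtering the slave's position by its domain filter before the master validates it. This is a hypothetical illustration only; the helper names, the toy validation rule, and all values are assumptions rather than server internals.

```python
def filter_pos(slave_pos, do_domain_ids):
    """Keep only the domains the slave actually wants to replicate."""
    if not do_domain_ids:
        return dict(slave_pos)  # an empty list means no filtering
    return {d: g for d, g in slave_pos.items() if d in do_domain_ids}


def missing_on_master(slave_pos, master_binlog):
    """List requested GTIDs whose domain the master knows but whose GTID it lacks."""
    return [(d,) + g for d, g in sorted(slave_pos.items())
            if d in master_binlog and g not in master_binlog[d]]


warehouse_pos = {0: (201, 300), 1: (101, 10)}
master_a_binlog = {0: {(101, 500)}, 1: {(101, 10), (101, 11)}}

# Current behaviour in this model: the stale domain 0 entry is checked too.
unfiltered = missing_on_master(warehouse_pos, master_a_binlog)
# Requested behaviour: only domain 1 is checked, so nothing is missing.
filtered = missing_on_master(filter_pos(warehouse_pos, [1]), master_a_binlog)
```

With the filter applied, the stale domain 0 entry never reaches the check and the connection would be accepted in this model.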
Still, I believe it would be easier to allow altering gtid_binlog_pos on the master (not directly or via gtid_binlog_state, but through a function call) so that it forgets the GTIDs of a specific domain id without issuing RESET MASTER.
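As a rough Python sketch of what such a call would do to the master's per-domain state: `forget_domain` is a hypothetical name for illustration, not an existing server function, and the state values are made up.

```python
def forget_domain(binlog_state, domain_id):
    """Return a copy of a per-domain binlog state without the given domain."""
    return {d: g for d, g in binlog_state.items() if d != domain_id}


# Hypothetical state of master A: a frozen domain 0 entry plus the live domain 1.
master_a_state = {0: (101, 500), 1: (101, 11)}
after = forget_domain(master_a_state, 0)
```

After the call only domain 1 remains, so a slave presenting an old domain 0 GTID would no longer conflict with anything the master remembers, and the original state is left untouched.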
Issue Links
- is duplicated by: MDEV-12012 gtid_domain_id doesn't work with multisource between 10.1 and 10.0 + GTID (Closed)
- relates to: MDEV-9108 "GTID not in master's binlog" error with {ignore|do}_domain_ids (Open)