Details
Description
*** This is a refined description, the original one follows it ***
When mysqldump run on mysql system database it generates inserts sql commands
into mysql.gtid_slave_pos.
There're a couple of issues with that fact when the script is going to provision
a new slave instance. That is when mysqldump runs with --dump-slave
1. without --gtid option the script misses out
SET @@global.gtid_slave_pos = <value>
In effect the script, that is to run after the new slave server has been initialized, can not update internal state of slave server objects incl.
rpl_global_gtid_slave_state.last_sub_id
When replication begins this value may start off 1 or be derived from a
pre-existing record of the table to eventually reach
a possibly greater inserted value of
mysql.gtid_slave_pos.sub_id
column which is a reason of a duplicate key error described specifically in a scenario of the original description below.
2. even with --gtid the SET statement is placed in the script before the mysql.gtid_slave_pos inserts block, so an accumulative effect of replying
the two on the new slave is similar to p.1 (SET would indeed act as to create a pre-existing record).
While both effects can also be observed if a "genuine" strongly unrecommended manual insert is done into the table, to my view that merits a separate work in MDEV-34564.
Clearly a mere relocating of
SET @@global.gtid_slave_pos = <value>
onto a position after the insert block sorts out the 2nd mysqldump issue. To tackle the 1st one apparently requires more efforts that it deemed and thus is deferred to MDEV-34564 Specifically It is not sufficient to chose <value> to be @@global.gtid_slave_pos itself as such measure would purge the old content of table that is to risk losing a pre-existing state.
The aimed output therefore should be like in the following block:
LOCK TABLES `gtid_slave_pos` WRITE;
|
/*!40000 ALTER TABLE `gtid_slave_pos` DISABLE KEYS */; |
INSERT INTO `gtid_slave_pos` VALUES |
(0,66,1,1),
|
...
|
(0,85,1,20)
|
/*!40000 ALTER TABLE `gtid_slave_pos` ENABLE KEYS */; |
UNLOCK TABLES;
|
 |
-- without ---gtid
|
-- SET GLOBAL gtid_slave_pos = @@global.gtid_slave_pos;
|
SET GLOBAL gtid_slave_pos='0-1-20'; |
The last SET statement ensures that the mysqldump script sets
rpl_global_gtid_slave_state.last_sub_id
to the max of mysql.gtid_slave_pos.sub_id in the table and by that rules out duplicate key error possibility.
*** The original description ***
When restoring a master backup where, for whatever reason, gtid_slave_pos contains entries (master<->master setup, intermediate slave, former slave promoted to master), and using classic master_log_file/pos to set up replication, a newly set up slave will run for a while, counting up its on "sub_id" from one, until reaching the masters "sub_id" values still present in the restored gtid_slave_pos. At that point replication will stop with a duplicate key error.
How to reproduce:
- set up a master, insert a "fake" entry into gtid_slave_pos:
INSERT INTO mysql.gtid_slave_pos VALUES(0,5,1,23);
|
- take a master backup
mysqldump --all-databases --single-transaction --master-data > dump.sql
|
- set up slave, set up replication basics:
CHANGE MASTER TO MASTER_HOST='...'
|
, MASTER_USER='...'
|
, MASTER_PASSWORD='...';
|
- restore dump on slave
- start slave, check slave status to see that things are running, check mysql.gtid_slave_pos contents. Note the first line inserted by the slave, and the 2nd one originating from the master still being there:
+-----------+--------+-----------+--------+
|
| domain_id | sub_id | server_id | seq_no |
|
+-----------+--------+-----------+--------+
|
| 0 | 1 | 2 | 5 |
|
| 0 | 5 | 1 | 23 |
|
+-----------+--------+-----------+--------+
|
- perform some DDL or DML statements on the master
- check slave status:
Last_Error: Error during XID COMMIT: failed to update GTID state in mysql.gtid_slave_pos: 1062: Duplicate entry '0-5' for key 'PRIMARY'
|
The gtid_slave_pos table is truncated when setting the gtid_slave_pos variable with SET GLOBAL, but is left as is on
CHANGE MASTER TO MASTER_LOG_FILE='...', MASTER_LOG_POS=...;
|
Proposed fixes:
TRUNCATE the table on explicit CHANGE MASTER TO MASTER_LOG_FILE, and move the CHANGE MASTER TO command from beginning to end of mysqldump output
Attachments
Issue Links
- causes
-
MDEV-35098 rpl.rpl_mysqldump_gtid_slave_pos fails in buildbot
- Open
- relates to
-
MDEV-18828 Error during XID COMMIT: failed to update GTID state in mysql.gtid_slave_pos: 1062: Duplicate entry '0-3913850' for key 'PRIMARY'
- Closed
-
MDEV-34615 mysqldump wipes off pre-existing gtid slave state
- Stalled