Details
Type: Bug
Status: Stalled
Priority: Major
Resolution: Unresolved
10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL), 11.3(EOL)
None
Description
Between test case 3 and test case 4 of rpl_change_master_demote.test, the master sends ER_GTID_POSITION_NOT_FOUND_IN_BINLOG2 to the slave because of a race: the slave requests to start at a GTID that is not in the master's binlogs, and the master binlogs a transaction. This puts the master's gtid_binlog_pos ahead of the requested slave_connect_state.
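The condition described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the server's code: the rule "requested GTID absent, but the same domain has a higher sequence number in the binlog" is a simplification of the description, and `Gtid`, `parse_gtid` and `connect_outcome` are hypothetical names. The example values are borrowed from the failure log (`0-2-9` and `0-1-10`); the binlog contents passed in are illustrative.

```python
from typing import List, NamedTuple


class Gtid(NamedTuple):
    domain_id: int
    server_id: int
    seq_no: int


def parse_gtid(text: str) -> Gtid:
    """Parse a single 'domain-server-seq' GTID such as '0-2-9'."""
    domain, server, seq = (int(part) for part in text.strip().split("-"))
    return Gtid(domain, server, seq)


def connect_outcome(requested: Gtid, master_binlog: List[Gtid]) -> str:
    """Toy classification of a slave connect request (assumed rule)."""
    if requested in master_binlog:
        return "ok"
    # Assumption: the error is reported when the requested GTID is absent
    # while the master already has a higher seq_no in the same domain.
    has_newer = any(
        g.domain_id == requested.domain_id and g.seq_no > requested.seq_no
        for g in master_binlog
    )
    if has_newer:
        return "ER_GTID_POSITION_NOT_FOUND_IN_BINLOG2"
    return "not_flagged"  # other server-side checks are out of scope here


# Illustrative binlog: the master has just binlogged 0-1-10.
binlog = [parse_gtid("0-1-10")]
print(connect_outcome(parse_gtid("0-2-9"), binlog))
```

In this model the outcome depends only on whether the master's transaction lands in the binlog before the slave's request is evaluated, which is the ordering sensitivity the description attributes to the race.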
The resulting failure looks like:
rpl.rpl_change_master_demote 'mix' w1 [ fail ]
Test ended at 2024-01-24 10:39:40

CURRENT_TEST: rpl.rpl_change_master_demote
mysqltest: In included file "./include/sync_with_master_gtid.inc":
included from /home/buildbot/buildbot/build/mariadb-10.11.7/mysql-test/suite/rpl/t/rpl_change_master_demote.test at line 149:
At line 48: Failed to sync with master

The result from queries just before the failure was:
< snip >
# * True primary is back to connection 'master'
# * True replica is back to connection 'slave'
##############################################
connection master;
connection slave;
CHANGE MASTER TO master_host='127.0.0.1', master_port=MASTER_PORT, master_user='root', master_use_gtid=slave_pos, master_demote_to_slave=1;
#
# Test Case 4: If gtid_slave_pos and gtid_binlog_pos are equivalent,
# MASTER_DEMOTE_TO_SLAVE=1 will not change gtid_slave_pos.
#
connection master;
# update gtid_binlog_pos and demote it (we have proven this works)
INSERT INTO t1 VALUES (3);
# Update to account for statements to verify replication in include file
CHANGE MASTER TO master_host='127.0.0.1', master_port=SLAVE_PORT, master_user='root', master_use_gtid=slave_pos, master_demote_to_slave=1;
RESET SLAVE ALL;
include/save_master_gtid.inc
connection slave;
include/sync_with_master_gtid.inc
Timeout in master_gtid_wait('0-1-10', 120), current slave GTID position is: 0-2-9.

More results from queries before failure can be found in /dev/shm/var/1/log/rpl_change_master_demote.log

- saving '/dev/shm/var/1/log/rpl.rpl_change_master_demote-mix/' to '/dev/shm/var/log/rpl.rpl_change_master_demote-mix/'

Retrying test rpl.rpl_change_master_demote, attempt(2/2)...
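To make the timeout line easier to read, here is a minimal sketch of why a wait for `0-1-10` is not satisfied by a slave position of `0-2-9`. It assumes the wait compares sequence numbers per replication domain and ignores the server id; that rule, and the helper names `parse_gtid_pos` and `wait_satisfied`, are assumptions made for illustration rather than the server's implementation.

```python
def parse_gtid_pos(text):
    """Parse '0-1-26,1-3-4' into {domain_id: (server_id, seq_no)}."""
    pos = {}
    for item in text.split(","):
        domain, server, seq = (int(p) for p in item.strip().split("-"))
        pos[domain] = (server, seq)
    return pos


def wait_satisfied(target, current):
    """Assumed rule: every target domain must be reached by seq_no."""
    tgt = parse_gtid_pos(target)
    cur = parse_gtid_pos(current)
    return all(
        domain in cur and cur[domain][1] >= seq
        for domain, (_server, seq) in tgt.items()
    )


# Values from the failure log above.
print(wait_satisfied("0-1-10", "0-2-9"))   # slave is one event short
print(wait_satisfied("0-1-10", "0-1-10"))  # what a successful sync looks like
```

Under this reading, the slave stays at sequence number 9 in domain 0 for the full 120 seconds, which fits a slave whose IO thread stopped on the connection error rather than one that is merely slow.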
Issue Links
is part of
- MDEV-33073 always green buildbot (Stalled)
- MDEV-36647 No red leaves in the forest (Open)
relates to
- MDEV-34554 rpl_change_master_demote sporadically fails on buildbot (Closed)
- MDEV-29517 rpl.rpl_change_master_demote sporadically fails in BB (Closed)
This test can also fail with:
rpl.rpl_change_master_demote 'mix' w64 [ fail ]
Test ended at 2024-05-20 09:17:08
CURRENT_TEST: rpl.rpl_change_master_demote
mysqltest: At line 292: "IO thread should not be running after START SLAVE UNTIL master_gtid_pos using a pre-existing GTID"
The result from queries just before the failure was:
< snip >
SELECT VARIABLE_NAME, GLOBAL_VALUE FROM INFORMATION_SCHEMA.SYSTEM_VARIABLES WHERE VARIABLE_NAME LIKE 'gtid_binlog_pos' OR VARIABLE_NAME LIKE 'gtid_slave_pos' ORDER BY VARIABLE_NAME ASC;
VARIABLE_NAME GLOBAL_VALUE
GTID_BINLOG_POS 0-1-26,1-3-4,2-1-3,3-1-2,4-3-2
GTID_SLAVE_POS 0-2-24,1-3-4,2-1-3,3-1-2,4-3-2
CHANGE MASTER TO master_host='127.0.0.1', master_port=SLAVE_PORT, master_user='root', master_use_gtid=Slave_Pos, master_demote_to_slave=1;
SELECT VARIABLE_NAME, GLOBAL_VALUE FROM INFORMATION_SCHEMA.SYSTEM_VARIABLES WHERE VARIABLE_NAME LIKE 'gtid_binlog_pos' OR VARIABLE_NAME LIKE 'gtid_slave_pos' ORDER BY VARIABLE_NAME ASC;
VARIABLE_NAME GLOBAL_VALUE
GTID_BINLOG_POS 0-1-26,1-3-4,2-1-3,3-1-2,4-3-2
GTID_SLAVE_POS 0-1-26,1-3-4,2-1-3,3-1-2,4-3-2
# GTID ssu_middle_binlog_pos should be considered in the past because
# gtid_slave_pos should be updated using the latest binlog gtids.
# The following call to sync_with_master_gtid.inc uses the latest
# binlog position and should still succeed despite the SSU stop
# position pointing to a previous event (because
# master_demote_to_slave=1 merges gtid_binlog_pos into gtid_slave_pos).
START SLAVE UNTIL master_gtid_pos="ssu_middle_binlog_pos";
Warnings:
Note 1278 It is recommended to use --skip-slave-start when doing step-by-step replication with START SLAVE UNTIL; otherwise, you will get problems if you get an unexpected slave's mariadbd restart
# Slave needs time to start and stop automatically
# Validating neither SQL nor IO threads are running..
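The test comments in this log say that master_demote_to_slave=1 merges gtid_binlog_pos into gtid_slave_pos. The sketch below reproduces the before and after values shown in the log, assuming the merge keeps, per domain, the GTID with the higher sequence number and leaves the existing slave entry in place on a tie. That tie rule and the function names are assumptions for illustration only.

```python
def parse_gtid_pos(text):
    """Parse '0-1-26,1-3-4' into {domain_id: (server_id, seq_no)}."""
    pos = {}
    for item in text.split(","):
        domain, server, seq = (int(p) for p in item.strip().split("-"))
        pos[domain] = (server, seq)
    return pos


def merge_positions(slave_pos, binlog_pos):
    """Assumed merge: per domain, keep the entry with the higher seq_no."""
    merged = parse_gtid_pos(slave_pos)
    for domain, (server, seq) in parse_gtid_pos(binlog_pos).items():
        if domain not in merged or seq > merged[domain][1]:
            merged[domain] = (server, seq)
    return ",".join(
        f"{domain}-{server}-{seq}"
        for domain, (server, seq) in sorted(merged.items())
    )


# Values taken from the log before CHANGE MASTER TO.
slave_before = "0-2-24,1-3-4,2-1-3,3-1-2,4-3-2"
binlog_before = "0-1-26,1-3-4,2-1-3,3-1-2,4-3-2"
print(merge_positions(slave_before, binlog_before))
```

With these inputs the merged position matches the GTID_SLAVE_POS printed after CHANGE MASTER TO, so any stop position taken from the middle of the binlog lies in the past relative to it, as the test comments expect.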