Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.0.26
-
linux
Description
When doing STOP SLAVE SQL_THREAD and START SLAVE SQL_THREAD repeatedly parallel replication sometimes fails on START with a duplicate key error. SHOW SLAVE STATUS then always shows a relay log position at the beginning of a log file, e.g. "Relay_Log_Pos: 4". This only seems to happen if a large transaction is split across more than one relay log file.
How to reproduce:
Master config:
[mysqld]
|
server-id=1
|
log-bin=master-bin
|
binlog-format=statement
|
Slave config:
[mysqld]
|
server-id=2
|
slave_parallel_threads=20
|
max_relay_log_size=98304
|
Create replication user on the master:
CREATE USER 'repl'@'%' IDENTIFIED BY 'password';
|
GRANT ALL PRIVILEGES ON *.* TO 'repl'@'%';
|
RESET MASTER;
|
Start slave:
SET GLOBAL gtid_slave_pos='0-1-2';
|
CHANGE MASTER TO Master_host='mymasterip', Master_user='repl', Master_password='password', master_use_gtid=slave_pos;
|
START SLAVE;
|
On the master: create a simple PHP script "test.php" to populate a table using given domain ID (modulo 3) and table name in transactions of 1000 rows each (transactions will take about 40K binlog space each, so talking about half of the max_relaylog_size which makes hitting a transaction split very likely)
<?php
|
|
$domain_id = $argv[1]%3 + 1;
|
$table = $argv[2];
|
|
echo "SET gtid_domain_id=$domain_id;";
|
|
echo "DROP TABLE IF EXISTS test.$table;";
|
echo "CREATE TABLE test.$table(id int primary key, msg varchar(100));";
|
|
$n=1;
|
|
while (true) {
|
echo "BEGIN;\n";
|
for ($i = 1; $i < 1000; $i++) {
|
echo "INSERT INTO test.$table VALUES($n, MD5(RAND()));\n";
|
$n++;
|
}
|
echo "COMMIT;\n";
|
}
|
Then run several instances of it in parallel:
for a in $(seq 20); do (php test.php $a t$a | mysql &) ; done
|
Now on the slave run this script to repeatedly stop and restart the SQL thread:
( sudo mysql <<< "select version(); start slave sql_thread"; while date; do
|
sudo mysql <<< 'stop slave sql_thread; show slave status\G'
|
gsp_stopped="$(sudo mysql <<< 'SELECT @@GLOBAL.gtid_slave_pos;')"; echo "$gsp_stopped";
|
sudo mysql <<< 'start slave sql_thread;'; sleep 5;
|
sss="$(sudo mysql <<< 'show slave status\G')"
|
gsp="$(sudo mysql <<< 'SELECT @@GLOBAL.gtid_slave_pos')"; echo "$sss"; echo "$gsp";
|
if test "$(awk '$1 == "Slave_SQL_Running:"{print $2}' <<< "$sss")" == "Yes"; then echo; echo "sleeping..."; sleep 5; else echo "things are broken..."; break; fi; done; ) | tee test.log
|
This usually fails with a duplicate key error after a small number of iterations already
Attachments
Issue Links
- relates to
-
MDEV-9138 Relay log position corrupted with parallel replication after interrupting LOAD DATA on master
- Closed