Details
- Type: Bug
- Status: Stalled
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 10.1.36
Description
With parallel replication, Seconds_Behind_Master wrongly reports 0 when the SQL thread is stopped and then restarted a long time afterwards.
This happens by design:
https://lists.launchpad.net/maria-developers/msg08958.html
but it is really a show stopper for most proxies, which send traffic to such a slave thinking it is in sync with the master.
My understanding is that Seconds_Behind_Master is only computed after the first commit, so in this case the master is 2 days ahead and on a freshly restarted slave we get this:
| 30 | system user | | tsce_unedic | Connect | 2211 | altering table | OPTIMIZE TABLE `requetes` | 0.000 |
And we can see the wrong Seconds_Behind_Master:
Seconds_Behind_Master: 0
Using_Gtid: Slave_Pos
Gtid_IO_Pos: 0-21-28557589
Parallel_Mode: conservative
but on its master:
gtid_current_pos | 0-21-28570301
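The gap between the two GTID positions above can be measured without relying on Seconds_Behind_Master at all. The following is a minimal Python sketch, assuming positions are given as comma-separated domain-server_id-sequence triples; the helper names parse_gtid_pos and gtid_lag are hypothetical and not part of any MariaDB tool. The result is a count of sequence numbers per replication domain, not a time in seconds.

```python
from typing import Dict, Tuple


def parse_gtid_pos(pos: str) -> Dict[int, Tuple[int, int]]:
    """Parse a GTID position such as '0-21-28557589' (or a comma-separated
    list with one GTID per domain) into {domain_id: (server_id, seq_no)}."""
    result: Dict[int, Tuple[int, int]] = {}
    for item in pos.split(","):
        item = item.strip()
        if not item:
            continue
        domain, server, seq = (int(part) for part in item.split("-"))
        result[domain] = (server, seq)
    return result


def gtid_lag(master_pos: str, slave_pos: str) -> Dict[int, int]:
    """Return, per domain, how many sequence numbers the slave trails the
    master. A domain the slave has never seen counts from sequence 0."""
    master = parse_gtid_pos(master_pos)
    slave = parse_gtid_pos(slave_pos)
    lag: Dict[int, int] = {}
    for domain, (_server, master_seq) in master.items():
        slave_seq = slave.get(domain, (0, 0))[1]
        lag[domain] = max(0, master_seq - slave_seq)
    return lag


if __name__ == "__main__":
    # Values taken from the report: master gtid_current_pos vs. slave Gtid_IO_Pos.
    print(gtid_lag("0-21-28570301", "0-21-28557589"))
```

With the values from this report, the slave trails by 12712 sequence numbers in domain 0 while Seconds_Behind_Master shows 0. Note that Gtid_IO_Pos is the position of the IO thread, so the SQL thread may be even further behind.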
A possible solution would be to update Seconds_Behind_Master by injecting a fake event on START SLAVE carrying the max timestamp of all events read by the leader thread and sent to the worker threads.
To reproduce:
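To make the proposal concrete, here is a toy Python model of the bookkeeping, not MariaDB's implementation. It assumes the reported lag stays at 0 until some event timestamp has been recorded, and that the proposed fix amounts to seeding that timestamp on start; the class LagTracker and its methods are hypothetical names used only for illustration.

```python
from typing import Iterable, Optional


class LagTracker:
    """Toy model of a replica's Seconds_Behind_Master bookkeeping."""

    def __init__(self) -> None:
        # Timestamp (seconds) of the last master event accounted for, if any.
        self.last_master_timestamp: Optional[int] = None

    def on_start(self, queued_event_timestamps: Iterable[int], seed: bool) -> None:
        """Called on START SLAVE. With seed=True, mimic the proposed fake
        event by recording the max timestamp of the events already read."""
        self.last_master_timestamp = None
        if seed:
            timestamps = list(queued_event_timestamps)
            if timestamps:
                self.last_master_timestamp = max(timestamps)

    def on_commit(self, event_timestamp: int) -> None:
        """Called when a worker commits an event group."""
        self.last_master_timestamp = event_timestamp

    def seconds_behind(self, now: int) -> int:
        """Lag as reported to clients; 0 when no timestamp is known yet."""
        if self.last_master_timestamp is None:
            return 0
        return max(0, now - self.last_master_timestamp)


if __name__ == "__main__":
    tracker = LagTracker()
    tracker.on_start([1000, 1100], seed=False)
    print(tracker.seconds_behind(now=2000))  # no commit yet, so no lag is visible
    tracker.on_start([1000, 1100], seed=True)
    print(tracker.seconds_behind(now=2000))  # lag is visible before the first commit
```

In this model the unseeded tracker reports 0 until the first commit, which is the behaviour described in this report, while the seeded tracker exposes a non-zero lag immediately after start.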
--source include/have_innodb.inc
--source include/have_binlog_format_mixed.inc
--let $rpl_topology=1->2
--source include/rpl_init.inc

# Test various aspects of parallel replication.

--connection server_1
ALTER TABLE mysql.gtid_slave_pos ENGINE=InnoDB;

CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=InnoDB;
--save_master_pos

--connection server_2
--sync_with_master
--source include/stop_slave_sql_thread.inc
SET GLOBAL slave_parallel_threads=4;

--connection server_2
--sync_with_master
--source include/stop_slave.inc
SET GLOBAL slave_parallel_threads=1;

# Generate events on the master while the slave is stopped.
--connection server_1
--disable_warnings
INSERT INTO t1 VALUES (1, SLEEP(100));
--sleep 100
INSERT INTO t1 VALUES (2, SLEEP(1));

# Restart the slave and show Seconds_Behind_Master before and after it catches up.
--connection server_2
--source include/start_slave.inc
--let $status_items= Seconds_Behind_Master
--source include/show_slave_status.inc
--sync_with_master
--let $status_items= Seconds_Behind_Master
--source include/show_slave_status.inc
Issue Links
- is duplicated by
  - MDEV-29639 Seconds_Behind_Master is incorrect for Delayed, Parallel Replicas (Closed)
- relates to
  - MDEV-30458 Consolidate Serial Replica to Parallel Replica with 1 Worker Thread (Open)
  - MDEV-30619 Parallel Slave SQL Thread Can Update Seconds_Behind_Master with Active Workers (Closed)
  - MDEV-31745 First Event After Starting a Delayed Parallel Replica Shows 0 Seconds_Behind_Master (Open)
  - MDEV-7837 Seconds behind Master reports incorrect value when Parallel replication is used (Closed)
  - MDEV-32265 seconds_behind_master is inaccurate for Delayed replication (Closed)
Activity
Field | Original Value | New Value |
---|---|---|
Description | Earlier revision of the description above. It proposed injecting a fake event on START SLAVE with the timestamp of the first event read by the leader thread, and its test case used a single INSERT INTO t1 VALUES (1, SLEEP(3600)); | Current description, as shown above. |
Assignee | Andrei Elkin [ elkin ] |
Labels | seconds-behind-master |
Fix Version/s | 10.1 [ 16100 ] |
Assignee | Andrei Elkin [ elkin ] | Sujatha Sivakumar [ sujatha.sivakumar ] |
Fix Version/s | 10..4 [ 24902 ] | |
Fix Version/s | 10.2 [ 14601 ] | |
Fix Version/s | 10.3 [ 22126 ] | |
Fix Version/s | 10.5 [ 23123 ] |
Fix Version/s | 10.4 [ 22408 ] | |
Fix Version/s | 10..4 [ 24902 ] |
Fix Version/s | 10.1 [ 16100 ] |
Assignee | Sujatha Sivakumar [ sujatha.sivakumar ] | Andrei Elkin [ elkin ] |
Workflow | MariaDB v3 [ 90206 ] | MariaDB v4 [ 140987 ] |
Assignee | Andrei Elkin [ elkin ] | Brandon Nesterenko [ JIRAUSER48702 ] |
Fix Version/s | 10.2 [ 14601 ] |
Status | Open [ 1 ] | Confirmed [ 10101 ] |
Link | This issue is duplicated by |
Status | Confirmed [ 10101 ] | In Progress [ 3 ] |
Assignee | Brandon Nesterenko [ JIRAUSER48702 ] | Andrei Elkin [ elkin ] |
Status | In Progress [ 3 ] | In Review [ 10002 ] |
Link | This issue relates to MDEV-30458 [ MDEV-30458 ] |
Assignee | Andrei Elkin [ elkin ] | Brandon Nesterenko [ JIRAUSER48702 ] |
Status | In Review [ 10002 ] | Stalled [ 10000 ] |
Priority | Critical [ 2 ] | Major [ 3 ] |
Fix Version/s | 10.3 [ 22126 ] |
Link | This issue relates to MDEV-31745 [ MDEV-31745 ] |
Link | This issue relates to |
Fix Version/s | 10.4 [ 22408 ] |
Link | This issue relates to |