[MDEV-17516] Replication lag issue using parallel replication - Jira

Details

Type: Bug
Status: Stalled
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.1.36
Fix Version/s: 10.5
Component/s: Replication
Labels:
- seconds-behind-master

Description

With parallel replication, Seconds_Behind_Master wrongly reports 0 when the SQL thread is stopped and then restarted a long time afterwards.

This happens by design:
https://lists.launchpad.net/maria-developers/msg08958.html

but it is a real show stopper for most proxies, which send traffic to such a slave believing it is in sync with the master.

My understanding is that Seconds_Behind_Master is computed only after the first commit, so in this case the master is 2 days ahead and on a freshly restarted slave we get this:

|   30 | system user  |                      | tsce_unedic | Connect | 2211 | altering table                                                                 | OPTIMIZE TABLE `requetes` |    0.000 |

And we can see the wrong Seconds_Behind_Master value:

        Seconds_Behind_Master: 0
                   Using_Gtid: Slave_Pos
                  Gtid_IO_Pos: 0-21-28557589
                Parallel_Mode: conservative

but on its master:

gtid_current_pos       | 0-21-28570301
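
Until the reported value can be trusted, a proxy-side check could compare GTID positions directly instead of relying on Seconds_Behind_Master alone. The following is a minimal Python sketch under stated assumptions: the function names (`parse_gtid_pos`, `gtid_gap`, `is_safe_to_route`) and the `max_gap` threshold are hypothetical, the positions are passed in as strings already fetched from each server, and each position holds at most one GTID per domain. Gtid_IO_Pos only reflects what the IO thread has received, so a real check would more likely feed in the replica's applied position (gtid_slave_pos).

```python
def parse_gtid_pos(pos):
    """Parse a MariaDB GTID position such as '0-21-28557589' (or a
    comma-separated list of GTIDs) into {domain_id: sequence_number}."""
    result = {}
    for gtid in pos.split(","):
        gtid = gtid.strip()
        if not gtid:
            continue
        # A GTID has the form domain_id-server_id-sequence_number.
        domain, _server_id, seq = gtid.split("-")
        result[int(domain)] = int(seq)
    return result


def gtid_gap(primary_pos, replica_pos):
    """Per-domain number of sequence numbers the replica is missing."""
    primary = parse_gtid_pos(primary_pos)
    replica = parse_gtid_pos(replica_pos)
    return {d: max(0, seq - replica.get(d, 0)) for d, seq in primary.items()}


def is_safe_to_route(seconds_behind, primary_pos, replica_pos, max_gap=0):
    """Hypothetical routing rule: do not trust Seconds_Behind_Master == 0
    unless the GTID gap is also within max_gap in every domain."""
    # Seconds_Behind_Master is NULL (None here) when the SQL thread is stopped.
    if seconds_behind is None or seconds_behind > 0:
        return False
    return all(gap <= max_gap for gap in gtid_gap(primary_pos, replica_pos).values())
```

With the values from this report, `gtid_gap("0-21-28570301", "0-21-28557589")` gives a gap of 12712 sequence numbers in domain 0, so `is_safe_to_route(0, ...)` would return False even though Seconds_Behind_Master is 0.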

A possible solution would be to update Seconds_Behind_Master by injecting a fake event on START SLAVE carrying the max timestamp of all events read by the leader thread and sent to the worker threads.
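
The idea can be illustrated with a toy model. This is a hedged Python sketch of the proposal as worded above, not MariaDB code: the class `LagEstimator` and its method names are hypothetical, timestamps are plain epoch seconds, and clock differences between master and slave are ignored. It assumes, as described in this report, that no lag is known until either a commit or the injected fake event supplies a timestamp.

```python
class LagEstimator:
    """Toy model of a Seconds_Behind_Master-like counter (illustration only)."""

    def __init__(self):
        # Timestamp of the newest event the estimator knows about, if any.
        self.last_event_ts = None

    def on_start_slave(self, queued_event_timestamps):
        """Proposed behaviour: seed the counter at start with the max
        timestamp of the events already read and handed to the workers."""
        if queued_event_timestamps:
            self.last_event_ts = max(queued_event_timestamps)

    def on_commit(self, event_ts):
        """Regular update once a worker has committed an event."""
        self.last_event_ts = event_ts

    def seconds_behind(self, now):
        # Without any timestamp the model reports 0, mirroring the
        # misleading value described in this issue.
        if self.last_event_ts is None:
            return 0
        return max(0, int(now - self.last_event_ts))
```

In this model a freshly started replica that calls `on_start_slave([800, 900])` reports 100 seconds at `now=1000` instead of 0, before any worker has committed.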

To reproduce:

--source include/have_innodb.inc
--source include/have_binlog_format_mixed.inc
--let $rpl_topology=1->2
--source include/rpl_init.inc

# Test various aspects of parallel replication.

--connection server_1
ALTER TABLE mysql.gtid_slave_pos ENGINE=InnoDB;
CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=InnoDB;
--save_master_pos

--connection server_2
--sync_with_master
--source include/stop_slave_sql_thread.inc
SET GLOBAL slave_parallel_threads=4;

--connection server_2
--sync_with_master
--source include/stop_slave.inc
SET GLOBAL slave_parallel_threads=1;

--connection server_1
--disable_warnings
INSERT INTO t1 VALUES (1, SLEEP(100));
--wait 100s
INSERT INTO t1 VALUES (1, SLEEP(1));

--connection server_2
--source include/start_slave.inc
--let $status_items= Seconds_Behind_Master
--source include/show_slave_status.inc
--sync_with_master
--let $status_items= Seconds_Behind_Master
--source include/show_slave_status.inc

Issue Links

is duplicated by

MDEV-29639 Seconds_Behind_Master is incorrect for Delayed, Parallel Replicas (Closed)

relates to

MDEV-30458 Consolidate Serial Replica to Parallel Replica with 1 Worker Thread (Open)

MDEV-30619 Parallel Slave SQL Thread Can Update Seconds_Behind_Master with Active Workers (Closed)

MDEV-31745 First Event After Starting a Delayed Parallel Replica Shows 0 Seconds_Behind_Master (Open)

MDEV-7837 Seconds behind Master reports incorrect value when Parallel replication is used (Closed)

MDEV-32265 seconds_behind_master is inaccurate for Delayed replication (Closed)

(1 relates to)

Activity

VAROQUI Stephane created issue - 2018-10-22 09:02

VAROQUI Stephane made changes - 2018-10-22 09:09

Field	Original Value	New Value
Description (proposed solution)	"injecting a fake event ion start slave with the timestamp of the first event read by the leader thread"	"injecting a fake event in start slave with the max timestamp of all events read by the leader thread and sent to the worker threads"
Description (test case)	INSERT INTO t1 VALUES (1, SLEEP(3600));	INSERT INTO t1 VALUES (1, SLEEP(100)); --wait 100s INSERT INTO t1 VALUES (1, SLEEP(1));

The rest of the description was left unchanged by this edit.

VAROQUI Stephane made changes - 2018-10-22 09:22

Link

This issue relates to ~~MDEV-7837~~ [ ~~MDEV-7837~~ ]

Elena Stepanova made changes - 2018-10-22 20:35

Assignee

Andrei Elkin [ elkin ]

Andrei Elkin made changes - 2018-11-13 18:41

Labels

seconds-behind-master

Elena Stepanova made changes - 2018-12-11 21:17

Fix Version/s

10.1 [ 16100 ]

Andrei Elkin made changes - 2019-06-03 10:21

Assignee

Andrei Elkin [ elkin ]

Sujatha Sivakumar [ sujatha.sivakumar ]

Julien Fritsch made changes - 2020-08-27 07:45

Fix Version/s		10..4 [ 24902 ]
Fix Version/s		10.2 [ 14601 ]
Fix Version/s		10.3 [ 22126 ]
Fix Version/s		10.5 [ 23123 ]

Sergei Golubchik made changes - 2020-09-07 15:02

Fix Version/s		10.4 [ 22408 ]
Fix Version/s	10..4 [ 24902 ]

Julien Fritsch made changes - 2020-11-06 16:07

Fix Version/s

10.1 [ 16100 ]

Julien Fritsch made changes - 2021-03-19 16:18

Assignee

Sujatha Sivakumar [ sujatha.sivakumar ]

Andrei Elkin [ elkin ]

Sergei Golubchik made changes - 2021-12-06 21:33

Workflow

MariaDB v3 [ 90206 ]

MariaDB v4 [ 140987 ]

Andrei Elkin made changes - 2022-02-08 14:10

Assignee

Andrei Elkin [ elkin ]

Brandon Nesterenko [ JIRAUSER48702 ]

Ralf Gebhardt made changes - 2022-08-04 08:44

Fix Version/s

10.2 [ 14601 ]

Brandon Nesterenko made changes - 2022-10-11 18:33

Status

Open [ 1 ]

Confirmed [ 10101 ]

Brandon Nesterenko made changes - 2022-10-11 18:37

Link

This issue is duplicated by ~~MDEV-29639~~ [ ~~MDEV-29639~~ ]

Julien Fritsch made changes - 2022-10-18 14:38

Status

Confirmed [ 10101 ]

In Progress [ 3 ]

Brandon Nesterenko made changes - 2022-10-21 02:17

Assignee	Brandon Nesterenko [ JIRAUSER48702 ]	Andrei Elkin [ elkin ]
Status	In Progress [ 3 ]	In Review [ 10002 ]

Brandon Nesterenko made changes - 2023-01-24 16:49

Link

This issue relates to MDEV-30458 [ MDEV-30458 ]

Brandon Nesterenko made changes - 2023-01-30 16:42

Assignee

Andrei Elkin [ elkin ]

Brandon Nesterenko [ JIRAUSER48702 ]

Brandon Nesterenko made changes - 2023-01-30 16:43

Status

In Review [ 10002 ]

Stalled [ 10000 ]

Andrei Elkin made changes - 2023-02-16 17:17

Priority

Critical [ 2 ]

Major [ 3 ]

Julien Fritsch made changes - 2023-04-27 14:25

Fix Version/s

10.3 [ 22126 ]

Brandon Nesterenko made changes - 2023-07-19 20:25

Link

This issue relates to MDEV-31745 [ MDEV-31745 ]

Brandon Nesterenko made changes - 2023-09-27 22:26

Link

This issue relates to ~~MDEV-32265~~ [ ~~MDEV-32265~~ ]

Julien Fritsch made changes - 2024-09-10 15:05

Fix Version/s

10.4 [ 22408 ]

Brandon Nesterenko made changes - 2025-01-20 13:42

Link

This issue relates to ~~MDEV-30619~~ [ ~~MDEV-30619~~ ]

People

Assignee: Brandon Nesterenko

Reporter: VAROQUI Stephane

Votes: 1

Watchers: 11

Dates

Created: 2018-10-22 09:02

Updated: 2025-01-20 13:42
