Details
-
Bug
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
10.2(EOL), 10.5, 10.6, 10.11, 11.4, 11.8
Description
Seconds_Behind_Master can quickly and sporadically spike from 0 to really high values, and then go back to 0 (independent of running on a parallel or serial replica). This is due to the Seconds_Behind_Master computation having a special idleness check which overrides the value to 0. This idleness condition (and thereby when SBM is calculated/written) is based on the SQL thread queueing status:
idle= mi->rli.sql_thread_caught_up;
|
Which isn't tightly coupled with user events. That is, the timestamp of internal events (e.g. Format_description_log_event, Gtid_list_log_event, and Binlog_checkpoint_event) aren't saved on the replica, and thereby the Seconds_Behind_Master value won't reflect a more recent replica when these events are processed. However, when these events are queued by the SQL thread, the idleness condition no longer holds, and then Seconds_Behind_Master will be calculated and displayed based on the last timestamp from a user transaction, which can be very large (particularly when a replica is configured with SQL_Delay. This is particularly prevalent during log rotation, as all events generated during this time are internal, and should not be considered in Seconds_Behind_Master.
The fix should update the idleness condition to not be based on the queueing state of the SQL thread, but rather that there are no new user events from the master to process (e.g. GTID_IO_Pos == GTID_Slave_Pos).
Test case to reproduce (which requires a small patch for debug_sync integration):
#
|
# TODO: Write me
|
#
|
# References:
|
# MDEV-:
|
#
|
|
--source include/have_innodb.inc
|
--source include/have_log_bin.inc
|
# Format independent
|
--source include/have_binlog_format_row.inc
|
--source include/master-slave.inc
|
|
--connection slave
|
--source include/stop_slave.inc
|
set @@global.debug_dbug="+d,pause_sql_thread_on_next_fde";
|
--source include/start_slave.inc
|
|
--echo #
|
--echo # Initialize test data
|
--connection master
|
|
create table t1 (a int primary key) engine=innodb;
|
insert into t1 values (1);
|
--source include/save_master_gtid.inc
|
|
--echo # Catch slave up with master
|
--connection slave
|
--echo # Debug_sync is set on format description events, so skip 2 (1 for master binlog, 1 for slave relay log)
|
set debug_sync="now wait_for paused_on_fde";
|
set debug_sync="now signal sql_thread_continue";
|
set debug_sync="now wait_for paused_on_fde";
|
set debug_sync="now signal sql_thread_continue";
|
--source include/sync_with_master_gtid.inc
|
|
--echo # Rotate master binary logs to force sql thread to read internal only events
|
--connection master
|
FLUSH LOGS;
|
|
|
--connection slave
|
|
--echo #
|
--echo # Skip through format description events from relay log file & master
|
--echo # binlog until we get to the last one
|
--echo #
|
set debug_sync="now wait_for paused_on_fde";
|
set debug_sync="now signal sql_thread_continue";
|
set debug_sync="now wait_for paused_on_fde";
|
set debug_sync="now signal sql_thread_continue";
|
|
--echo # Now pause on the last format description event
|
set debug_sync="now wait_for paused_on_fde";
|
|
|
--echo # Wait 2s so SBM has a chance to increase..
|
--sleep 2
|
--echo # Seconds_Behind_Master should be zero as user events are up-to-date (but is not)
|
--let $status_items= Seconds_Behind_Master
|
--source include/show_slave_status.inc
|
|
--echo # Resume SQL thread
|
set debug_sync="now signal sql_thread_continue";
|
--echo # TODO clean up debug_dbug and debug_sync status
|
|
--echo #
|
--echo # Cleanup
|
--connection master
|
drop table t1;
|
--source include/save_master_gtid.inc
|
|
--connection slave
|
--source include/sync_with_master_gtid.inc
|
|
--source include/rpl_end.inc
|
--echo # End of .test
|
--echo # TODO ^ Fill out with test name
|
patch for debug_sync (on 10.6 community):
diff --git a/sql/log_event_server.cc b/sql/log_event_server.cc
|
index 8b740195b24..9fa14a6b873 100644
|
--- a/sql/log_event_server.cc
|
+++ b/sql/log_event_server.cc
|
@@ -2426,6 +2426,12 @@ int Format_description_log_event::do_apply_event(rpl_group_info *rgi)
|
Relay_log_info *rli= rgi->rli;
|
DBUG_ENTER("Format_description_log_event::do_apply_event");
|
|
+ DBUG_EXECUTE_IF("pause_sql_thread_on_next_fde", {
|
+ DBUG_ASSERT(!debug_sync_set_action(
|
+ thd, STRING_WITH_LEN(
|
+ "now SIGNAL paused_on_fde WAIT_FOR sql_thread_continue")));
|
+ });
|
+
|
/*
|
As a transaction NEVER spans on 2 or more binlogs:
|
if we have an active transaction at this point, the master died
|
|
Attachments
Issue Links
- relates to
-
MDEV-16091 Seconds_Behind_Master spikes to millions of seconds
-
- Closed
-
STEPS TO REPRODUCE THE ISSUE:
== Set up replication between two MariaDB servers (running on 10.2.38)
== I do not have any other query running on the DB.
== On the "Primary" server, repeatedly flush the binary logs:
I did something like this:
flush binary logs;show binary logs;select now(),sleep(0.25);flush binary logs;show binary logs;select now(),sleep(0.25)... .... .... .... ... ...
== On the "replica" server, repeatedly monitor the seconds behind master by running "show slave status\G"
== I have grepped the output to show only the required details. Pasting a part of it below:
–
Relay_Master_Log_File: mysql-bin-changelog.000455
Exec_Master_Log_Pos: 4
Seconds_Behind_Master: 307
1 row in set (0.01 sec)
–
Relay_Master_Log_File: mysql-bin-changelog.000470
Exec_Master_Log_Pos: 4
Seconds_Behind_Master: 1650
1 row in set (0.00 sec)
–
--
Relay_Master_Log_File: mysql-bin-changelog.000472
Exec_Master_Log_Pos: 4
Seconds_Behind_Master: 1650
1 row in set (0.01 sec)
–
--
Relay_Master_Log_File: mysql-bin-changelog.000474
Exec_Master_Log_Pos: 4
Seconds_Behind_Master: 1651
1 row in set (0.00 sec)
–
--
Relay_Master_Log_File: mysql-bin-changelog.000476
Exec_Master_Log_Pos: 4
Seconds_Behind_Master: 1651
1 row in set (0.00 sec)
–
--
Relay_Master_Log_File: mysql-bin-changelog.000482
Exec_Master_Log_Pos: 4
Seconds_Behind_Master: 1653
1 row in set (0.00 sec)
–
--
Relay_Master_Log_File: mysql-bin-changelog.000496
Exec_Master_Log_Pos: 4
Seconds_Behind_Master: 1659
1 row in set (0.00 sec)
–
--
Relay_Master_Log_File: mysql-bin-changelog.000498
Exec_Master_Log_Pos: 4
Seconds_Behind_Master: 1660
1 row in set (0.00 sec)
–
--
Relay_Master_Log_File: mysql-bin-changelog.000500
Exec_Master_Log_Pos: 4
Seconds_Behind_Master: 1660
1 row in set (0.00 sec)
–
--
Relay_Master_Log_File: mysql-bin-changelog.000510
Exec_Master_Log_Pos: 4
Seconds_Behind_Master: 1663
1 row in set (0.01 sec)
–
--
Relay_Master_Log_File: mysql-bin-changelog.000512
Exec_Master_Log_Pos: 4
Seconds_Behind_Master: 1663
1 row in set (0.00 sec)
–
--
Relay_Master_Log_File: mysql-bin-changelog.000514
Exec_Master_Log_Pos: 4
Seconds_Behind_Master: 1664
1 row in set (0.02 sec)
Looking at the binary logs of the "Primary" instance corresponding to "Relay_Master_Log_File" and "Exec_Master_Log_Pos", I can just see that its the starting of the file (immediately after rotation of older file).
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
| mysql-bin-changelog.000455 | 4 | Format_desc | 123 | 256 | Server ver: 10.2.38-MariaDB-log, Binlog ver: 4 |
| mysql-bin-changelog.000455 | 256 | Gtid_list | 123 | 299 | [0-123-1] |
| mysql-bin-changelog.000455 | 299 | Binlog_checkpoint | 123 | 352 | mysql-bin-changelog.000454 |
| mysql-bin-changelog.000455 | 352 | Binlog_checkpoint | 123 | 405 | mysql-bin-changelog.000455 |
| mysql-bin-changelog.000455 | 405 | Rotate | 123 | 462 | mysql-bin-changelog.000456;pos=4 |
—
Another important observation is that the "seconds behind master" value keeps increasing.
You can refer the above output itself. Each time wrong value is returned, it is greater than the previous value.