[MDEV-32265] seconds_behind_master is inaccurate for Delayed replication Created: 2023-09-27  Updated: 2023-10-24  Resolved: 2023-10-23

Status: Closed
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.8.8, 10.4.31, 10.5.22, 10.6.14, 10.6.15, 10.9.8, 10.10.6, 10.11.5, 11.0.3, 11.1.2
Fix Version/s: 10.4.32, 10.5.23, 10.6.16, 10.10.7, 10.11.6, 11.0.4, 11.1.3

Type: Bug Priority: Major
Reporter: Pandikrishnan Gurusamy Assignee: Brandon Nesterenko
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-17516 Replication lag issue using parallel ... Stalled
relates to MDEV-29639 Seconds_Behind_Master is incorrect fo... Closed
relates to MDEV-30619 Parallel Slave SQL Thread Can Update ... Closed

 Description   

Whenever we set up a delayed replica, the SBM is always Zero, until we restart the MariaDB service.

 
MariaDB [(none)]> select @@slave_parallel_threads,@@slave_parallel_mode;
+--------------------------+-----------------------+
| @@slave_parallel_threads | @@slave_parallel_mode |
+--------------------------+-----------------------+
|                        4 | optimistic            |
+--------------------------+-----------------------+
1 row in set (0.003 sec)
 
show slave status\G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 172.31.44.153
                   Master_User: repl_user
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: master.000012
           Read_Master_Log_Pos: 116937038
                Relay_Log_File: ip-172-31-0-20-relay-bin.000004
                 Relay_Log_Pos: 116937334
         Relay_Master_Log_File: master.000012
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 116937038
               Relay_Log_Space: 116937696
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: 0
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 1
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-1-11376
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: optimistic
                     SQL_Delay: 0
           SQL_Remaining_Delay: NULL
       Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
              Slave_DDL_Groups: 38
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 4395
1 row in set (0.000 sec)
 
MariaDB [(none)]> stop slave;change master to master_delay=3600; start slave;
Query OK, 0 rows affected (1.244 sec)
 
Query OK, 0 rows affected (0.028 sec)
 
Query OK, 0 rows affected (0.006 sec)
 
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 172.31.44.153
                   Master_User: repl_user
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: master.000012
           Read_Master_Log_Pos: 259041585
                Relay_Log_File: ip-172-31-0-20-relay-bin.000002
                 Relay_Log_Pos: 552
         Relay_Master_Log_File: master.000012
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 256
               Relay_Log_Space: 81802918
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: 0
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 1
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-1-11647
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: optimistic
                     SQL_Delay: 3600
           SQL_Remaining_Delay: 3421
       Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed event
              Slave_DDL_Groups: 38
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 4510
1 row in set (0.000 sec)
 
MariaDB [(none)]> show processlist;
+-----+-------------+-----------+------+--------------+------+----------------------------------------------------------------+------------------+----------+
| Id  | User        | Host      | db   | Command      | Time | State                                                          | Info             | Progress |
+-----+-------------+-----------+------+--------------+------+----------------------------------------------------------------+------------------+----------+
|  11 | root        | localhost | NULL | Query        |    0 | starting                                                       | show processlist |    0.000 |
|  95 | system user |           | NULL | Slave_IO     |    7 | Waiting for master to send event                               | NULL             |    0.000 |
|  97 | system user |           | NULL | Slave_worker |    7 | Waiting for work from SQL thread                               | NULL             |    0.000 |
|  98 | system user |           | NULL | Slave_worker |    7 | Waiting for work from SQL thread                               | NULL             |    0.000 |
|  99 | system user |           | NULL | Slave_worker |    7 | Waiting for work from SQL thread                               | NULL             |    0.000 |
| 100 | system user |           | NULL | Slave_worker |    7 | Waiting for work from SQL thread                               | NULL             |    0.000 |
|  96 | system user |           | NULL | Slave_SQL    |    7 | Waiting until MASTER_DELAY seconds after master executed event | NULL             |    0.000 |
+-----+-------------+-----------+------+--------------+------+----------------------------------------------------------------+------------------+----------+
7 rows in set (0.000 sec)
 
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 172.31.44.153
                   Master_User: repl_user
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: master.000012
           Read_Master_Log_Pos: 386463738
                Relay_Log_File: ip-172-31-0-20-relay-bin.000002
                 Relay_Log_Pos: 552
         Relay_Master_Log_File: master.000012
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 256
               Relay_Log_Space: 209225071
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: 0
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 1
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-1-11890
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: optimistic
                     SQL_Delay: 3600
           SQL_Remaining_Delay: 3412
       Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed event
              Slave_DDL_Groups: 38
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 4510
1 row in set (0.000 sec)
 
MariaDB [(none)]> \q
Bye
 
--Restart the MariaDB service 
[root@ip-172-31-0-20 mysql]# systemctl restart mariadb
 
MariaDB [(none)]> \s
--------------
mysql  Ver 15.1 Distrib 10.6.15-10-MariaDB, for Linux (x86_64) using readline 5.1
 
Connection id:		11
Current database:
Current user:		root@localhost
SSL:			Not in use
Current pager:		stdout
Using outfile:		''
Using delimiter:	;
Server:			MariaDB
Server version:		10.6.15-10-MariaDB-enterprise-log MariaDB Enterprise Server
Protocol version:	10
Connection:		Localhost via UNIX socket
Server characterset:	latin1
Db     characterset:	latin1
Client characterset:	utf8mb3
Conn.  characterset:	utf8mb3
UNIX socket:		/var/lib/mysql/mysql.sock
Uptime:			59 sec
 
Threads: 7  Questions: 6  Slow queries: 0  Opens: 16  Open tables: 10  Queries per second avg: 0.101
 
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 172.31.44.153
                   Master_User: repl_user
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: master.000012
           Read_Master_Log_Pos: 293650040
                Relay_Log_File: ip-172-31-0-20-relay-bin.000002
                 Relay_Log_Pos: 552
         Relay_Master_Log_File: master.000012
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 256
               Relay_Log_Space: 116411373
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: 201
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 1
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-1-11712
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: optimistic
                     SQL_Delay: 3600
           SQL_Remaining_Delay: 3399
       Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed event
              Slave_DDL_Groups: 0
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 0
1 row in set (0.000 sec)
 
MariaDB [(none)]> show processlist;
+----+-------------+-----------+------+--------------+------+----------------------------------------------------------------+------------------+----------+
| Id | User        | Host      | db   | Command      | Time | State                                                          | Info             | Progress |
+----+-------------+-----------+------+--------------+------+----------------------------------------------------------------+------------------+----------+
|  5 | system user |           | NULL | Slave_IO     |    8 | Waiting for master to send event                               | NULL             |    0.000 |
|  7 | system user |           | NULL | Slave_worker |    8 | Waiting for work from SQL thread                               | NULL             |    0.000 |
|  8 | system user |           | NULL | Slave_worker |    8 | Waiting for work from SQL thread                               | NULL             |    0.000 |
|  9 | system user |           | NULL | Slave_worker |    8 | Waiting for work from SQL thread                               | NULL             |    0.000 |
| 10 | system user |           | NULL | Slave_worker |    8 | Waiting for work from SQL thread                               | NULL             |    0.000 |
|  6 | system user |           | NULL | Slave_SQL    |    5 | Waiting until MASTER_DELAY seconds after master executed event | NULL             |    0.000 |
| 11 | root        | localhost | NULL | Query        |    0 | starting                                                       | show processlist |    0.000 |
+----+-------------+-----------+------+--------------+------+----------------------------------------------------------------+------------------+----------+
7 rows in set (0.000 sec)

If we stop and start the slave in delayed replica,Seconds_Behind_Master is becoming zero.

MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 172.31.44.153
                   Master_User: repl_user
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: master.000012
           Read_Master_Log_Pos: 975396633
                Relay_Log_File: ip-172-31-0-20-relay-bin.000002
                 Relay_Log_Pos: 552
         Relay_Master_Log_File: master.000012
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 256
               Relay_Log_Space: 798157966
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: 606
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 1
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-1-13023
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: optimistic
                     SQL_Delay: 3600
           SQL_Remaining_Delay: 2994
       Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed event
              Slave_DDL_Groups: 0
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 0
1 row in set (0.000 sec)
 
MariaDB [(none)]> stop slave;
Query OK, 0 rows affected (0.003 sec)
 
MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.060 sec)
 
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 172.31.44.153
                   Master_User: repl_user
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: master.000012
           Read_Master_Log_Pos: 216567503
                Relay_Log_File: ip-172-31-0-20-relay-bin.000002
                 Relay_Log_Pos: 552
         Relay_Master_Log_File: master.000012
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 256
               Relay_Log_Space: 39328836
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: 0
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 1
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-1-11565
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: optimistic
                     SQL_Delay: 3600
           SQL_Remaining_Delay: 2984
       Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed event
              Slave_DDL_Groups: 0
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 0
1 row in set (0.000 sec)
 
MariaDB [(none)]> show processlist;
+----+-------------+-----------+------+--------------+------+----------------------------------------------------------------+------------------+----------+
| Id | User        | Host      | db   | Command      | Time | State                                                          | Info             | Progress |
+----+-------------+-----------+------+--------------+------+----------------------------------------------------------------+------------------+----------+
| 11 | root        | localhost | NULL | Query        |    0 | starting                                                       | show processlist |    0.000 |
| 12 | system user |           | NULL | Slave_IO     |    7 | Waiting for master to send event                               | NULL             |    0.000 |
| 14 | system user |           | NULL | Slave_worker |    7 | Waiting for work from SQL thread                               | NULL             |    0.000 |
| 15 | system user |           | NULL | Slave_worker |    7 | Waiting for work from SQL thread                               | NULL             |    0.000 |
| 16 | system user |           | NULL | Slave_worker |    7 | Waiting for work from SQL thread                               | NULL             |    0.000 |
| 17 | system user |           | NULL | Slave_worker |    7 | Waiting for work from SQL thread                               | NULL             |    0.000 |
| 13 | system user |           | NULL | Slave_SQL    |    5 | Waiting until MASTER_DELAY seconds after master executed event | NULL             |    0.000 |
+----+-------------+-----------+------+--------------+------+----------------------------------------------------------------+------------------+----------+
7 rows in set (0.000 sec)



 Comments   
Comment by Brandon Nesterenko [ 2023-09-27 ]

Hey Andrei,

This is ready for review PR-2772.

Comment by Andrei Elkin [ 2023-10-23 ]

One optional todo comment is made. Otherwise the patch looks good.

Comment by Brandon Nesterenko [ 2023-10-23 ]

Pushed into 10.4 as c5f776e9f

Generated at Thu Feb 08 10:30:04 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.