[MDEV-10222] When replication starts, slave reports seconds_behind_master 0 for a few seconds Created: 2016-06-13  Updated: 2016-07-03  Resolved: 2016-07-03

Status: Closed
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.0.23
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Jaime Crespo Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: Seconds_Behind_Master, lag, need_feedback, replication, restart, slave, status
Environment:

Debian Jessie Linux


Issue Links:
Duplicate
is duplicated by MDEV-513 seconds_behind_master shows nothing f... Open

 Description   

After performing maintenance for a few hours on a host, I start replication again on a 10.0.23 host and execute SHOW SLAVE STATUS;

In the time it takes me to write manually on console START SLAVE; <ENTER> SHOW SLAVE STATUS\G, it shows me once: Seconds_Behind_Master: 0 (not NULL) even if some queries have been already executed by the SQL thread (see SQL executed), as you can see on the following output:

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 7
Server version: 10.0.23-MariaDB-log MariaDB Server
 
Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB PRODUCTION s1 localhost (none) > SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: db1057.eqiad.wmnet
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: db1057-bin.002176
          Read_Master_Log_Pos: 730476185
               Relay_Log_File: db1052-relay-bin.000334
                Relay_Log_Pos: 730476473
        Relay_Master_Log_File: db1057-bin.002176
             Slave_IO_Running: No
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 730476185
              Relay_Log_Space: 730477082
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
1 row in set (0.00 sec)
 
MariaDB PRODUCTION s1 localhost (none) > START SLAVE;
Query OK, 0 rows affected (0.01 sec)
 
MariaDB PRODUCTION s1 localhost (none) > SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: db1057.eqiad.wmnet
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: db1057-bin.002176
          Read_Master_Log_Pos: 730476185 
               Relay_Log_File: db1052-relay-bin.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File: db1057-bin.002176
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 730476185 
              Relay_Log_Space: 248
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 171974683 
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: Slave_Pos 
                  Gtid_IO_Pos: 0-171974683-4085514800
1 row in set (0.00 sec)
 
MariaDB PRODUCTION s1 localhost (none) > SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Queueing master event to the relay log
                  Master_Host: db1057.eqiad.wmnet
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: db1057-bin.002177
          Read_Master_Log_Pos: 192678779 
               Relay_Log_File: db1052-relay-bin.000002
                Relay_Log_Pos: 135917
        Relay_Master_Log_File: db1057-bin.002176
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 730611360 
              Relay_Log_Space: 510782336 
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 14944
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 171974683 
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: Slave_Pos 
                  Gtid_IO_Pos: 0-171974683-4086305686
1 row in set (0.00 sec)
 
MariaDB PRODUCTION s1 localhost (none) > STOP SLAVE; CHANGE MASTER TO MASTER_SSL=1; START SLAVE;
Query OK, 0 rows affected (0.02 sec)
 
Query OK, 0 rows affected (0.14 sec)
 
Query OK, 0 rows affected (0.00 sec)
 
MariaDB PRODUCTION s1 localhost (none) > SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: db1057.eqiad.wmnet
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: db1057-bin.002176
          Read_Master_Log_Pos: 736578634 
               Relay_Log_File: db1052-relay-bin.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File: db1057-bin.002176
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 736578634 
              Relay_Log_Space: 248
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: Yes
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 171974683 
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: Slave_Pos 
                  Gtid_IO_Pos: 0-171974683-4085527450
1 row in set (0.00 sec)
 
MariaDB PRODUCTION s1 localhost (none) > SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: db1057.eqiad.wmnet
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: db1057-bin.002176
          Read_Master_Log_Pos: 988725783 
               Relay_Log_File: db1052-relay-bin.000002
                Relay_Log_Pos: 113531
        Relay_Master_Log_File: db1057-bin.002176
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 736691423 
              Relay_Log_Space: 252148189 
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: Yes
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 14811
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 171974683 
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: Slave_Pos 
                  Gtid_IO_Pos: 0-180359172-4085945305
1 row in set (0.00 sec)

This seems similar to MDEV-5114, but we are using a higher version. As the only particularities is that we are using GTID and SSL (TLS). I would expect it to be NULL until it is reliably calculated.

We understand the shortcomings of using Seconds_Behind_Master to measure lag, and we have a pt-heartbeat table, but this looks like a bug or something to be documented as "it can return 0 at random".



 Comments   
Comment by Jaime Crespo [ 2016-06-13 ]

Related to MDEV-513 as well?

Comment by Jaime Crespo [ 2016-06-13 ]

Maybe MDEV-7837 Also.

Comment by Elena Stepanova [ 2016-06-13 ]

In the provided output I don't see any difference from MDEV-513 – there is no indication that the zero value is shown further than during execution of the first event after slave restart.
I don't claim it is necessarily the case there, but the output is not enough to say one way or another.
If you think it stays there longer than for the first event, split your STOP SLAVE; CHANGE MASTER TO MASTER_SSL=1; START SLAVE; into separate commands – to prove the point, you don't have to do it momentarily. Instead, do

STOP SLAVE;
SHOW SLAVE STATUS\G
CHANGE MASTER TO MASTER_SSL=1;
START SLAVE;
SHOW SLAVE STATUS\G

In this case, if we see that Exec_Master_Log_Pos has changed, we'll know that the slave has indeed executed some events. If it is so, then we can check whether it is the same as MDEV-7837 – if you are using parallel replication, that is.

Comment by Jaime Crespo [ 2016-06-13 ]

You are right, I thought exec had advanced- my bad (it took a lot to do it, maybe due to GTID checks as compared to regular replication- but that is not a bug). I would merge with the ticket you mentioned and just add as a comment for a suggested fix that until the first statement one is executed, start with seconds behind master NULL.

No, I am not using multisource replication on this host, it is regular one.

Comment by Elena Stepanova [ 2016-07-03 ]

As discussed above, the problem is tracked in MDEV-513.

Generated at Thu Feb 08 07:40:34 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.