Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-12290

Wrong timestamps in binary log causes replication issues

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • 10.2.4, 10.0(EOL), 10.1(EOL), 10.2(EOL)
    • 10.2.6
    • Replication

    Description

      I have configured a MariaDB 10.2.4 GTID Slave with master_delay=3600, and it seems to cause replication to lag behind more and more. Waits trigger even though Seconds_Behind_Master is > 3600.

              Seconds_Behind_Master: 21476
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error: 
                     Last_SQL_Errno: 0
                     Last_SQL_Error: 
        Replicate_Ignore_Server_Ids: 
                   Master_Server_Id: 3
                     Master_SSL_Crl: 
                 Master_SSL_Crlpath: 
                         Using_Gtid: No
                        Gtid_IO_Pos: 
            Replicate_Do_Domain_Ids: 
        Replicate_Ignore_Domain_Ids: 
                      Parallel_Mode: conservative
                          SQL_Delay: 3600
                SQL_Remaining_Delay: 3598
      

      Attachments

        1. patch
          0.6 kB
        2. patch
          0.4 kB

        Activity

          Hi tanj
          I have created a patch for this issue which solves this wrong timestamp issue, This is because when binlog_background thread logs BInlog_checkpoint_log_event time is not set in the thread so , Binlog_checkpoint_log_event->when is zero.
          But I am not able to simulate seconds behind master case. Can you please provide some reproducible test case ?
          I tried this
          master - slave configuration
          both use binlog_format = row
          I have used gtid
          sql_delay=60

          And run lacs of (insert )query on server 1 , with after every 10 insert there was flush logs

          patch

          sachin.setiya.007 Sachin Setiya (Inactive) added a comment - Hi tanj I have created a patch for this issue which solves this wrong timestamp issue, This is because when binlog_background thread logs BInlog_checkpoint_log_event time is not set in the thread so , Binlog_checkpoint_log_event->when is zero. But I am not able to simulate seconds behind master case. Can you please provide some reproducible test case ? I tried this master - slave configuration both use binlog_format = row I have used gtid sql_delay=60 And run lacs of (insert )query on server 1 , with after every 10 insert there was flush logs patch

          Doesn't the set_time() need to go inside the loop, before the call to
          mysql_bin_log.mark_xid_done()? Otherwise the time will always be the time
          the binlog thread started, which is not correct - it needs to be the current
          time.

          But nice catch, good to get this fixed.

          knielsen Kristian Nielsen added a comment - Doesn't the set_time() need to go inside the loop, before the call to mysql_bin_log.mark_xid_done()? Otherwise the time will always be the time the binlog thread started, which is not correct - it needs to be the current time. But nice catch, good to get this fixed.
          sachin.setiya.007 Sachin Setiya (Inactive) added a comment - - edited

          Hi knielsen,

          Patch updated
          Regards
          sachin patch

          sachin.setiya.007 Sachin Setiya (Inactive) added a comment - - edited Hi knielsen , Patch updated Regards sachin patch

          I am experiencing this issue when attaching a 10.2.8 Slave to a 10.0.24 Master. At every #700101 Binlog checkpoint event, the SQL Delay counter is reapplied and the Slave is continually delayed.

          master# mysql -e 'select @@version;'
          -----------------

          @@version

          -----------------

          10.0.24-MariaDB

          -----------------

          master# mysqlbinlog bin-log.023901 | grep ^#700
          #700101 2:00:00 server id 123 end_log_pos 423 Binlog checkpoint bin-log.023901

          slave# mysql -e 'select @@version;'
          --------------------

          @@version

          --------------------

          10.2.8-MariaDB-log

          --------------------

          Seconds_Behind_Master: 2170
          SQL_Delay: 900
          SQL_Remaining_Delay: 825
          Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed event

          Even if the Master is upgraded to 10.0.32 (latest), the bogus timestamps will still be present - as confirmed in a sandbox:

          1. ../sandbox.sh mysql -e 'select @@version;'
            -----------------
            @@version

            -----------------

            10.0.32-MariaDB

            -----------------

          1. mysqlbinlog bin-log.00000* | grep ^#700
            #700101 2:00:00 server id 10032 end_log_pos 347 Binlog checkpoint bin-log.000002
            #700101 2:00:00 server id 10032 end_log_pos 347 Binlog checkpoint bin-log.000003
          zam1081 Marcus Zambrano added a comment - I am experiencing this issue when attaching a 10.2.8 Slave to a 10.0.24 Master. At every #700101 Binlog checkpoint event, the SQL Delay counter is reapplied and the Slave is continually delayed. master# mysql -e 'select @@version;' ----------------- @@version ----------------- 10.0.24-MariaDB ----------------- master# mysqlbinlog bin-log.023901 | grep ^#700 #700101 2:00:00 server id 123 end_log_pos 423 Binlog checkpoint bin-log.023901 slave# mysql -e 'select @@version;' -------------------- @@version -------------------- 10.2.8-MariaDB-log -------------------- Seconds_Behind_Master: 2170 SQL_Delay: 900 SQL_Remaining_Delay: 825 Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed event Even if the Master is upgraded to 10.0.32 (latest), the bogus timestamps will still be present - as confirmed in a sandbox: ../sandbox.sh mysql -e 'select @@version;' ----------------- @@version ----------------- 10.0.32-MariaDB ----------------- mysqlbinlog bin-log.00000* | grep ^#700 #700101 2:00:00 server id 10032 end_log_pos 347 Binlog checkpoint bin-log.000002 #700101 2:00:00 server id 10032 end_log_pos 347 Binlog checkpoint bin-log.000003

          Hi zam1081!

          This patched was pushed only to 10.2, I have back-ported this patch to 10.0 , by this merge this will also be available to 10.1.

          • Sachin
          sachin.setiya.007 Sachin Setiya (Inactive) added a comment - Hi zam1081 ! This patched was pushed only to 10.2, I have back-ported this patch to 10.0 , by this merge this will also be available to 10.1. Sachin

          People

            sachin.setiya.007 Sachin Setiya (Inactive)
            tanj Guillaume Lefranc
            Votes:
            1 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.