Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-39651

SQL_Remaining_Delay / Seconds_Behind_Master calculated from wrong GTID event timestamp after delayed replica restart

    XMLWordPrintable

Details

    • Bug
    • Status: Confirmed (View Workflow)
    • Minor
    • Resolution: Unresolved
    • 10.11.17
    • 10.11
    • Replication

    Description

      I observe incorrect SQL_Remaining_Delay / Seconds_Behind_Master behavior on a delayed replica using GTID replication.

      After converting a normal replica to delayed replication and restarting the replica, Seconds_Behind_Master appears to be calculated from the timestamp of the first GTID/binlog event from the master after master binlog creation/restart, instead of from the actual delayed event currently waited for by the SQL thread.

      This happens only when GTID replication is used.

      Once SQL_Remaining_Delay reaches 0, Seconds_Behind_Master starts showing the correct configured delay, approximately 43200 seconds.

      Steps to reproduce
      1. Configure master-replica replication with GTID:

      CHANGE MASTER TO MASTER_USE_GTID = slave_pos;
      START SLAVE;
      

      2. Convert the replica to delayed replication:

      STOP SLAVE;
      CHANGE MASTER TO MASTER_DELAY = 43200;
      START SLAVE;
      

      3. Restart MariaDB on the delayed replica:

      systemctl restart mariadb
      

      4. Check replication status on the delayed replica:

      SHOW SLAVE STATUS\G
      SHOW STATUS LIKE 'Uptime';
      

      5. Compare Seconds_Behind_Master with the timestamp of the first event in the master's binlog:

      mysqlbinlog /var/lib/mysql/demu1-yb-mariadb2-tst-bin.000001 | less
      

      Observed result

      At this time:

      Current time: 2026-05-18 12:26:30
      Delayed replica uptime: 2026-05-18 12:24:44 (106 seconds)
      

      The delayed replica shows:

      SQL_Delay: 43200
      Seconds_Behind_Master: 340
      Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed event
      

      Seconds_Behind_Master = 340, which is 5 minutes 40 seconds.

      Calculating backwards:

      2026-05-18 12:26:30 - 00:05:40 = 2026-05-18 12:20:50
      

      The master binlog starts at exactly this timestamp:

      #260518 12:20:50 server id 2  end_log_pos 256 CRC32 0x5cb9780f
      Start: binlog v 4, server v 10.11.17-MariaDB-log created 260518 12:20:50 at startup
      

      So Seconds_Behind_Master appears to be based on the timestamp of the first event / binlog start event from the master, not on the event that the delayed SQL thread is actually waiting to apply.

      When SQL_Remaining_Delay becomes 0, Seconds_Behind_Master starts showing the correct delayed replication lag, approximately:

      Seconds_Behind_Master: 43200
      
      

      So the problem seems to affect the period while the delayed SQL thread is waiting for MASTER_DELAY to expire after a replica restart.

      Attachments

        1. generate_events.sh
          0.4 kB
          Shipra Jain
        2. mdev-39651.sh
          0.5 kB
          Shipra Jain
        3. mdev-39651.test
          0.4 kB
          Shipra Jain

        Activity

          People

            bnestere Brandon Nesterenko
            Onesoft Yury Buravtsov
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.