MariaDB Server / MDEV-33445

Replication: "a prior and a subsequent sequence number does exist" cannot be correct after RESET MASTER

Details

    Description

      This is an odd replication issue. The main error produced is this:

      11.3.2 63fb478f88e0061d149f5cdd3c4d21d4a35c7bd9 (Debug)

      [ERROR] Error reading packet from server: The binlog on the master is missing the GTID 0-1-3 requested by the slave (even though both a prior and a subsequent sequence number does exist), and GTID strict mode is enabled (server_errno=1236)
      

      The error appears after a RESET MASTER, which cannot be correct.
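
For readers less familiar with the notation: a MariaDB GTID is written as domain_id-server_id-sequence_number, so 0-1-3 denotes domain 0, server 1, sequence number 3. The following is a minimal Python sketch that extracts the GTID from the error text quoted above. The names `Gtid` and `parse_gtid_from_error` and the regular expression are illustrative assumptions, not part of any MariaDB tooling.

```python
import re
from typing import NamedTuple, Optional


class Gtid(NamedTuple):
    domain_id: int
    server_id: int
    seq_no: int


# Matches the "missing the GTID <domain>-<server>-<seq>" fragment of the error text.
_GTID_RE = re.compile(r"missing the GTID (\d+)-(\d+)-(\d+)")


def parse_gtid_from_error(message: str) -> Optional[Gtid]:
    """Return the GTID named in the error message, or None if none is found."""
    match = _GTID_RE.search(message)
    if match is None:
        return None
    return Gtid(*(int(part) for part in match.groups()))


ERROR_LINE = (
    "[ERROR] Error reading packet from server: The binlog on the master is "
    "missing the GTID 0-1-3 requested by the slave (even though both a prior "
    "and a subsequent sequence number does exist), and GTID strict mode is "
    "enabled (server_errno=1236)"
)

gtid = parse_gtid_from_error(ERROR_LINE)
```

Here `gtid.seq_no` is the sequence number the slave asked for, which is the value the rest of the report reasons about.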

      What is additionally odd is how the testcase behaves:
      The issue (originally seen twice in, but apparently not related to, MDEV-4991) readily reduced to a small SQL file (attached as MDEV-33445.sql), which then tolerated some manual cleanup. However, any further editing of the input SQL makes the issue non-reproducible, which suggests that the length of the input (or a syntax failure somewhere in it) is significant.
      This is especially true for the 3rd line (CREATE TABLE t1...), where removing the EOL comment makes the issue non-reproducible.
      Furthermore, the issue can only be replayed using the pquery client: all attempts with the CLI and MTR fail to reproduce it. The issue is not sporadic.

      Nothing special is required on the master (--no-defaults --log_bin=binlog --server_id=1); however, gtid_strict_mode is required on the slave (i.e. --no-defaults --gtid_strict_mode=1 --server_id=2).
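
The two option sets above can be written out side by side as follows. This is only a sketch: the options come from the report, while the binary name `mariadbd` and the helper `command_line` are assumptions, and settings such as the data directory, port and socket are deliberately left out because the report does not give them.

```python
# Option sets taken from the report; everything else (binary name, paths,
# ports) is left out and would have to be supplied per setup.
MASTER_OPTS = ["--no-defaults", "--log_bin=binlog", "--server_id=1"]
SLAVE_OPTS = ["--no-defaults", "--gtid_strict_mode=1", "--server_id=2"]


def command_line(opts, binary="mariadbd"):
    """Join a server binary name (assumed) with its options for display."""
    return " ".join([binary, *opts])


master_cmd = command_line(MASTER_OPTS)
slave_cmd = command_line(SLAVE_OPTS)
```

Apart from the server IDs, the difference between the two is that only the master enables the binary log and only the slave sets gtid_strict_mode.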

      The issue reproduces on an 11.3 debug build from 27 Dec 23, indicating it is not related to MDEV-4991. However, a recent (6 Feb 24) 11.3 optimized build does not reproduce the issue.
      Other versions may also be affected.

      The - possibly concerning - bug here is this part of the error:

      even though both a prior and a subsequent sequence number does exist
      

      This cannot be correct given the RESET MASTER. Even if the file were still in use, the first part of the message (missing the GTID 0-1-3 requested) cannot be correct in combination with "a subsequent sequence number does exist".
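
The suspicion can be made concrete with a toy model. The sketch below is not the server's code. It assumes, as a simplification, that the master's binlog can be summarised as the set of sequence numbers it holds for one replication domain, and that this set is empty immediately after RESET MASTER. The function name `describe_request` and its return strings are hypothetical.

```python
def describe_request(binlog_seq_nos, requested):
    """Classify a slave's requested sequence number against the master's binlog.

    binlog_seq_nos: sequence numbers present in the master's binlog for one
    domain (a simplification of the real binlog state).
    """
    if requested in binlog_seq_nos:
        return "present"
    has_prior = any(seq < requested for seq in binlog_seq_nos)
    has_subsequent = any(seq > requested for seq in binlog_seq_nos)
    if has_prior and has_subsequent:
        return "missing, with prior and subsequent"
    if has_subsequent:
        return "missing, with subsequent only"
    if has_prior:
        return "missing, with prior only"
    return "missing, binlog empty for this domain"


# Immediately after RESET MASTER the modelled binlog holds nothing.
after_reset = describe_request(set(), 3)

# The wording in the error would instead require a hole, e.g. 1, 2, 4 but no 3.
with_hole = describe_request({1, 2, 4}, 3)
```

Under this simplification, the wording quoted in the report corresponds to the second case (a hole at sequence number 3 with entries on both sides), not to the empty state expected right after a RESET MASTER.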

      It could be that the error message is simply incorrect, but the code needs checking as the bug could be more serious. The testcase length or syntax oddity also needs clarification.

      I can reproduce the issue readily on my end, so when a patch is available I can retest.

          People

            Assignee: Andrei Elkin
            Reporter: Roel Van de Paar