MariaDB Server / MDEV-9107

GTID Slave Pos of untracked domain IDs being updated

Details

    Description

      Consider a 3-master setup in which each server has 2 replication channels, one to each of the other 2 servers. The replication channels were set up with:

      SETTING: Server_id: 1 IP: 10.0.3.223

      STOP ALL SLAVES;
      CHANGE MASTER "S1_R2" TO
      master_host = "10.0.3.136",
      master_user = "replicator",
      master_use_gtid = slave_pos,
      master_password = "password",
      do_domain_ids = (2);
      CHANGE MASTER "S1_R3" TO
      master_host = "10.0.3.171",
      master_user = "replicator",
      master_use_gtid = slave_pos,
      master_password = "password",
      do_domain_ids = (3);
      START ALL SLAVES;

      SETTING: Server_id: 2 IP: 10.0.3.136

      STOP ALL SLAVES;
      CHANGE MASTER "S2_R1" TO
      master_host = "10.0.3.223",
      master_user = "replicator",
      master_use_gtid = slave_pos,
      master_password = "password",
      do_domain_ids = (1);
      CHANGE MASTER "S2_R3" TO
      master_host = "10.0.3.171",
      master_user = "replicator",
      master_use_gtid = slave_pos,
      master_password = "password",
      do_domain_ids = (3);
      START ALL SLAVES;

      SETTING: Server_id: 3 IP: 10.0.3.171

      STOP ALL SLAVES;
      CHANGE MASTER "S3_R1" TO
      master_host = "10.0.3.223",
      master_user = "replicator",
      master_use_gtid = slave_pos,
      master_password = "password",
      do_domain_ids = (1);
      CHANGE MASTER "S3_R2" TO
      master_host = "10.0.3.136",
      master_user = "replicator",
      master_use_gtid = slave_pos,
      master_password = "password",
      do_domain_ids = (2);
      START ALL SLAVES;
       

      After initially starting all replications:
      1. Stop replication channel S1_R2;
      2. Take note of GTID Slave Pos for domain ID 2 on server 1;
      3. Issue some INSERT, UPDATE, or DELETE statements on server 2;
      4. Take note of GTID Slave Pos for domain ID 2 on server 1;

      Observe that the GTIDs from steps 2 and 4 are different. Replication channel S1_R3 updated the GTID Slave Pos of domain ID 2 despite having been configured to track only domain ID 3!

      When replication channel S1_R2 is brought back online, the changes that occurred in step 3 will be lost on server 1, because the GTID Slave Pos for domain ID 2 has already advanced past them.
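
      The reproduction steps above can be sketched as the following statements. This is a minimal sketch, not taken from the original report: the table test.t1 is a hypothetical placeholder for any replicated table, and the channel name and addresses are the ones from the setup above.

      ```sql
      -- Run on server 1 (10.0.3.223).
      -- Step 1: stop only the channel that tracks domain ID 2.
      STOP SLAVE 'S1_R2';

      -- Step 2: record the current slave position (one GTID per domain).
      SELECT @@GLOBAL.gtid_slave_pos;

      -- Step 3: on server 2 (10.0.3.136), run any write, for example
      -- against the hypothetical table test.t1:
      --   INSERT INTO test.t1 (id) VALUES (1);

      -- Step 4: back on server 1, read the position again and compare
      -- the entry for domain ID 2 with the value noted in step 2.
      SELECT @@GLOBAL.gtid_slave_pos;

      -- Optional: check which domain IDs each channel is configured to
      -- track (see the Replicate_Do_Domain_Ids column).
      SHOW ALL SLAVES STATUS;
      ```

      If the entry for domain ID 2 changes between steps 2 and 4 while S1_R2 is stopped, the behaviour described in this report has been reproduced.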

      The solution for this issue seems to be to make each replication channel thread update only the GTID Slave Pos for the domain IDs it should track, as defined by {ignore|do}_domain_ids.

          Activity

            rsevero Rodrigo Severo added a comment -

            I believe there was some expectation that fixing MDEV-9033 would fix this issue, but unfortunately it did not.

            esa.korhonen Esa Korhonen added a comment -

            I have now noticed this bug as well, although in a simpler setting. All it needs is a master server which changes its gtid_domain_id, and a slave which is only replicating the old domain (with the DO_DOMAIN_IDS-setting). The slave will update its gtid_slave_pos to include the new domain (5), even when in reality it does not add the events from the new domain:

            @@gtid_binlog_pos |0-3001-7139
            @@gtid_slave_pos |0-3001-7139,5-3001-10
            @@gtid_current_pos |0-3001-7139,5-3001-10

            This means that even gtid_current_pos cannot be trusted to be correct. gtid_binlog_pos and gtid_binlog_state do seem to be correct, but these require log_slave_updates. This has implications for the failover functionality in MaxScale. Server version: 10.2.6
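
            The values quoted above can be read side by side with a query along these lines. This is only a sketch for comparing the variables named in this comment; the column aliases are arbitrary.

            ```sql
            -- Run on the slave. A domain that appears in gtid_slave_pos but
            -- is not in the channel's DO_DOMAIN_IDS list is the symptom
            -- described in this comment.
            SELECT @@GLOBAL.gtid_binlog_pos   AS binlog_pos,
                   @@GLOBAL.gtid_slave_pos    AS slave_pos,
                   @@GLOBAL.gtid_current_pos  AS current_pos,
                   @@GLOBAL.gtid_binlog_state AS binlog_state,
                   @@GLOBAL.log_slave_updates AS log_slave_updates;
            ```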

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            Hi esa.korhonen!

            In your case you can just update gtid_slave_pos so that the slave gets all the events from the master. Let's say the master added 3 events in the new domain ID X, and the old domain ID was Y. You can simply set gtid_slave_pos = "Y-server_id-seq_no", so that the slave forgets all tracking of domain ID X and you will get all the events.
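
            As a rough sketch of the workaround suggested in this comment, using the values from the example output earlier in the thread (old domain ID 0, new domain ID 5). Whether the slave then receives events from the new domain is the claim made in this comment and still depends on the channel's domain ID filters.

            ```sql
            -- Run on the slave. gtid_slave_pos can only be changed while the
            -- slave threads are stopped.
            STOP ALL SLAVES;

            -- Keep only the entry for the old domain (0-3001-7139) and drop
            -- the entry for the new domain (5-3001-10).
            SET GLOBAL gtid_slave_pos = '0-3001-7139';

            START ALL SLAVES;
            ```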

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            Hi rsevero

            According to the documentation, do_domain_ids is working as expected, so I am closing this issue as Won't Fix.

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            Test case patch mdev-9107.diff

            People

              sachin.setiya.007 Sachin Setiya (Inactive)
              rsevero Rodrigo Severo