MariaDB Server: MDEV-9108

"GTID not in master's binlog" error with {ignore|do}_domain_ids

Details

    Description

      Consider a three-master setup in which each server has two replication channels, one to each of the other two servers. The channels were set up as follows:

      SETTING: Server_id: 1 IP: 10.0.3.223

      STOP ALL SLAVES;
      -- Channel to server 2: apply only events from domain 2
      CHANGE MASTER "S1_R2" TO
      master_host = "10.0.3.136",
      master_user = "replicator",
      master_use_gtid = slave_pos,
      master_password = "password",
      do_domain_ids = (2);
      -- Channel to server 3: apply only events from domain 3
      CHANGE MASTER "S1_R3" TO
      master_host = "10.0.3.171",
      master_user = "replicator",
      master_use_gtid = slave_pos,
      master_password = "password",
      do_domain_ids = (3);
      START ALL SLAVES;

      SETTING: Server_id: 2 IP: 10.0.3.136

      STOP ALL SLAVES;
      -- Channel to server 1: apply only events from domain 1
      CHANGE MASTER "S2_R1" TO
      master_host = "10.0.3.223",
      master_user = "replicator",
      master_use_gtid = slave_pos,
      master_password = "password",
      do_domain_ids = (1);
      -- Channel to server 3: apply only events from domain 3
      CHANGE MASTER "S2_R3" TO
      master_host = "10.0.3.171",
      master_user = "replicator",
      master_use_gtid = slave_pos,
      master_password = "password",
      do_domain_ids = (3);
      START ALL SLAVES;

      SETTING: Server_id: 3 IP: 10.0.3.171

      STOP ALL SLAVES;
      -- Channel to server 1: apply only events from domain 1
      CHANGE MASTER "S3_R1" TO
      master_host = "10.0.3.223",
      master_user = "replicator",
      master_use_gtid = slave_pos,
      master_password = "password",
      do_domain_ids = (1);
      -- Channel to server 2: apply only events from domain 2
      CHANGE MASTER "S3_R2" TO
      master_host = "10.0.3.136",
      master_user = "replicator",
      master_use_gtid = slave_pos,
      master_password = "password",
      do_domain_ids = (2);
      START ALL SLAVES;
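
      The three settings above follow one pattern: server i gets a channel named S<i>_R<j> to every other server j, restricted to domain j. A minimal Python sketch that generates these statements is shown here for illustration only. The helper name and the HOSTS mapping are hypothetical, and it assumes each server j was started with gtid_domain_id = j, which the report implies (GTID 2-2-10) but does not state.

```python
# Hypothetical helper: builds the statements from the report's settings.
# Assumes each server's gtid_domain_id equals its server_id.
HOSTS = {1: "10.0.3.223", 2: "10.0.3.136", 3: "10.0.3.171"}


def change_master_statements(server_id, hosts=HOSTS):
    """Return the SQL statements that configure one server's channels."""
    statements = ["STOP ALL SLAVES;"]
    for peer_id, peer_host in sorted(hosts.items()):
        if peer_id == server_id:
            continue  # a server does not replicate from itself
        statements.append(
            f'CHANGE MASTER "S{server_id}_R{peer_id}" TO\n'
            f'  master_host = "{peer_host}",\n'
            f'  master_user = "replicator",\n'
            f'  master_use_gtid = slave_pos,\n'
            f'  master_password = "password",\n'
            f'  do_domain_ids = ({peer_id});'
        )
    statements.append("START ALL SLAVES;")
    return statements


if __name__ == "__main__":
    # Print the configuration for server 2 (channels S2_R1 and S2_R3).
    print("\n".join(change_master_statements(2)))
```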

      After all replication channels have started:
      1. Stop server 1.
      2. Issue an INSERT, UPDATE, or DELETE on server 2.
      3. Stop server 2.
      4. Start server 1. Replication channel S1_R3 comes up immediately, because server 3 never stopped.
      5. Start server 2. Replication channel S2_R3 comes up immediately, because server 3 never stopped. However, replication channel S2_R1 does not start and reports an error like “Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 2-2-10, which is not in the master's binlog'”, indicating that server 1 does not have the most recent transaction from domain ID 2.

      Observe that replication channel S2_R1 is reporting an error about a domain ID (2) that it has been explicitly told not to track at all! S2_R1 is supposed to track only domain ID 1.
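
      The failure can be pictured with a small Python model of the connect-time check. This is a simplified sketch, not MariaDB's actual implementation: the function names, the dictionary representation of the binlog state, and every sequence number other than 2-2-10 are assumptions made for illustration.

```python
# Simplified model of the master-side check when a slave connects with GTID.
# Not MariaDB's real code; names and data layout are hypothetical.

def parse_gtid(gtid):
    """Split 'domain-server-seq' into a tuple of three integers."""
    domain, server, seq = (int(part) for part in gtid.split("-"))
    return domain, server, seq


def unservable_gtids(slave_pos, master_binlog_state):
    """Return the GTIDs in slave_pos that are ahead of the master's binlog.

    slave_pos: list of GTID strings, one per domain, sent by the slave.
    master_binlog_state: dict of domain_id -> highest seq_no in the binlog.
    Domains unknown to the master are skipped (an assumption of this model).
    """
    ahead = []
    for gtid in slave_pos:
        domain, _server, seq = parse_gtid(gtid)
        highest = master_binlog_state.get(domain)
        if highest is not None and seq > highest:
            ahead.append(gtid)
    return ahead


# Server 2 committed 2-2-10 while server 1 was down, so server 1 is one
# transaction behind in domain 2. Other sequence numbers are made up.
server1_binlog = {1: 5, 2: 9, 3: 7}
server2_slave_pos = ["1-1-5", "2-2-10", "3-3-7"]

if __name__ == "__main__":
    # The whole position is sent, so domain 2 trips the check on S2_R1.
    print(unservable_gtids(server2_slave_pos, server1_binlog))
```

      In this model the check fails on 2-2-10 even though S2_R1 was configured with do_domain_ids = (1), because the domain filter plays no part in the position that is sent.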

      The solution seems to be for MariaDB, when a replication channel starts, to send the GTID slave position only for the domain IDs that the channel should track, as defined by {ignore|do}_domain_ids.
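
      A minimal Python sketch of the proposed behavior, assuming the slave position is available as a list of GTID strings. The function name and parameters are hypothetical and only illustrate the filtering idea; they are not part of MariaDB.

```python
# Sketch of the proposed fix: restrict the position sent at connect time to
# the domains the channel tracks. Hypothetical helper, not MariaDB code.

def filter_slave_pos(slave_pos, do_domain_ids=None, ignore_domain_ids=None):
    """Keep only the GTIDs whose domain the channel is configured to track.

    An empty or missing do_domain_ids means "no do-filter". MariaDB accepts
    only one of the two lists per channel; this model applies whichever
    lists are given.
    """
    do_set = set(do_domain_ids or ())
    ignore_set = set(ignore_domain_ids or ())
    kept = []
    for gtid in slave_pos:
        domain = int(gtid.split("-")[0])
        if do_set and domain not in do_set:
            continue  # channel was not asked to replicate this domain
        if domain in ignore_set:
            continue  # channel was told to ignore this domain
        kept.append(gtid)
    return kept


if __name__ == "__main__":
    # Channel S2_R1 uses do_domain_ids = (1), so only domain 1 would be sent.
    # Sequence numbers other than 2-2-10 are made up for the example.
    print(filter_slave_pos(["1-1-5", "2-2-10", "3-3-7"], do_domain_ids=[1]))
```

      With this filter, S2_R1 would request only its domain 1 position from server 1, and the missing 2-2-10 would never enter the check.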

          Activity

            rsevero Rodrigo Severo created issue -
            rsevero Rodrigo Severo made changes -
            elenst Elena Stepanova made changes -
            elenst Elena Stepanova made changes -
            elenst Elena Stepanova made changes -
            elenst Elena Stepanova made changes -
            Labels need_feedback
            elenst Elena Stepanova made changes -
            Labels need_feedback
            elenst Elena Stepanova made changes -
            Assignee Elena Stepanova [ elenst ]
            rsevero Rodrigo Severo made changes -
            Affects Version/s 10.1.10 [ 20402 ]
            elenst Elena Stepanova made changes -
            Assignee Elena Stepanova [ elenst ] Kristian Nielsen [ knielsen ]
            DrMurx Jan Kunzmann (Inactive) made changes -
            Elkin Andrei Elkin made changes -
            elenst Elena Stepanova made changes -
            Fix Version/s 10.1 [ 16100 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 72497 ] MariaDB v4 [ 139906 ]
            michaeldg Michaël de groot made changes -
            michaeldg Michaël de groot made changes -
            vlad.radu Vlad Radu made changes -
            Labels foundation

            People

              knielsen Kristian Nielsen
              rsevero Rodrigo Severo
              Votes: 3
              Watchers: 8
