Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-8507

gtid_ignore_duplicates doesn't work with log_slave_updates and multi-source replication

Details

    Description

      When log_slave_updates is enabled, gtid_ignore_duplicates doesn't appear to work with multi-source replication.

      When using a circular replication topology, this means some transactions may be executed twice.

      Let's say we have a topology like this using multi-source replication:

      master101 <=> master102 <=> master105 <=> master101

      master101 and master102 are running MariaDB 10.1.5.

      master105 is running MariaDB 10.0.20.

      Let's insert some data on master101:

      MariaDB [(none)]> INSERT INTO db1.test_table (str) VALUES ('str8');
      Query OK, 1 row affected (0.02 sec)

      Now lets see if the slaves are functional.

      master101 looks OK:

      MariaDB [(none)]> show slave 'master102' status\G
      *************************** 1. row ***************************
                     Slave_IO_State: Waiting for master to send event
                        Master_Host: 192.168.1.102
                        Master_User: repl
                        Master_Port: 3306
                      Connect_Retry: 60
                    Master_Log_File: localhost-bin.000002
                Read_Master_Log_Pos: 485
                     Relay_Log_File: localhost-relay-bin-master102.000002
                      Relay_Log_Pos: 621
              Relay_Master_Log_File: localhost-bin.000002
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                    Replicate_Do_DB: 
                Replicate_Ignore_DB: 
                 Replicate_Do_Table: 
             Replicate_Ignore_Table: 
            Replicate_Wild_Do_Table: 
        Replicate_Wild_Ignore_Table: 
                         Last_Errno: 0
                         Last_Error: 
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 485
                    Relay_Log_Space: 932
                    Until_Condition: None
                     Until_Log_File: 
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File: 
                 Master_SSL_CA_Path: 
                    Master_SSL_Cert: 
                  Master_SSL_Cipher: 
                     Master_SSL_Key: 
              Seconds_Behind_Master: 0
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error: 
                     Last_SQL_Errno: 0
                     Last_SQL_Error: 
        Replicate_Ignore_Server_Ids: 
                   Master_Server_Id: 2
                     Master_SSL_Crl: 
                 Master_SSL_Crlpath: 
                         Using_Gtid: Current_Pos
                        Gtid_IO_Pos: 1-1-10,3-3-8
            Replicate_Do_Domain_Ids: 
        Replicate_Ignore_Domain_Ids: 
                      Parallel_Mode: conservative
      1 row in set (0.00 sec)
       
      MariaDB [(none)]> show slave 'master105' status\G
      *************************** 1. row ***************************
                     Slave_IO_State: Waiting for master to send event
                        Master_Host: 192.168.1.105
                        Master_User: repl
                        Master_Port: 3306
                      Connect_Retry: 60
                    Master_Log_File: localhost-bin.000004
                Read_Master_Log_Pos: 501
                     Relay_Log_File: localhost-relay-bin-master105.000002
                      Relay_Log_Pos: 637
              Relay_Master_Log_File: localhost-bin.000004
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                    Replicate_Do_DB: 
                Replicate_Ignore_DB: 
                 Replicate_Do_Table: 
             Replicate_Ignore_Table: 
            Replicate_Wild_Do_Table: 
        Replicate_Wild_Ignore_Table: 
                         Last_Errno: 0
                         Last_Error: 
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 501
                    Relay_Log_Space: 948
                    Until_Condition: None
                     Until_Log_File: 
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File: 
                 Master_SSL_CA_Path: 
                    Master_SSL_Cert: 
                  Master_SSL_Cipher: 
                     Master_SSL_Key: 
              Seconds_Behind_Master: 0
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error: 
                     Last_SQL_Errno: 0
                     Last_SQL_Error: 
        Replicate_Ignore_Server_Ids: 
                   Master_Server_Id: 3
                     Master_SSL_Crl: 
                 Master_SSL_Crlpath: 
                         Using_Gtid: Slave_Pos
                        Gtid_IO_Pos: 1-1-10,3-3-8
            Replicate_Do_Domain_Ids: 
        Replicate_Ignore_Domain_Ids: 
                      Parallel_Mode: conservative
      1 row in set (0.00 sec)

      master102 looks OK too:

      MariaDB [(none)]> show slave 'master101' status\G
      *************************** 1. row ***************************
                     Slave_IO_State: Waiting for master to send event
                        Master_Host: 192.168.1.101
                        Master_User: repl
                        Master_Port: 3306
                      Connect_Retry: 60
                    Master_Log_File: localhost-bin.000002
                Read_Master_Log_Pos: 485
                     Relay_Log_File: localhost-relay-bin-master101.000002
                      Relay_Log_Pos: 776
              Relay_Master_Log_File: localhost-bin.000002
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                    Replicate_Do_DB: 
                Replicate_Ignore_DB: 
                 Replicate_Do_Table: 
             Replicate_Ignore_Table: 
            Replicate_Wild_Do_Table: 
        Replicate_Wild_Ignore_Table: 
                         Last_Errno: 0
                         Last_Error: 
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 485
                    Relay_Log_Space: 1087
                    Until_Condition: None
                     Until_Log_File: 
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File: 
                 Master_SSL_CA_Path: 
                    Master_SSL_Cert: 
                  Master_SSL_Cipher: 
                     Master_SSL_Key: 
              Seconds_Behind_Master: 0
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error: 
                     Last_SQL_Errno: 0
                     Last_SQL_Error: 
        Replicate_Ignore_Server_Ids: 
                   Master_Server_Id: 1
                     Master_SSL_Crl: 
                 Master_SSL_Crlpath: 
                         Using_Gtid: Current_Pos
                        Gtid_IO_Pos: 1-1-10,3-3-8
            Replicate_Do_Domain_Ids: 
        Replicate_Ignore_Domain_Ids: 
                      Parallel_Mode: conservative
      1 row in set (0.00 sec)
       
      MariaDB [(none)]> show slave 'master105' status\G
      *************************** 1. row ***************************
                     Slave_IO_State: Waiting for master to send event
                        Master_Host: 192.168.1.105
                        Master_User: repl
                        Master_Port: 3306
                      Connect_Retry: 60
                    Master_Log_File: localhost-bin.000004
                Read_Master_Log_Pos: 501
                     Relay_Log_File: localhost-relay-bin-master105.000002
                      Relay_Log_Pos: 792
              Relay_Master_Log_File: localhost-bin.000004
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                    Replicate_Do_DB: 
                Replicate_Ignore_DB: 
                 Replicate_Do_Table: 
             Replicate_Ignore_Table: 
            Replicate_Wild_Do_Table: 
        Replicate_Wild_Ignore_Table: 
                         Last_Errno: 0
                         Last_Error: 
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 501
                    Relay_Log_Space: 1103
                    Until_Condition: None
                     Until_Log_File: 
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File: 
                 Master_SSL_CA_Path: 
                    Master_SSL_Cert: 
                  Master_SSL_Cipher: 
                     Master_SSL_Key: 
              Seconds_Behind_Master: 0
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error: 
                     Last_SQL_Errno: 0
                     Last_SQL_Error: 
        Replicate_Ignore_Server_Ids: 
                   Master_Server_Id: 3
                     Master_SSL_Crl: 
                 Master_SSL_Crlpath: 
                         Using_Gtid: Slave_Pos
                        Gtid_IO_Pos: 1-1-10,3-3-8
            Replicate_Do_Domain_Ids: 
        Replicate_Ignore_Domain_Ids: 
                      Parallel_Mode: conservative
      1 row in set (0.00 sec)

      master105 got a duplicate key error from master102:

      MariaDB [(none)]> show slave 'master101' status\G
      *************************** 1. row ***************************
                     Slave_IO_State: Waiting for master to send event
                        Master_Host: 192.168.1.101
                        Master_User: repl
                        Master_Port: 3306
                      Connect_Retry: 60
                    Master_Log_File: localhost-bin.000002
                Read_Master_Log_Pos: 485
                     Relay_Log_File: localhost-relay-bin-master101.000002
                      Relay_Log_Pos: 776
              Relay_Master_Log_File: localhost-bin.000002
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                    Replicate_Do_DB: 
                Replicate_Ignore_DB: 
                 Replicate_Do_Table: 
             Replicate_Ignore_Table: 
            Replicate_Wild_Do_Table: 
        Replicate_Wild_Ignore_Table: 
                         Last_Errno: 0
                         Last_Error: 
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 485
                    Relay_Log_Space: 1087
                    Until_Condition: None
                     Until_Log_File: 
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File: 
                 Master_SSL_CA_Path: 
                    Master_SSL_Cert: 
                  Master_SSL_Cipher: 
                     Master_SSL_Key: 
              Seconds_Behind_Master: 0
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error: 
                     Last_SQL_Errno: 0
                     Last_SQL_Error: 
        Replicate_Ignore_Server_Ids: 
                   Master_Server_Id: 1
                     Master_SSL_Crl: 
                 Master_SSL_Crlpath: 
                         Using_Gtid: Slave_Pos
                        Gtid_IO_Pos: 1-1-10,3-3-8
      1 row in set (0.00 sec)
       
      MariaDB [(none)]> show slave 'master102' status\G
      *************************** 1. row ***************************
                     Slave_IO_State: Waiting for master to send event
                        Master_Host: 192.168.1.102
                        Master_User: repl
                        Master_Port: 3306
                      Connect_Retry: 60
                    Master_Log_File: localhost-bin.000002
                Read_Master_Log_Pos: 485
                     Relay_Log_File: localhost-relay-bin-master102.000002
                      Relay_Log_Pos: 621
              Relay_Master_Log_File: localhost-bin.000002
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: No
                    Replicate_Do_DB: 
                Replicate_Ignore_DB: 
                 Replicate_Do_Table: 
             Replicate_Ignore_Table: 
            Replicate_Wild_Do_Table: 
        Replicate_Wild_Ignore_Table: 
                         Last_Errno: 1062
                         Last_Error: Could not execute Write_rows_v1 event on table db1.test_table; Duplicate entry '8' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log localhost-bin.000002, end_log_pos 458
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 330
                    Relay_Log_Space: 1087
                    Until_Condition: None
                     Until_Log_File: 
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File: 
                 Master_SSL_CA_Path: 
                    Master_SSL_Cert: 
                  Master_SSL_Cipher: 
                     Master_SSL_Key: 
              Seconds_Behind_Master: NULL
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error: 
                     Last_SQL_Errno: 1062
                     Last_SQL_Error: Could not execute Write_rows_v1 event on table db1.test_table; Duplicate entry '8' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log localhost-bin.000002, end_log_pos 458
        Replicate_Ignore_Server_Ids: 
                   Master_Server_Id: 2
                     Master_SSL_Crl: 
                 Master_SSL_Crlpath: 
                         Using_Gtid: Slave_Pos
                        Gtid_IO_Pos: 1-1-10,3-3-8
      1 row in set (0.00 sec)

      Looking at the relay log on master105 from master101, we see:

      #150720 13:29:57 server id 1  end_log_pos 368   GTID 1-1-10
      /*!100001 SET @@session.gtid_domain_id=1*//*!*/;
      /*!100001 SET @@session.server_id=1*//*!*/;
      /*!100001 SET @@session.gtid_seq_no=10*//*!*/;
      BEGIN
      /*!*/;
      # at 659
      # at 710
      #150720 13:29:57 server id 1  end_log_pos 419   Table_map: `db1`.`test_table` mapped to number 18
      #150720 13:29:57 server id 1  end_log_pos 458   Write_rows: table id 18 flags: STMT_END_F
      ### INSERT INTO `db1`.`test_table`
      ### SET
      ###   @1=8
      ###   @2='str8'
      # at 749
      #150720 13:29:57 server id 1  end_log_pos 485   Xid = 39
      COMMIT/*!*/;
      DELIMITER ;
      # End of log file
      ROLLBACK /* added by mysqlbinlog */;
      /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
      /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

      Looking at the relay log on master105 from master102, we see:

      #150720 13:29:57 server id 1  end_log_pos 368   GTID 1-1-10
      /*!100001 SET @@session.gtid_domain_id=1*//*!*/;
      /*!100001 SET @@session.server_id=1*//*!*/;
      /*!100001 SET @@session.gtid_seq_no=10*//*!*/;
      BEGIN
      /*!*/;
      # at 659
      # at 710
      #150720 13:29:57 server id 1  end_log_pos 419   Table_map: `db1`.`test_table` mapped to number 18
      #150720 13:29:57 server id 1  end_log_pos 458   Write_rows: table id 18 flags: STMT_END_F
      ### INSERT INTO `db1`.`test_table`
      ### SET
      ###   @1=8
      ###   @2='str8'
      # at 749
      #150720 13:29:57 server id 1  end_log_pos 485   Xid = 32
      COMMIT/*!*/;
      DELIMITER ;
      # End of log file
      ROLLBACK /* added by mysqlbinlog */;
      /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
      /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

      The GTID information appears to be identical between the relay logs. However, MariaDB is executing both transactions, rather than ignoring one of them.

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova added a comment - - edited

            I am not getting the error. Do you observe it all the time, or was it a one-time thing?
            Can you provide a full clean set of binlogs and general logs (create table + insert into the table) so that I could see what's different on my side?
            And please attach the cnf files, maybe there is something in there that I don't take into account.
            Thanks.

            elenst Elena Stepanova added a comment - - edited I am not getting the error. Do you observe it all the time, or was it a one-time thing? Can you provide a full clean set of binlogs and general logs (create table + insert into the table) so that I could see what's different on my side? And please attach the cnf files, maybe there is something in there that I don't take into account. Thanks.
            mysolo Thierry Laurier added a comment - - edited

            it's my initial bug testing with 10.0.17 and 10.1.5 reported to mariadb support.

            I'm observe on randomize insert with watch command.
            Stop all slave/start all slave temporary resolve a error.

            3 server mariadb 10.1.5 datamod01, datamod02, datamod03

            root@datamod01 12:24:10 [test]> select @@global.gtid_domain_id,@@global.server_id,@@global.gtid_ignore_duplicates \G
                    @@global.gtid_domain_id: 1
                         @@global.server_id: 101
            @@global.gtid_ignore_duplicates: 1
             
            root@datamod02 12:03:11 [test]> select @@global.gtid_domain_id,@@global.server_id,@@global.gtid_ignore_duplicates \G
                    @@global.gtid_domain_id: 2
                         @@global.server_id: 102
            @@global.gtid_ignore_duplicates: 1
             
            root@datamod03 12:03:40 [test]> select @@global.gtid_domain_id,@@global.server_id,@@global.gtid_ignore_duplicates \G
                    @@global.gtid_domain_id: 3
                         @@global.server_id: 103
            @@global.gtid_ignore_duplicates: 1

            root@datamod01 12:24:27 [test]>  show slave hosts;
            +-----------+----------+------+-----------+
            | Server_id | Host     | Port | Master_id |
            +-----------+----------+------+-----------+
            |       103 | sqlmod03 | 3306 |       101 |
            |       102 | sqlmod02 | 3306 |       101 |
            +-----------+----------+------+-----------+
             
            root@datamod02 12:24:48 [test]>  show slave hosts;
            +-----------+----------+------+-----------+
            | Server_id | Host     | Port | Master_id |
            +-----------+----------+------+-----------+
            |       103 | sqlmod03 | 3306 |       102 |
            |       101 | sqlmod01 | 3306 |       102 |
            +-----------+----------+------+-----------+
             
            root@datamod03 12:25:31 [test]> show slave hosts;
            +-----------+----------+------+-----------+
            | Server_id | Host     | Port | Master_id |
            +-----------+----------+------+-----------+
            |       101 | sqlmod01 | 3306 |       103 |
            |       102 | sqlmod02 | 3306 |       103 |
            +-----------+----------+------+-----------+

            root@datamod01 13:21:49 [test]> stop all slaves;
            Query OK, 0 rows affected, 2 warnings (0.01 sec)
             
            root@datamod01 13:21:51 [test]> flush slave 'slave-of-sqlmod02';
            Query OK, 0 rows affected (0.00 sec)
             
            root@datamod01 13:22:00 [test]> flush slave 'slave-of-sqlmod03';
            Query OK, 0 rows affected (0.00 sec)
             
            root@datamod01 13:28:18 [test]> flush master;
            Query OK, 0 rows affected (0.00 sec)
             
            root@datamod01 13:29:20 [test]> change master 'slave-of-sqlmod02' to master_use_gtid=slave_pos ;
            Query OK, 0 rows affected (0.00 sec)
            root@datamod01 13:30:13 [test]> change master 'slave-of-sqlmod03' to master_use_gtid=slave_pos ;
            Query OK, 0 rows affected (0.01 sec)
             
            root@datamod01 13:31:52 [test]> start all slaves;
            Query OK, 0 rows affected, 2 warnings (0.01 sec)

            root@datamod02 12:25:48 [test]>  stop all slaves;
            Query OK, 0 rows affected, 2 warnings (0.01 sec)
             
            root@datamod02 13:22:23 [test]>  flush slave 'slave-of-sqlmod01';
            Query OK, 0 rows affected (0.00 sec)
             
            root@datamod02 13:22:40 [test]>  flush slave 'slave-of-sqlmod03';
            Query OK, 0 rows affected (0.00 sec)
             
            root@datamod02 13:26:42 [test]> flush master;
            Query OK, 0 rows affected (0.00 sec)
             
            root@datamod02 13:28:36 [test]> change master 'slave-of-sqlmod01' to master_use_gtid=slave_pos ;
            Query OK, 0 rows affected (0.00 sec)
             
            root@datamod02 13:32:30 [test]> change master 'slave-of-sqlmod03' to master_use_gtid=slave_pos ;
            Query OK, 0 rows affected (0.00 sec)
             
            root@datamod02 13:32:34 [test]> start all slaves;
            Query OK, 0 rows affected, 2 warnings (0.01 sec)

            root@datamod03 12:29:05 [test]>  stop all slaves;
            Query OK, 0 rows affected, 2 warnings (0.00 sec)
             
            root@datamod03 13:22:25 [test]>  flush slave 'slave-of-sqlmod01';
            Query OK, 0 rows affected (0.00 sec)
             
            root@datamod03 13:22:53 [test]>  flush slave 'slave-of-sqlmod02';
            Query OK, 0 rows affected (0.00 sec)
             
            root@datamod03 13:26:44 [test]> flush master;
            Query OK, 0 rows affected (0.01 sec)
             
            root@datamod03 13:28:37 [test]> change master 'slave-of-sqlmod01' to master_use_gtid=slave_pos ;
            Query OK, 0 rows affected (0.01 sec)
             
            root@datamod03 13:33:11 [test]> change master 'slave-of-sqlmod02' to master_use_gtid=slave_pos ;
            Query OK, 0 rows affected (0.00 sec)
             
            root@datamod03 13:33:16 [test]> start all slaves;
            Query OK, 0 rows affected, 2 warnings (0.01 sec)

            On all server :

            show create table start_dump\G
                   Table: start_dump
            Create Table: CREATE TABLE `start_dump` (
              `ID` tinyint(3) unsigned NOT NULL DEFAULT '1',
              `DUMP_OK` tinyint(1) DEFAULT '0',
              PRIMARY KEY (`ID`)
            ) ENGINE=MEMORY DEFAULT CHARSET=latin1 MAX_ROWS=1

            Before test:

            Every 1.0s: echo hostname:datamod01; mysql -uroot -e 'show all slaves status\G' |grep -E '(Connection_name:|Master_Lo|Last_.*_Error|Running|Gtid)'                                                                                                             

            Mon Jul 20 14:16:46 2015
             
            hostname:datamod01
                          Connection_name: slave-of-sqlmod02
                          Master_Log_File: bin-log.000001
                      Read_Master_Log_Pos: 310
                    Relay_Master_Log_File: bin-log.000001
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: Yes
                      Exec_Master_Log_Pos: 310
                            Last_IO_Error:
                           Last_SQL_Error:
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 1-101-399,2-102-41,3-103-72
                           Gtid_Slave_Pos: 1-101-399,2-102-41,3-103-72
                          Connection_name: slave-of-sqlmod03
                          Master_Log_File: bin-log.000001
                      Read_Master_Log_Pos: 310
                    Relay_Master_Log_File: bin-log.000001
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: Yes
                      Exec_Master_Log_Pos: 310
                            Last_IO_Error:
                           Last_SQL_Error:
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 1-101-399,2-102-41,3-103-72
                           Gtid_Slave_Pos: 1-101-399,2-102-41,3-103-72

            Every 1.0s: echo hostname:datamod02; mysql -uroot -e 'show all slaves status\G' |grep -E '(Connection_name:|Master_Lo|Last_.*_Error|Running|Gtid)'                                                                                                             

            Mon Jul 20 14:17:00 2015
             
            hostname:datamod02
                          Connection_name: slave-of-sqlmod01
                          Master_Log_File: bin-log.000001
                      Read_Master_Log_Pos: 310
                    Relay_Master_Log_File: bin-log.000001
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: Yes
                      Exec_Master_Log_Pos: 310
                            Last_IO_Error:
                           Last_SQL_Error:
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 1-101-399,2-102-41,3-103-72
                           Gtid_Slave_Pos: 1-101-399,2-102-41,3-103-72
                          Connection_name: slave-of-sqlmod03
                          Master_Log_File: bin-log.000001
                      Read_Master_Log_Pos: 310
                    Relay_Master_Log_File: bin-log.000001
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: Yes
                      Exec_Master_Log_Pos: 310
                            Last_IO_Error:
                           Last_SQL_Error:
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 1-101-399,2-102-41,3-103-72
                           Gtid_Slave_Pos: 1-101-399,2-102-41,3-103-72

            Every 1.0s: echo hostname:datamod03; mysql -uroot -e 'show all slaves status\G' |grep -E '(Connection_name:|Master_Lo|Last_.*_Error|Running|Gtid)'                                                                                                             Mon Jul 20 14:17:08 2015
             
            hostname:datamod03
                          Connection_name: slave-of-sqlmod01
                          Master_Log_File: bin-log.000001
                      Read_Master_Log_Pos: 310
                    Relay_Master_Log_File: bin-log.000001
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: Yes
                      Exec_Master_Log_Pos: 310
                            Last_IO_Error:
                           Last_SQL_Error:
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 1-101-399,2-102-41,3-103-72
                           Gtid_Slave_Pos: 1-101-399,2-102-41,3-103-72
                          Connection_name: slave-of-sqlmod02
                          Master_Log_File: bin-log.000001
                      Read_Master_Log_Pos: 310
                    Relay_Master_Log_File: bin-log.000001
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: Yes
                      Exec_Master_Log_Pos: 310
                            Last_IO_Error:
                           Last_SQL_Error:
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 1-101-399,2-102-41,3-103-72
                           Gtid_Slave_Pos: 1-101-399,2-102-41,3-103-72

            test:

            root@datamod01:/srv/mariadb# watch -n 1 "mysql -uroot -e 'truncate table test.start_dump ; insert into test.start_dump set DUMP_OK=1;'"

            after many truncate, insert:

            Every 1.0s: echo hostname:datamod01; mysql -uroot -e 'show all slaves status\G' |grep -E '(Connection_name:|Master_Lo|Last_.*_Error|Running|Gtid)'                                                                                                             Mon Jul 20 16:13:25 2015

            hostname:datamod01
                          Connection_name: slave-of-sqlmod02
                          Master_Log_File: bin-log.000001
                      Read_Master_Log_Pos: 4968
                    Relay_Master_Log_File: bin-log.000001
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: Yes
                      Exec_Master_Log_Pos: 4968
                            Last_IO_Error:
                           Last_SQL_Error:
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 1-101-433,2-102-41,3-103-72
                           Gtid_Slave_Pos: 1-101-433,2-102-41,3-103-72
                          Connection_name: slave-of-sqlmod03
                          Master_Log_File: bin-log.000001
                      Read_Master_Log_Pos: 4968
                    Relay_Master_Log_File: bin-log.000001
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: Yes
                      Exec_Master_Log_Pos: 4968
                            Last_IO_Error:
                           Last_SQL_Error:
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 1-101-433,2-102-41,3-103-72
                           Gtid_Slave_Pos: 1-101-433,2-102-41,3-103-72

            Every 1.0s: echo hostname:datamod02; mysql -uroot -e 'show all slaves status\G' |grep -E '(Connection_name:|Master_Lo|Last_.*_Error|Running|Gtid)'                                                                                                             

            Mon Jul 20 16:13:52 2015
             
            hostname:datamod02
                          Connection_name: slave-of-sqlmod01
                          Master_Log_File: bin-log.000001
                      Read_Master_Log_Pos: 68494
                    Relay_Master_Log_File: bin-log.000001
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: No
                      Exec_Master_Log_Pos: 63351
                            Last_IO_Error:
                           Last_SQL_Error: Could not execute Write_rows_v1 event on table test.start_dump; Duplicate entry '1' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log bin-log.000001, end_log_pos 63469
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 1-101-433,2-102-41,3-103-72
                           Gtid_Slave_Pos: 1-101-433,2-102-41,3-103-72
                          Connection_name: slave-of-sqlmod03
                          Master_Log_File: bin-log.000001
                      Read_Master_Log_Pos: 4968
                    Relay_Master_Log_File: bin-log.000001
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: Yes
                      Exec_Master_Log_Pos: 4968
                            Last_IO_Error:
                           Last_SQL_Error:
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 1-101-433,2-102-41,3-103-72
                           Gtid_Slave_Pos: 1-101-433,2-102-41,3-103-72

            Every 1.0s: echo hostname:datamod03; mysql -uroot -e 'show all slaves status\G' |grep -E '(Connection_name:|Master_Lo|Last_.*_Error|Running|Gtid)'                                                                                                             Mon Jul 20 16:14:05 2015

            hostname:datamod03
                          Connection_name: slave-of-sqlmod01
                          Master_Log_File: bin-log.000001
                      Read_Master_Log_Pos: 68494
                    Relay_Master_Log_File: bin-log.000001
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: Yes
                      Exec_Master_Log_Pos: 68494
                            Last_IO_Error:
                           Last_SQL_Error:
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 1-101-433,2-102-41,3-103-72
                           Gtid_Slave_Pos: 1-101-433,2-102-41,3-103-72
                          Connection_name: slave-of-sqlmod02
                          Master_Log_File: bin-log.000001
                      Read_Master_Log_Pos: 4968
                    Relay_Master_Log_File: bin-log.000001
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: Yes
                      Exec_Master_Log_Pos: 4968
                            Last_IO_Error:
                           Last_SQL_Error:
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 1-101-433,2-102-41,3-103-72
                           Gtid_Slave_Pos: 1-101-433,2-102-41,3-103-72

            show master status:

            root@datamod01 16:25:22 [(none)]> show master status \G
                        File: bin-log.000001
                    Position: 68494
                Binlog_Do_DB: 
            Binlog_Ignore_DB: 
            1 row in set (0.00 sec)
             
            root@datamod02 16:26:13 [(none)]> show master status \G
                        File: bin-log.000001
                    Position: 4968
                Binlog_Do_DB: 
            Binlog_Ignore_DB: 
             
            root@datamod03 16:26:32 [(none)]> show master status \G
                        File: bin-log.000001
                    Position: 4968
                Binlog_Do_DB: 
            Binlog_Ignore_DB: 

            root@datamod02:/srv/mariadb# tail -4 ../data/sqlmod/log/mysqld.err 
            2015-07-20 13:33:42 140478747219712 [Note] Master 'slave-of-sqlmod03': Slave I/O thread: connected to master 'rsqlmod@datamod03:3306',replication starts at GTID position '1-101-399,2-102-41,3-103-72'
            2015-07-20 16:09:31 140478747518720 [ERROR] Master 'slave-of-sqlmod01': Slave SQL: Could not execute Write_rows_v1 event on table test.start_dump; Duplicate entry '1' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log bin-log.000001, end_log_pos 63469, Gtid 1-101-401, Internal MariaDB error code: 1062
            2015-07-20 16:09:31 140478747518720 [Warning] Master 'slave-of-sqlmod01': Slave: Duplicate entry '1' for key 'PRIMARY' Error_code: 1062
            2015-07-20 16:09:31 140478747518720 [ERROR] Master 'slave-of-sqlmod01': Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'bin-log.000001' position 63351; GTID position '1-101-400,2-102-41,3-103-72'

            on datanode01:

            mysqlbinlog bin-log.000001 --base64-output=DECODE-ROWS --verbose
             
            # at 68311
            #150720 16:09:48 server id 101  end_log_pos 68349 	GTID 1-101-433
            /*!100001 SET @@session.gtid_seq_no=433*//*!*/;
            BEGIN
            /*!*/;
            # at 68349
            # at 68398
            #150720 16:09:48 server id 101  end_log_pos 68398 	Table_map: `test`.`start_dump` mapped to number 439 (has triggers)
            #150720 16:09:48 server id 101  end_log_pos 68429 	Write_rows: table id 439 flags: STMT_END_F
            ### INSERT INTO `test`.`start_dump`
            ### SET
            ###   @1=1
            # at 68429
            #150720 16:09:48 server id 101  end_log_pos 68494 	Query	thread_id=9361	exec_time=0	error_code=0
            SET TIMESTAMP=1437401388/*!*/;
            COMMIT
            /*!*/;

            on datanode02:

            # at 4821
            #150720 16:09:48 server id 101  end_log_pos 4859 	GTID 1-101-433 trans
            /*!100001 SET @@session.gtid_seq_no=433*//*!*/;
            BEGIN
            /*!*/;
            # at 4859
            # at 4909
            #150720 16:09:48 server id 101  end_log_pos 4909 	Table_map: `test`.`start_dump` mapped to number 44 (has triggers)
            #150720 16:09:48 server id 101  end_log_pos 4941 	Write_rows: table id 44 flags: STMT_END_F
            ### INSERT INTO `test`.`start_dump`
            ### SET
            ###   @1=1
            ###   @2=0
            # at 4941
            #150720 16:09:48 server id 101  end_log_pos 4968 	Xid = 27517
            COMMIT/*!*/;

            on datanode03:

            # at 4821
            #150720 16:09:48 server id 101  end_log_pos 4859 	GTID 1-101-433 trans
            /*!100001 SET @@session.gtid_seq_no=433*//*!*/;
            BEGIN
            /*!*/;
            # at 4859
            # at 4909
            #150720 16:09:48 server id 101  end_log_pos 4909 	Table_map: `test`.`start_dump` mapped to number 48 (has triggers)
            #150720 16:09:48 server id 101  end_log_pos 4941 	Write_rows: table id 48 flags: STMT_END_F
            ### INSERT INTO `test`.`start_dump`
            ### SET
            ###   @1=1
            ###   @2=0
            # at 4941
            #150720 16:09:48 server id 101  end_log_pos 4968 	Xid = 27120
            COMMIT/*!*/;

            on datanode02:

            relay log for connection slave sqlnode01

             mysqlbinlog mysqld-relay-bin-slave@002dof@002dsqlmod01.000002 --base64-output=DECODE-ROWS --verbose
             
            # at 68596
            #150720 16:09:48 server id 101  end_log_pos 68349 	GTID 1-101-433
            /*!100001 SET @@session.gtid_seq_no=433*//*!*/;
            BEGIN
            /*!*/;
            # at 68634
            # at 68683
            #150720 16:09:48 server id 101  end_log_pos 68398 	Table_map: `test`.`start_dump` mapped to number 439 (has triggers)
            #150720 16:09:48 server id 101  end_log_pos 68429 	Write_rows: table id 439 flags: STMT_END_F
            ### INSERT INTO `test`.`start_dump`
            ### SET
            ###   @1=1
            # at 68714
            #150720 16:09:48 server id 101  end_log_pos 68494 	Query	thread_id=9361	exec_time=0	error_code=0
            SET TIMESTAMP=1437401388/*!*/;
            COMMIT
            /*!*/;

            relay log for connection slave sqlnode03

            mysqlbinlog mysqld-relay-bin-slave@002dof@002dsqlmod03.000002 --base64-output=DECODE-ROWS --verbose
             
            # at 5106
            #150720 16:09:48 server id 101  end_log_pos 4859 	GTID 1-101-433 trans
            /*!100001 SET @@session.gtid_seq_no=433*//*!*/;
            BEGIN
            /*!*/;
            # at 5144
            # at 5194
            #150720 16:09:48 server id 101  end_log_pos 4909 	Table_map: `test`.`start_dump` mapped to number 48 (has triggers)
            #150720 16:09:48 server id 101  end_log_pos 4941 	Write_rows: table id 48 flags: STMT_END_F
            ### INSERT INTO `test`.`start_dump`
            ### SET
            ###   @1=1
            ###   @2=0
            # at 5226
            #150720 16:09:48 server id 101  end_log_pos 4968 	Xid = 27120
            COMMIT/*!*/;

            mysolo Thierry Laurier added a comment - - edited it's my initial bug testing with 10.0.17 and 10.1.5 reported to mariadb support. I'm observe on randomize insert with watch command. Stop all slave/start all slave temporary resolve a error. 3 server mariadb 10.1.5 datamod01, datamod02, datamod03 root@datamod01 12:24:10 [test]> select @@global.gtid_domain_id,@@global.server_id,@@global.gtid_ignore_duplicates \G @@global.gtid_domain_id: 1 @@global.server_id: 101 @@global.gtid_ignore_duplicates: 1   root@datamod02 12:03:11 [test]> select @@global.gtid_domain_id,@@global.server_id,@@global.gtid_ignore_duplicates \G @@global.gtid_domain_id: 2 @@global.server_id: 102 @@global.gtid_ignore_duplicates: 1   root@datamod03 12:03:40 [test]> select @@global.gtid_domain_id,@@global.server_id,@@global.gtid_ignore_duplicates \G @@global.gtid_domain_id: 3 @@global.server_id: 103 @@global.gtid_ignore_duplicates: 1 root@datamod01 12:24:27 [test]> show slave hosts; +-----------+----------+------+-----------+ | Server_id | Host | Port | Master_id | +-----------+----------+------+-----------+ | 103 | sqlmod03 | 3306 | 101 | | 102 | sqlmod02 | 3306 | 101 | +-----------+----------+------+-----------+   root@datamod02 12:24:48 [test]> show slave hosts; +-----------+----------+------+-----------+ | Server_id | Host | Port | Master_id | +-----------+----------+------+-----------+ | 103 | sqlmod03 | 3306 | 102 | | 101 | sqlmod01 | 3306 | 102 | +-----------+----------+------+-----------+   root@datamod03 12:25:31 [test]> show slave hosts; +-----------+----------+------+-----------+ | Server_id | Host | Port | Master_id | +-----------+----------+------+-----------+ | 101 | sqlmod01 | 3306 | 103 | | 102 | sqlmod02 | 3306 | 103 | +-----------+----------+------+-----------+ root@datamod01 13:21:49 [test]> stop all slaves; Query OK, 0 rows affected, 2 warnings (0.01 sec)   root@datamod01 13:21:51 [test]> flush slave 'slave-of-sqlmod02'; Query OK, 0 rows affected (0.00 sec)   root@datamod01 13:22:00 [test]> flush slave 'slave-of-sqlmod03'; Query OK, 0 rows affected (0.00 sec)   root@datamod01 13:28:18 [test]> flush master; Query OK, 0 rows affected (0.00 sec)   root@datamod01 13:29:20 [test]> change master 'slave-of-sqlmod02' to master_use_gtid=slave_pos ; Query OK, 0 rows affected (0.00 sec) root@datamod01 13:30:13 [test]> change master 'slave-of-sqlmod03' to master_use_gtid=slave_pos ; Query OK, 0 rows affected (0.01 sec)   root@datamod01 13:31:52 [test]> start all slaves; Query OK, 0 rows affected, 2 warnings (0.01 sec) root@datamod02 12:25:48 [test]> stop all slaves; Query OK, 0 rows affected, 2 warnings (0.01 sec)   root@datamod02 13:22:23 [test]> flush slave 'slave-of-sqlmod01'; Query OK, 0 rows affected (0.00 sec)   root@datamod02 13:22:40 [test]> flush slave 'slave-of-sqlmod03'; Query OK, 0 rows affected (0.00 sec)   root@datamod02 13:26:42 [test]> flush master; Query OK, 0 rows affected (0.00 sec)   root@datamod02 13:28:36 [test]> change master 'slave-of-sqlmod01' to master_use_gtid=slave_pos ; Query OK, 0 rows affected (0.00 sec)   root@datamod02 13:32:30 [test]> change master 'slave-of-sqlmod03' to master_use_gtid=slave_pos ; Query OK, 0 rows affected (0.00 sec)   root@datamod02 13:32:34 [test]> start all slaves; Query OK, 0 rows affected, 2 warnings (0.01 sec) root@datamod03 12:29:05 [test]> stop all slaves; Query OK, 0 rows affected, 2 warnings (0.00 sec)   root@datamod03 13:22:25 [test]> flush slave 'slave-of-sqlmod01'; Query OK, 0 rows affected (0.00 sec)   root@datamod03 13:22:53 [test]> flush slave 'slave-of-sqlmod02'; Query OK, 0 rows affected (0.00 sec)   root@datamod03 13:26:44 [test]> flush master; Query OK, 0 rows affected (0.01 sec)   root@datamod03 13:28:37 [test]> change master 'slave-of-sqlmod01' to master_use_gtid=slave_pos ; Query OK, 0 rows affected (0.01 sec)   root@datamod03 13:33:11 [test]> change master 'slave-of-sqlmod02' to master_use_gtid=slave_pos ; Query OK, 0 rows affected (0.00 sec)   root@datamod03 13:33:16 [test]> start all slaves; Query OK, 0 rows affected, 2 warnings (0.01 sec) On all server : show create table start_dump\G Table: start_dump Create Table: CREATE TABLE `start_dump` ( `ID` tinyint(3) unsigned NOT NULL DEFAULT '1', `DUMP_OK` tinyint(1) DEFAULT '0', PRIMARY KEY (`ID`) ) ENGINE=MEMORY DEFAULT CHARSET=latin1 MAX_ROWS=1 Before test: Every 1.0s: echo hostname:datamod01; mysql -uroot -e 'show all slaves status\G' |grep -E '(Connection_name:|Master_Lo|Last_.*_Error|Running|Gtid)' Mon Jul 20 14:16:46 2015   hostname:datamod01 Connection_name: slave-of-sqlmod02 Master_Log_File: bin-log.000001 Read_Master_Log_Pos: 310 Relay_Master_Log_File: bin-log.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Exec_Master_Log_Pos: 310 Last_IO_Error: Last_SQL_Error: Using_Gtid: Slave_Pos Gtid_IO_Pos: 1-101-399,2-102-41,3-103-72 Gtid_Slave_Pos: 1-101-399,2-102-41,3-103-72 Connection_name: slave-of-sqlmod03 Master_Log_File: bin-log.000001 Read_Master_Log_Pos: 310 Relay_Master_Log_File: bin-log.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Exec_Master_Log_Pos: 310 Last_IO_Error: Last_SQL_Error: Using_Gtid: Slave_Pos Gtid_IO_Pos: 1-101-399,2-102-41,3-103-72 Gtid_Slave_Pos: 1-101-399,2-102-41,3-103-72 Every 1.0s: echo hostname:datamod02; mysql -uroot -e 'show all slaves status\G' |grep -E '(Connection_name:|Master_Lo|Last_.*_Error|Running|Gtid)' Mon Jul 20 14:17:00 2015   hostname:datamod02 Connection_name: slave-of-sqlmod01 Master_Log_File: bin-log.000001 Read_Master_Log_Pos: 310 Relay_Master_Log_File: bin-log.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Exec_Master_Log_Pos: 310 Last_IO_Error: Last_SQL_Error: Using_Gtid: Slave_Pos Gtid_IO_Pos: 1-101-399,2-102-41,3-103-72 Gtid_Slave_Pos: 1-101-399,2-102-41,3-103-72 Connection_name: slave-of-sqlmod03 Master_Log_File: bin-log.000001 Read_Master_Log_Pos: 310 Relay_Master_Log_File: bin-log.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Exec_Master_Log_Pos: 310 Last_IO_Error: Last_SQL_Error: Using_Gtid: Slave_Pos Gtid_IO_Pos: 1-101-399,2-102-41,3-103-72 Gtid_Slave_Pos: 1-101-399,2-102-41,3-103-72 Every 1.0s: echo hostname:datamod03; mysql -uroot -e 'show all slaves status\G' |grep -E '(Connection_name:|Master_Lo|Last_.*_Error|Running|Gtid)' Mon Jul 20 14:17:08 2015   hostname:datamod03 Connection_name: slave-of-sqlmod01 Master_Log_File: bin-log.000001 Read_Master_Log_Pos: 310 Relay_Master_Log_File: bin-log.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Exec_Master_Log_Pos: 310 Last_IO_Error: Last_SQL_Error: Using_Gtid: Slave_Pos Gtid_IO_Pos: 1-101-399,2-102-41,3-103-72 Gtid_Slave_Pos: 1-101-399,2-102-41,3-103-72 Connection_name: slave-of-sqlmod02 Master_Log_File: bin-log.000001 Read_Master_Log_Pos: 310 Relay_Master_Log_File: bin-log.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Exec_Master_Log_Pos: 310 Last_IO_Error: Last_SQL_Error: Using_Gtid: Slave_Pos Gtid_IO_Pos: 1-101-399,2-102-41,3-103-72 Gtid_Slave_Pos: 1-101-399,2-102-41,3-103-72 test: root@datamod01:/srv/mariadb# watch -n 1 "mysql -uroot -e 'truncate table test.start_dump ; insert into test.start_dump set DUMP_OK=1;'" after many truncate, insert: Every 1.0s: echo hostname:datamod01; mysql -uroot -e 'show all slaves status\G' |grep -E '(Connection_name:|Master_Lo|Last_.*_Error|Running|Gtid)' Mon Jul 20 16:13:25 2015 hostname:datamod01 Connection_name: slave-of-sqlmod02 Master_Log_File: bin-log.000001 Read_Master_Log_Pos: 4968 Relay_Master_Log_File: bin-log.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Exec_Master_Log_Pos: 4968 Last_IO_Error: Last_SQL_Error: Using_Gtid: Slave_Pos Gtid_IO_Pos: 1-101-433,2-102-41,3-103-72 Gtid_Slave_Pos: 1-101-433,2-102-41,3-103-72 Connection_name: slave-of-sqlmod03 Master_Log_File: bin-log.000001 Read_Master_Log_Pos: 4968 Relay_Master_Log_File: bin-log.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Exec_Master_Log_Pos: 4968 Last_IO_Error: Last_SQL_Error: Using_Gtid: Slave_Pos Gtid_IO_Pos: 1-101-433,2-102-41,3-103-72 Gtid_Slave_Pos: 1-101-433,2-102-41,3-103-72 Every 1.0s: echo hostname:datamod02; mysql -uroot -e 'show all slaves status\G' |grep -E '(Connection_name:|Master_Lo|Last_.*_Error|Running|Gtid)' Mon Jul 20 16:13:52 2015   hostname:datamod02 Connection_name: slave-of-sqlmod01 Master_Log_File: bin-log.000001 Read_Master_Log_Pos: 68494 Relay_Master_Log_File: bin-log.000001 Slave_IO_Running: Yes Slave_SQL_Running: No Exec_Master_Log_Pos: 63351 Last_IO_Error: Last_SQL_Error: Could not execute Write_rows_v1 event on table test.start_dump; Duplicate entry '1' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log bin-log.000001, end_log_pos 63469 Using_Gtid: Slave_Pos Gtid_IO_Pos: 1-101-433,2-102-41,3-103-72 Gtid_Slave_Pos: 1-101-433,2-102-41,3-103-72 Connection_name: slave-of-sqlmod03 Master_Log_File: bin-log.000001 Read_Master_Log_Pos: 4968 Relay_Master_Log_File: bin-log.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Exec_Master_Log_Pos: 4968 Last_IO_Error: Last_SQL_Error: Using_Gtid: Slave_Pos Gtid_IO_Pos: 1-101-433,2-102-41,3-103-72 Gtid_Slave_Pos: 1-101-433,2-102-41,3-103-72 Every 1.0s: echo hostname:datamod03; mysql -uroot -e 'show all slaves status\G' |grep -E '(Connection_name:|Master_Lo|Last_.*_Error|Running|Gtid)' Mon Jul 20 16:14:05 2015 hostname:datamod03 Connection_name: slave-of-sqlmod01 Master_Log_File: bin-log.000001 Read_Master_Log_Pos: 68494 Relay_Master_Log_File: bin-log.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Exec_Master_Log_Pos: 68494 Last_IO_Error: Last_SQL_Error: Using_Gtid: Slave_Pos Gtid_IO_Pos: 1-101-433,2-102-41,3-103-72 Gtid_Slave_Pos: 1-101-433,2-102-41,3-103-72 Connection_name: slave-of-sqlmod02 Master_Log_File: bin-log.000001 Read_Master_Log_Pos: 4968 Relay_Master_Log_File: bin-log.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Exec_Master_Log_Pos: 4968 Last_IO_Error: Last_SQL_Error: Using_Gtid: Slave_Pos Gtid_IO_Pos: 1-101-433,2-102-41,3-103-72 Gtid_Slave_Pos: 1-101-433,2-102-41,3-103-72 show master status: root@datamod01 16:25:22 [(none)]> show master status \G File: bin-log.000001 Position: 68494 Binlog_Do_DB: Binlog_Ignore_DB: 1 row in set (0.00 sec)   root@datamod02 16:26:13 [(none)]> show master status \G File: bin-log.000001 Position: 4968 Binlog_Do_DB: Binlog_Ignore_DB:   root@datamod03 16:26:32 [(none)]> show master status \G File: bin-log.000001 Position: 4968 Binlog_Do_DB: Binlog_Ignore_DB: root@datamod02:/srv/mariadb# tail -4 ../data/sqlmod/log/mysqld.err 2015-07-20 13:33:42 140478747219712 [Note] Master 'slave-of-sqlmod03': Slave I/O thread: connected to master 'rsqlmod@datamod03:3306',replication starts at GTID position '1-101-399,2-102-41,3-103-72' 2015-07-20 16:09:31 140478747518720 [ERROR] Master 'slave-of-sqlmod01': Slave SQL: Could not execute Write_rows_v1 event on table test.start_dump; Duplicate entry '1' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log bin-log.000001, end_log_pos 63469, Gtid 1-101-401, Internal MariaDB error code: 1062 2015-07-20 16:09:31 140478747518720 [Warning] Master 'slave-of-sqlmod01': Slave: Duplicate entry '1' for key 'PRIMARY' Error_code: 1062 2015-07-20 16:09:31 140478747518720 [ERROR] Master 'slave-of-sqlmod01': Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'bin-log.000001' position 63351; GTID position '1-101-400,2-102-41,3-103-72' on datanode01: mysqlbinlog bin-log.000001 --base64-output=DECODE-ROWS --verbose   # at 68311 #150720 16:09:48 server id 101 end_log_pos 68349 GTID 1-101-433 /*!100001 SET @@session.gtid_seq_no=433*//*!*/; BEGIN /*!*/; # at 68349 # at 68398 #150720 16:09:48 server id 101 end_log_pos 68398 Table_map: `test`.`start_dump` mapped to number 439 (has triggers) #150720 16:09:48 server id 101 end_log_pos 68429 Write_rows: table id 439 flags: STMT_END_F ### INSERT INTO `test`.`start_dump` ### SET ### @1=1 # at 68429 #150720 16:09:48 server id 101 end_log_pos 68494 Query thread_id=9361 exec_time=0 error_code=0 SET TIMESTAMP=1437401388/*!*/; COMMIT /*!*/; on datanode02: # at 4821 #150720 16:09:48 server id 101 end_log_pos 4859 GTID 1-101-433 trans /*!100001 SET @@session.gtid_seq_no=433*//*!*/; BEGIN /*!*/; # at 4859 # at 4909 #150720 16:09:48 server id 101 end_log_pos 4909 Table_map: `test`.`start_dump` mapped to number 44 (has triggers) #150720 16:09:48 server id 101 end_log_pos 4941 Write_rows: table id 44 flags: STMT_END_F ### INSERT INTO `test`.`start_dump` ### SET ### @1=1 ### @2=0 # at 4941 #150720 16:09:48 server id 101 end_log_pos 4968 Xid = 27517 COMMIT/*!*/; on datanode03: # at 4821 #150720 16:09:48 server id 101 end_log_pos 4859 GTID 1-101-433 trans /*!100001 SET @@session.gtid_seq_no=433*//*!*/; BEGIN /*!*/; # at 4859 # at 4909 #150720 16:09:48 server id 101 end_log_pos 4909 Table_map: `test`.`start_dump` mapped to number 48 (has triggers) #150720 16:09:48 server id 101 end_log_pos 4941 Write_rows: table id 48 flags: STMT_END_F ### INSERT INTO `test`.`start_dump` ### SET ### @1=1 ### @2=0 # at 4941 #150720 16:09:48 server id 101 end_log_pos 4968 Xid = 27120 COMMIT/*!*/; on datanode02: relay log for connection slave sqlnode01 mysqlbinlog mysqld-relay-bin-slave@002dof@002dsqlmod01.000002 --base64-output=DECODE-ROWS --verbose   # at 68596 #150720 16:09:48 server id 101 end_log_pos 68349 GTID 1-101-433 /*!100001 SET @@session.gtid_seq_no=433*//*!*/; BEGIN /*!*/; # at 68634 # at 68683 #150720 16:09:48 server id 101 end_log_pos 68398 Table_map: `test`.`start_dump` mapped to number 439 (has triggers) #150720 16:09:48 server id 101 end_log_pos 68429 Write_rows: table id 439 flags: STMT_END_F ### INSERT INTO `test`.`start_dump` ### SET ### @1=1 # at 68714 #150720 16:09:48 server id 101 end_log_pos 68494 Query thread_id=9361 exec_time=0 error_code=0 SET TIMESTAMP=1437401388/*!*/; COMMIT /*!*/; relay log for connection slave sqlnode03 mysqlbinlog mysqld-relay-bin-slave@002dof@002dsqlmod03.000002 --base64-output=DECODE-ROWS --verbose   # at 5106 #150720 16:09:48 server id 101 end_log_pos 4859 GTID 1-101-433 trans /*!100001 SET @@session.gtid_seq_no=433*//*!*/; BEGIN /*!*/; # at 5144 # at 5194 #150720 16:09:48 server id 101 end_log_pos 4909 Table_map: `test`.`start_dump` mapped to number 48 (has triggers) #150720 16:09:48 server id 101 end_log_pos 4941 Write_rows: table id 48 flags: STMT_END_F ### INSERT INTO `test`.`start_dump` ### SET ### @1=1 ### @2=0 # at 5226 #150720 16:09:48 server id 101 end_log_pos 4968 Xid = 27120 COMMIT/*!*/;

            Hi,

            The event 1-101-433 is not interesting in itself. We need the set of binary/relay logs up to the moment when the problem happened, which, according to the error message, is 1-101-401. But better still, just attach the complete logs, don't extract anything from them (binary logs, relay logs, error logs). They should be tiny since the problem happens soon after flush.
            Could you please provide those?

            Also, not to spoil the experiment, it's better to be more careful about flush/start slaves. Since you have the all-directions replication, when you reset and start slaves on the first server, it might start picking up something from other slaves which are not reset yet, etc.

            elenst Elena Stepanova added a comment - Hi, The event 1-101-433 is not interesting in itself. We need the set of binary/relay logs up to the moment when the problem happened, which, according to the error message, is 1-101-401. But better still, just attach the complete logs, don't extract anything from them (binary logs, relay logs, error logs). They should be tiny since the problem happens soon after flush. Could you please provide those? Also, not to spoil the experiment, it's better to be more careful about flush/start slaves. Since you have the all-directions replication, when you reset and start slaves on the first server, it might start picking up something from other slaves which are not reset yet, etc.

            GeoffMontee, thanks.
            It turns out I could not reproduce it because I was trying to use the current tree, while it has already been fixed there in the scope of MDEV-8354. At least I can reproduce it easily enough before the fix – not in 100% cases, but the probability seems to be high, – and cannot reproduce afterwards. 10.0.21 has not been released yet, but 10.1.6 has, please try it (the version mix is not required to reproduce the problem).

            Meanwhile, I'm closing it as fixed along with MDEV-8354. Please comment to re-open if you keep experiencing it after the upgrade.

            elenst Elena Stepanova added a comment - GeoffMontee , thanks. It turns out I could not reproduce it because I was trying to use the current tree, while it has already been fixed there in the scope of MDEV-8354 . At least I can reproduce it easily enough before the fix – not in 100% cases, but the probability seems to be high, – and cannot reproduce afterwards. 10.0.21 has not been released yet, but 10.1.6 has, please try it (the version mix is not required to reproduce the problem). Meanwhile, I'm closing it as fixed along with MDEV-8354 . Please comment to re-open if you keep experiencing it after the upgrade.

            People

              knielsen Kristian Nielsen
              GeoffMontee Geoff Montee (Inactive)
              Votes:
              2 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.