Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-11163

Replication filters can break multi-source replication when mysqld restarts

    XMLWordPrintable

Details

    Description

      If a MariaDB server has replication filters configured for a multi-source replication connection and then if mysqld restarts, the slave thread will start to filter all data for that slave connection.

      This problem was seen with the following replication topology:

      MariaDB 10.1.18 master ===> MaxScale 1.4.3 binlog router ===> MariaDB 10.1.18 slave

      However, I am not sure if the intermediate MaxScale master is required for this bug to occur.

      The MariaDB master has the following configuration file:

      [mariadb-10.1]
      server_id=1
      log_bin=mariadb-bin
      

      The MariaDB slave has the following configuration file:

      [mariadb-10.1]
      server_id=3
      maxscale_master.replicate_do_db="db1,db2"
      

      And MaxScale has the following configuration file:

      [maxscale]
      threads=4
       
      [Replication]
      type=service
      router=binlogrouter
      version_string=10.0.27-log
      user=repl
      passwd=password
      router_options=server_id=2,binlogdir=/var/log/binlogs,mariadb10-compatibility=1
       
      [Replication Listener]
      type=listener
      service=Replication
      protocol=MySQLClient
      port=3306
      

      I executed the following to set up replication on the MaxScale instance:

      $ mysql -u repl -ppassword -h 127.0.0.1
      Welcome to the MariaDB monitor.  Commands end with ; or \g.
      Your MySQL connection id is 9422
      Server version: 10.0.27-log
       
      Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
       
      Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
       
      MySQL [(none)]> CHANGE MASTER TO Master_host='172.30.0.32', Master_log_file='mariadb-bin.000001', Master_user='repl', Master_password='password';
      Query OK, 0 rows affected (0.00 sec)
       
      MySQL [(none)]> START SLAVE;
      Query OK, 0 rows affected (0.01 sec)
       
      MySQL [(none)]> SHOW SLAVE STATUS\G
      *************************** 1. row ***************************
                     Slave_IO_State: Binlog Dump
                        Master_Host: 172.30.0.32
                        Master_User: repl
                        Master_Port: 3306
                      Connect_Retry: 60
                    Master_Log_File: mariadb-bin.000001
                Read_Master_Log_Pos: 494
                     Relay_Log_File: mariadb-bin.000001
                      Relay_Log_Pos: 494
              Relay_Master_Log_File: mariadb-bin.000001
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                    Replicate_Do_DB:
                Replicate_Ignore_DB:
                 Replicate_Do_Table:
             Replicate_Ignore_Table:
            Replicate_Wild_Do_Table:
        Replicate_Wild_Ignore_Table:
                         Last_Errno: 0
                         Last_Error:
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 494
                    Relay_Log_Space: 494
                    Until_Condition: None
                     Until_Log_File:
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File:
                 Master_SSL_CA_Path:
                    Master_SSL_Cert:
                  Master_SSL_Cipher:
                     Master_SSL_Key:
              Seconds_Behind_Master: 0
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error:
                     Last_SQL_Errno: 0
                     Last_SQL_Error:
        Replicate_Ignore_Server_Ids:
                   Master_Server_Id: 1
                        Master_UUID: 924649fb-9c77-11e6-8949-022c8e9e146b
                   Master_Info_File:
                          SQL_Delay: 0
                SQL_Remaining_Delay: NULL
            Slave_SQL_Running_State: Slave running
                 Master_Retry_Count: 1000
                        Master_Bind:
            Last_IO_Error_TimeStamp:
           Last_SQL_Error_Timestamp:
                     Master_SSL_Crl:
                 Master_SSL_Crlpath:
                 Retrieved_Gtid_Set:
                  Executed_Gtid_Set:
                      Auto_Position:
      1 row in set (0.00 sec)
      

      And I executed the following commands separately on the MariaDB master and slave:

      MariaDB [(none)]> CREATE DATABASE db1;
      Query OK, 1 row affected (0.00 sec)
       
      MariaDB [(none)]> CREATE DATABASE db2;
      Query OK, 1 row affected (0.00 sec)
       
      MariaDB [(none)]> USE db1;
      Database changed
      MariaDB [db1]> CREATE TABLE tab (id int primary key, str varchar(50));
      Query OK, 0 rows affected (0.01 sec)
      

      Then I told the slave to start replicating from MaxScale in the following way:

      MariaDB [db1]> CHANGE MASTER 'maxscale_master' TO Master_host='172.30.0.106', Master_log_file='mariadb-bin.000001', Master_log_pos=1165, Master_user='repl', Master_password='password';
      Query OK, 0 rows affected (0.01 sec)
       
      MariaDB [db1]> SET GLOBAL maxscale_master.replicate_do_db="db1,db2";
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [db1]> START SLAVE 'maxscale_master';
      Query OK, 0 rows affected (0.01 sec)
       
      MariaDB [db1]> SHOW SLAVE 'maxscale_master' STATUS\G
      *************************** 1. row ***************************
                     Slave_IO_State: Waiting for master to send event
                        Master_Host: 172.30.0.106
                        Master_User: repl
                        Master_Port: 3306
                      Connect_Retry: 60
                    Master_Log_File: mariadb-bin.000001
                Read_Master_Log_Pos: 1165
                     Relay_Log_File: ip-172-30-0-46-relay-bin-maxscale_master.000002
                      Relay_Log_Pos: 539
              Relay_Master_Log_File: mariadb-bin.000001
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                    Replicate_Do_DB: db1,db2
                Replicate_Ignore_DB:
                 Replicate_Do_Table:
             Replicate_Ignore_Table:
            Replicate_Wild_Do_Table:
        Replicate_Wild_Ignore_Table:
                         Last_Errno: 0
                         Last_Error:
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 1165
                    Relay_Log_Space: 862
                    Until_Condition: None
                     Until_Log_File:
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File:
                 Master_SSL_CA_Path:
                    Master_SSL_Cert:
                  Master_SSL_Cipher:
                     Master_SSL_Key:
              Seconds_Behind_Master: 0
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error:
                     Last_SQL_Errno: 0
                     Last_SQL_Error:
        Replicate_Ignore_Server_Ids:
                   Master_Server_Id: 1
                     Master_SSL_Crl:
                 Master_SSL_Crlpath:
                         Using_Gtid: No
                        Gtid_IO_Pos:
            Replicate_Do_Domain_Ids:
        Replicate_Ignore_Domain_Ids:
                      Parallel_Mode: conservative
      1 row in set (0.00 sec)
      

      To test it, I inserted some data on the master:

      MariaDB [(none)]> USE db1;
      Reading table information for completion of table and column names
      You can turn off this feature to get a quicker startup with -A
       
      Database changed
      MariaDB [db1]> INSERT INTO tab VALUES (1, 'str1');
      Query OK, 1 row affected (0.00 sec)
      

      And I confirmed that the new data exists on the slave:

      MariaDB [db1]> SELECT * FROM tab;
      +----+------+
      | id | str  |
      +----+------+
      |  1 | str1 |
      +----+------+
      1 row in set (0.00 sec)
       
      MariaDB [db1]> SHOW SLAVE 'maxscale_master' STATUS\G
      *************************** 1. row ***************************
                     Slave_IO_State: Waiting for master to send event
                        Master_Host: 172.30.0.106
                        Master_User: repl
                        Master_Port: 3306
                      Connect_Retry: 60
                    Master_Log_File: mariadb-bin.000001
                Read_Master_Log_Pos: 1326
                     Relay_Log_File: ip-172-30-0-46-relay-bin-maxscale_master.000002
                      Relay_Log_Pos: 700
              Relay_Master_Log_File: mariadb-bin.000001
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                    Replicate_Do_DB: db1,db2
                Replicate_Ignore_DB:
                 Replicate_Do_Table:
             Replicate_Ignore_Table:
            Replicate_Wild_Do_Table:
        Replicate_Wild_Ignore_Table:
                         Last_Errno: 0
                         Last_Error:
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 1326
                    Relay_Log_Space: 1023
                    Until_Condition: None
                     Until_Log_File:
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File:
                 Master_SSL_CA_Path:
                    Master_SSL_Cert:
                  Master_SSL_Cipher:
                     Master_SSL_Key:
              Seconds_Behind_Master: 0
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error:
                     Last_SQL_Errno: 0
                     Last_SQL_Error:
        Replicate_Ignore_Server_Ids:
                   Master_Server_Id: 1
                     Master_SSL_Crl:
                 Master_SSL_Crlpath:
                         Using_Gtid: No
                        Gtid_IO_Pos:
            Replicate_Do_Domain_Ids:
        Replicate_Ignore_Domain_Ids:
                      Parallel_Mode: conservative
      1 row in set (0.00 sec)
      

      Now let's try restarting mysqld on the slave:

      sudo systemctl restart mariadb
      

      When the slave is back up, replication appears to be working:

      MariaDB [(none)]> SHOW SLAVE 'maxscale_master' STATUS\G
      *************************** 1. row ***************************
                     Slave_IO_State: Waiting for master to send event
                        Master_Host: 172.30.0.106
                        Master_User: repl
                        Master_Port: 3306
                      Connect_Retry: 60
                    Master_Log_File: mariadb-bin.000001
                Read_Master_Log_Pos: 1326
                     Relay_Log_File: ip-172-30-0-46-relay-bin-maxscale_master.000004
                      Relay_Log_Pos: 539
              Relay_Master_Log_File: mariadb-bin.000001
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                    Replicate_Do_DB: db1,db2
                Replicate_Ignore_DB:
                 Replicate_Do_Table:
             Replicate_Ignore_Table:
            Replicate_Wild_Do_Table:
        Replicate_Wild_Ignore_Table:
                         Last_Errno: 0
                         Last_Error:
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 1326
                    Relay_Log_Space: 862
                    Until_Condition: None
                     Until_Log_File:
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File:
                 Master_SSL_CA_Path:
                    Master_SSL_Cert:
                  Master_SSL_Cipher:
                     Master_SSL_Key:
              Seconds_Behind_Master: 0
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error:
                     Last_SQL_Errno: 0
                     Last_SQL_Error:
        Replicate_Ignore_Server_Ids:
                   Master_Server_Id: 1
                     Master_SSL_Crl:
                 Master_SSL_Crlpath:
                         Using_Gtid: No
                        Gtid_IO_Pos:
            Replicate_Do_Domain_Ids:
        Replicate_Ignore_Domain_Ids:
                      Parallel_Mode: conservative
      1 row in set (0.00 sec)
      

      But let's try inserting more data into the master to test:

      MariaDB [db1]> INSERT INTO tab VALUES (2, 'str2');
      Query OK, 1 row affected (0.00 sec)
      

      Does this new row exist on the slave?

      MariaDB [db1]> SELECT * FROM tab;
      +----+------+
      | id | str  |
      +----+------+
      |  1 | str1 |
      +----+------+
      1 row in set (0.00 sec)
      

      It does not! But supposedly, replication is still working fine on the slave. We can even see that (Relay_Master_Log_File, Exec_Master_Log_Pos) have changed:

      MariaDB [db1]> SHOW SLAVE 'maxscale_master' STATUS\G
      *************************** 1. row ***************************
                     Slave_IO_State: Waiting for master to send event
                        Master_Host: 172.30.0.106
                        Master_User: repl
                        Master_Port: 3306
                      Connect_Retry: 60
                    Master_Log_File: mariadb-bin.000001
                Read_Master_Log_Pos: 1487
                     Relay_Log_File: ip-172-30-0-46-relay-bin-maxscale_master.000004
                      Relay_Log_Pos: 700
              Relay_Master_Log_File: mariadb-bin.000001
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                    Replicate_Do_DB: db1,db2
                Replicate_Ignore_DB:
                 Replicate_Do_Table:
             Replicate_Ignore_Table:
            Replicate_Wild_Do_Table:
        Replicate_Wild_Ignore_Table:
                         Last_Errno: 0
                         Last_Error:
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 1487
                    Relay_Log_Space: 1023
                    Until_Condition: None
                     Until_Log_File:
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File:
                 Master_SSL_CA_Path:
                    Master_SSL_Cert:
                  Master_SSL_Cipher:
                     Master_SSL_Key:
              Seconds_Behind_Master: 0
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error:
                     Last_SQL_Errno: 0
                     Last_SQL_Error:
        Replicate_Ignore_Server_Ids:
                   Master_Server_Id: 1
                     Master_SSL_Crl:
                 Master_SSL_Crlpath:
                         Using_Gtid: No
                        Gtid_IO_Pos:
            Replicate_Do_Domain_Ids:
        Replicate_Ignore_Domain_Ids:
                      Parallel_Mode: conservative
      1 row in set (0.00 sec)
      

      And we can see that the new transaction is present in the slave's relay log:

      $ sudo mysqlbinlog --verbose /var/lib/mysql/ip-172-30-0-46-relay-bin-maxscale_master.000004 --start-position=539
      /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
      /*!40019 SET @@session.max_insert_delayed_threads=0*/;
      /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
      DELIMITER /*!*/;
      # at 4
      #161027 15:42:20 server id 3  end_log_pos 249   Start: binlog v 4, server v 10.1.18-MariaDB created 161027 15:42:20
      BINLOG '
      nFgSWA8DAAAA9QAAAPkAAAAAAAQAMTAuMS4xOC1NYXJpYURCAAAAAAAAAAAAAAAAAAAAAAAAAAAA
      AAAAAAAAAAAAAAAAAAAAAAAAEzgNAAgAEgAEBAQEEgAA3QAEGggAAAAICAgCAAAACgoKAAAAAAAA
      AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
      AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
      AAAAAAAAAAAEEwQAAPeDzIg=
      '/*!*/;
      # at 294
      #161027 15:42:20 server id 1  end_log_pos 0     Start: binlog v 4, server v 10.1.18-MariaDB created 161027 15:42:20 at startup
      ROLLBACK/*!*/;
      BINLOG '
      nFgSWA8BAAAA9QAAAAAAAAAAAAQAMTAuMS4xOC1NYXJpYURCAGxvZwAAAAAAAAAAAAAAAAAAAAAA
      AAAAAAAAAAAAAAAAAABKThJYEzgNAAgAEgAEBAQEEgAA3QAEGggAAAAICAgCAAAACgoKAAAAAAAA
      AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
      AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
      AAAAAAAAAAAEEwQAAEEOVFU=
      '/*!*/;
      # at 539
      #161027 15:43:10 server id 1  end_log_pos 1364  GTID 0-1-8 trans
      /*!100101 SET @@session.skip_parallel_replication=0*//*!*/;
      /*!100001 SET @@session.gtid_domain_id=0*//*!*/;
      /*!100001 SET @@session.server_id=1*//*!*/;
      /*!100001 SET @@session.gtid_seq_no=8*//*!*/;
      BEGIN
      /*!*/;
      # at 577
      #161027 15:43:10 server id 1  end_log_pos 1460  Query   thread_id=9     exec_time=0     error_code=0
      use `db1`/*!*/;
      SET TIMESTAMP=1477597390/*!*/;
      SET @@session.pseudo_thread_id=9/*!*/;
      SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
      SET @@session.sql_mode=1342177280/*!*/;
      SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
      /*!\C utf8 *//*!*/;
      SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
      SET @@session.lc_time_names=0/*!*/;
      SET @@session.collation_database=DEFAULT/*!*/;
      INSERT INTO tab VALUES (2, 'str2')
      /*!*/;
      # at 673
      #161027 15:43:10 server id 1  end_log_pos 1487  Xid = 51
      COMMIT/*!*/;
      DELIMITER ;
      # End of log file
      ROLLBACK /* added by mysqlbinlog */;
      /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
      /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
      

      To get replication working again, we have to do the following process on the slave:

      MariaDB [(none)]> STOP SLAVE 'maxscale_master';
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [(none)]> RESET SLAVE 'maxscale_master';
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [(none)]> CHANGE MASTER 'maxscale_master' TO Master_host='172.30.0.106', Master_log_file='mariadb-bin.000001', Master_log_pos=1326, Master_user='repl', Master_password='password';
      Query OK, 0 rows affected (0.01 sec)
       
      MariaDB [(none)]> SET GLOBAL maxscale_master.replicate_do_db="db1,db2";
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [(none)]> START SLAVE 'maxscale_master';
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [(none)]> SHOW SLAVE 'maxscale_master' STATUS\G
      *************************** 1. row ***************************
                     Slave_IO_State: Waiting for master to send event
                        Master_Host: 172.30.0.106
                        Master_User: repl
                        Master_Port: 3306
                      Connect_Retry: 60
                    Master_Log_File: mariadb-bin.000001
                Read_Master_Log_Pos: 1487
                     Relay_Log_File: ip-172-30-0-46-relay-bin-maxscale_master.000002
                      Relay_Log_Pos: 700
              Relay_Master_Log_File: mariadb-bin.000001
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                    Replicate_Do_DB: db1,db2
                Replicate_Ignore_DB:
                 Replicate_Do_Table:
             Replicate_Ignore_Table:
            Replicate_Wild_Do_Table:
        Replicate_Wild_Ignore_Table:
                         Last_Errno: 0
                         Last_Error:
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 1487
                    Relay_Log_Space: 1023
                    Until_Condition: None
                     Until_Log_File:
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File:
                 Master_SSL_CA_Path:
                    Master_SSL_Cert:
                  Master_SSL_Cipher:
                     Master_SSL_Key:
              Seconds_Behind_Master: 0
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error:
                     Last_SQL_Errno: 0
                     Last_SQL_Error:
        Replicate_Ignore_Server_Ids:
                   Master_Server_Id: 1
                     Master_SSL_Crl:
                 Master_SSL_Crlpath:
                         Using_Gtid: No
                        Gtid_IO_Pos:
            Replicate_Do_Domain_Ids:
        Replicate_Ignore_Domain_Ids:
                      Parallel_Mode: conservative
      1 row in set (0.00 sec)
      

      Does the missing row exist on the slave now?:

      MariaDB [(none)]> USE db1;
      Reading table information for completion of table and column names
      You can turn off this feature to get a quicker startup with -A
       
      Database changed
      MariaDB [db1]> SELECT * FROM tab;
      +----+------+
      | id | str  |
      +----+------+
      |  1 | str1 |
      |  2 | str2 |
      +----+------+
      2 rows in set (0.00 sec)
      

      Yes, it does.

      Attachments

        Activity

          People

            Unassigned Unassigned
            GeoffMontee Geoff Montee (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.