[MDEV-23970] sql_slave_skip_counter only skip the first event when it is configured with the value bigger than 1 Created: 2020-10-16  Updated: 2020-10-26  Resolved: 2020-10-26

Status: Closed
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.4.13, 10.4.15
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: Phong Dinh Assignee: Unassigned
Resolution: Not a Bug Votes: 0
Labels: gtid, replication, sql_slave_skip_counter
Environment:

Centos 7



 Description   

Hi,

sql_slave_skip_counter only skips the first event in the slave when it is configured with a value higher than 1

Tested with sql_slave_skip_counter=2 and sql_slave_skip_counter=3 on MariaDB 10.4.15 and 10.4.13
Test case 1 stop replication after insert data to the master
1. Set up a master-slave async replication
2. Create a simple table on the master
create table t3 (id int primary key, tcol01 varchar(10));
3. Insert data to the slave

MariaDB [(none)]> insert into db1.t3 values(101,'test1');
Query OK, 1 row affected (0.003 sec)
 
MariaDB [(none)]> select * from db1.t3;
+------+-------+
| col1 | col2  |
+------+-------+
|  101 | test1 |
+------+-------+
1 row in set (0.000 sec)

4. Insert data to master

MariaDB [(none)]> begin;
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [(none)]> insert into db1.t3 values(101,'test1');
Query OK, 1 row affected (0.000 sec)
 
MariaDB [(none)]> insert into db1.t3 values(102,'test2');
Query OK, 1 row affected (0.000 sec)
 
MariaDB [(none)]> commit;
Query OK, 0 rows affected (0.002 sec)
 
MariaDB [(none)]> begin;
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [(none)]> insert into db1.t3 values(103,'test3');
Query OK, 1 row affected (0.000 sec)
 
MariaDB [(none)]> insert into db1.t3 values(104,'test4');
Query OK, 1 row affected (0.000 sec)
 
MariaDB [(none)]> commit;
Query OK, 0 rows affected (0.003 sec)
MariaDB [(none)]> begin;
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [(none)]> insert into db1.t3 values(105,'test5');
Query OK, 1 row affected (0.000 sec)
 
MariaDB [(none)]> insert into db1.t3 values(106,'test6');
Query OK, 1 row affected (0.000 sec)
 
MariaDB [(none)]> commit;
Query OK, 0 rows affected (0.003 sec)

5. Replication is crashed on slave

MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 10.1.10.3
                   Master_User: repl
                   Master_Port: 13306
                 Connect_Retry: 2
               Master_Log_File: mariadb-bin.000001
           Read_Master_Log_Pos: 3131
                Relay_Log_File: relay-bin.000002
                 Relay_Log_Pos: 1626
         Relay_Master_Log_File: mariadb-bin.000001
              Slave_IO_Running: Yes
             Slave_SQL_Running: No
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 1062
                    Last_Error: Could not execute Write_rows_v1 event on table db1.t3; Duplicate entry '101' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mariadb-bin.000001, end_log_pos 2194
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 2000
               Relay_Log_Space: 3060
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: NULL
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 1062
                Last_SQL_Error: Could not execute Write_rows_v1 event on table db1.t3; Duplicate entry '101' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mariadb-bin.000001, end_log_pos 2194
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 102
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-102-13
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: conservative
                     SQL_Delay: 0
           SQL_Remaining_Delay: NULL
       Slave_SQL_Running_State:
              Slave_DDL_Groups: 6
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 1
1 row in set (0.000 sec)
 
MariaDB [(none)]> select * from db1.t3;
+------+-------+
| col1 | col2  |
+------+-------+
|  101 | test1 |
+------+-------+

6. Configure sql_slave_skip_counter=2

MariaDB [(none)]> stop slave;
Query OK, 0 rows affected (0.002 sec)
 
MariaDB [(none)]> SET GLOBAL sql_slave_skip_counter = 2;
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.005 sec)

7. Verify how many events are skipped.

MariaDB [(none)]> select * from db1.t3;
+------+-------+
| col1 | col2  |
+------+-------+
|  101 | test1 |
|  103 | test3 |
|  104 | test4 |
|  105 | test5 |
|  106 | test6 |
+------+-------+
5 rows in set (0.000 sec)
 
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 10.1.10.3
                   Master_User: repl
                   Master_Port: 13306
                 Connect_Retry: 2
               Master_Log_File: mariadb-bin.000001
           Read_Master_Log_Pos: 3131
                Relay_Log_File: relay-bin.000002
                 Relay_Log_Pos: 1805
         Relay_Master_Log_File: mariadb-bin.000001
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 3131
               Relay_Log_Space: 2108
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: 0
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 102
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-102-13
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: conservative
                     SQL_Delay: 0
           SQL_Remaining_Delay: NULL
       Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
              Slave_DDL_Groups: 6
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 3
1 row in set (0.000 sec)

Test case 2 (stop replication before insert data to the master)

1. Set up a master-slave async replication
2. Create a simple table on the master
create table tbtest (id int primary key, tcol01 varchar(10));
3. On the slave, configured sql_slave_skip_counter=3

slave1 [localhost:22917] {msandbox} ((none)) > stop slave;
Query OK, 0 rows affected (0.001 sec)
 
slave1 [localhost:22917] {msandbox} ((none)) > SET GLOBAL sql_slave_skip_counter = 3;
Query OK, 0 rows affected (0.000 sec)

4. Insert data

master [localhost:22916] {msandbox} ((none)) > insert into dbtest.tbtest values(1,'test1');
Query OK, 1 row affected (0.002 sec)
master [localhost:22916] {msandbox} ((none)) > insert into dbtest.tbtest values(2,'test2');
Query OK, 1 row affected (0.001 sec)
master [localhost:22916] {msandbox} ((none)) > insert into dbtest.tbtest values(3,'test3');
Query OK, 1 row affected (0.001 sec)
master [localhost:22916] {msandbox} ((none)) > insert into dbtest.tbtest values(4,'test4');
Query OK, 1 row affected (0.002 sec)
master [localhost:22916] {msandbox} ((none)) > select * from dbtest.tbtest;
+----+--------+
| id | tcol01 |
+----+--------+
|  1 | test1  |
|  2 | test2  |
|  3 | test3  |
|  4 | test4  |
+----+--------+
4 rows in set (0.001 sec)

5. Start replication on the slave, check how many events are skipped

slave1 [localhost:22917] {msandbox} ((none)) > start slave;
Query OK, 0 rows affected (0.004 sec)
slave1 [localhost:22917] {msandbox} ((none)) > select * from dbtest.tbtest;
+----+--------+
| id | tcol01 |
+----+--------+
|  2 | test2  |
|  3 | test3  |
|  4 | test4  |
+----+--------+
3 rows in set (0.000 sec)
slave1 [localhost:22917] {msandbox} ((none)) > show slave status\G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 127.0.0.1
                   Master_User: rsandbox
                   Master_Port: 22916
                 Connect_Retry: 60
               Master_Log_File: mysql-bin.000001
           Read_Master_Log_Pos: 7123
                Relay_Log_File: mysql-relay.000002
                 Relay_Log_Pos: 1386
         Relay_Master_Log_File: mysql-bin.000001
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 7123
               Relay_Log_Space: 1691
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: 0
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 100
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-100-34
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: conservative
                     SQL_Delay: 0
           SQL_Remaining_Delay: NULL
       Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
              Slave_DDL_Groups: 1
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 11
1 row in set (0.001 sec)

6. Verify binlog and relay log of slave

slave1 [localhost:22917] {msandbox} ((none)) > SHOW RELAYLOG EVENTS IN 'mysql-relay.000002';
+--------------------+------+-------------------+-----------+-------------+------------------------------------------------+
| Log_name           | Pos  | Event_type        | Server_id | End_log_pos | Info                                           |
+--------------------+------+-------------------+-----------+-------------+------------------------------------------------+
| mysql-relay.000002 |    4 | Format_desc       |       200 |         256 | Server ver: 10.4.15-MariaDB-log, Binlog ver: 4 |
| mysql-relay.000002 |  256 | Rotate            |       100 |           0 | mysql-bin.000001;pos=4                         |
| mysql-relay.000002 |  303 | Format_desc       |       100 |         256 | Server ver: 10.4.15-MariaDB-log, Binlog ver: 4 |
| mysql-relay.000002 |  555 | Gtid_list         |       100 |         285 | []                                             |
| mysql-relay.000002 |  584 | Binlog_checkpoint |       100 |         328 | mysql-bin.000001                               |
| mysql-relay.000002 |  627 | Gtid_list         |       100 |        6407 | [0-100-30]                                     |
| mysql-relay.000002 |  670 | Gtid              |       100 |        6449 | BEGIN GTID 0-100-31                            |
| mysql-relay.000002 |  712 | Query             |       100 |        6555 | insert into dbtest.tbtest values(1,'test1')    |
| mysql-relay.000002 |  818 | Xid               |       100 |        6586 | COMMIT /* xid=193 */                           |
| mysql-relay.000002 |  849 | Gtid              |       100 |        6628 | BEGIN GTID 0-100-32                            |
| mysql-relay.000002 |  891 | Query             |       100 |        6734 | insert into dbtest.tbtest values(2,'test2')    |
| mysql-relay.000002 |  997 | Xid               |       100 |        6765 | COMMIT /* xid=194 */                           |
| mysql-relay.000002 | 1028 | Gtid              |       100 |        6807 | BEGIN GTID 0-100-33                            |
| mysql-relay.000002 | 1070 | Query             |       100 |        6913 | insert into dbtest.tbtest values(3,'test3')    |
| mysql-relay.000002 | 1176 | Xid               |       100 |        6944 | COMMIT /* xid=195 */                           |
| mysql-relay.000002 | 1207 | Gtid              |       100 |        6986 | BEGIN GTID 0-100-34                            |
| mysql-relay.000002 | 1249 | Query             |       100 |        7092 | insert into dbtest.tbtest values(4,'test4')    |
| mysql-relay.000002 | 1355 | Xid               |       100 |        7123 | COMMIT /* xid=197 */                           |
+--------------------+------+-------------------+-----------+-------------+------------------------------------------------+
18 rows in set (0.000 sec)

| mysql-bin.000002 | 1956 | Gtid              |       100 |        1998 | BEGIN GTID 0-100-32                                          |
| mysql-bin.000002 | 1998 | Query             |       100 |        2104 | insert into dbtest.tbtest values(2,'test2')                  |
| mysql-bin.000002 | 2104 | Xid               |       100 |        2135 | COMMIT /* xid=69 */                                          |
| mysql-bin.000002 | 2135 | Gtid              |       100 |        2177 | BEGIN GTID 0-100-33                                          |
| mysql-bin.000002 | 2177 | Query             |       100 |        2283 | insert into dbtest.tbtest values(3,'test3')                  |
| mysql-bin.000002 | 2283 | Xid               |       100 |        2314 | COMMIT /* xid=71 */                                          |
| mysql-bin.000002 | 2314 | Gtid              |       100 |        2356 | BEGIN GTID 0-100-34                                          |
| mysql-bin.000002 | 2356 | Query             |       100 |        2462 | insert into dbtest.tbtest values(4,'test4')                  |
| mysql-bin.000002 | 2462 | Xid               |       100 |        2493 | COMMIT /* xid=73 */



 Comments   
Comment by Alice Sherepa [ 2020-10-16 ]

From the documentation: sql_slave_skip_counter skips the next N events from the master.
When you use SET GLOBAL sql_slave_skip_counter to skip events and the result is in the middle of a group, the replica continues to skip events until it reaches the end of the group. Execution then starts with the next event group.

then in the first case:

+--------------------+------+-------------------+-----------+-------------+------------------------------------------------------------------------------------+
| Log_name           | Pos  | Event_type        | Server_id | End_log_pos | Info                                                                               |
+--------------------+------+-------------------+-----------+-------------+------------------------------------------------------------------------------------+
 
| p-relay-bin.000002 |  949 | Gtid              |         2 |         692 | GTID 0-2-3                                                                         |
| p-relay-bin.000002 |  991 | Query             |         2 |         829 | use `test`; create table t3 (id int primary key, tcol01 varchar(10)) engine=innodb |
| p-relay-bin.000002 | 1128 | Gtid              |         2 |         871 | BEGIN GTID 0-2-4                                                                   |
| p-relay-bin.000002 | 1170 | Query             |         2 |         972 | use `test`; insert into t3 values(101,'test1')                                     |
| p-relay-bin.000002 | 1271 | Query             |         2 |        1073 | use `test`; insert into t3 values(102,'test2')                                     |
| p-relay-bin.000002 | 1372 | Xid               |         2 |        1104 | COMMIT /* xid=21 */                                                                |
| p-relay-bin.000002 | 1403 | Gtid              |         2 |        1146 | BEGIN GTID 0-2-5                                                                   |
| p-relay-bin.000002 | 1445 | Query             |         2 |        1247 | use `test`; insert into t3 values(103,'test3')                                     |
| p-relay-bin.000002 | 1546 | Query             |         2 |        1348 | use `test`; insert into t3 values(104,'test4')                                     |
| p-relay-bin.000002 | 1647 | Xid               |         2 |        1379 | COMMIT /* xid=25 */                                                                |
| p-relay-bin.000002 | 1678 | Gtid              |         2 |        1421 | BEGIN GTID 0-2-6                                                                   |
| p-relay-bin.000002 | 1720 | Query             |         2 |        1522 | use `test`; insert into t3 values(105,'test5')                                     |
| p-relay-bin.000002 | 1821 | Query             |         2 |        1623 | use `test`; insert into t3 values(106,'test6')                                     |
| p-relay-bin.000002 | 1922 | Xid               |         2 |        1654 | COMMIT /* xid=29 */                                                                |

SET GLOBAL sql_slave_skip_counter = 2; --so skip pos. 1128 + 1170 -> pos. 1271 in the middle of transaction -
so start from 1403 – so we get values 103,104,105,106 in t3

The same for the 2nd case - it skips 3 events -
1)BEGIN GTID 0-100-31
2)insert into dbtest.tbtest values(1,'test1')
3)COMMIT /* xid=193 */
and results with values id 2,3,4

Generated at Thu Feb 08 09:26:28 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.