[MDEV-4906] stop slave; start slave; increments the GTId on a slave Created: 2013-08-16  Updated: 2015-11-04  Resolved: 2013-08-20

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.4
Fix Version/s: 10.0.5

Type: Bug Priority: Major
Reporter: Michael Pattison Assignee: Kristian Nielsen
Resolution: Fixed Votes: 1
Labels: gtid
Environment:

Linux pa1-lda-001.dev.tradingscreen.com 2.6.32-220.el6.x86_64 #1 SMP Tue Dec 6 19:48:22 GMT 2011 x86_64 x86_64 x86_64 GNU/Linux

--------------------

@@version

--------------------

10.0.3-MariaDB-log

--------------------



 Description   

With a simple test table, shown here on the slave:

mysql 15:39:00 [PA DEV 002] [eis] [3202]> select * from mfp;
+------+
| c1   |
+------+
|   11 |
|   22 |
|   33 |
|   44 |
|   55 |
+------+
5 rows in set (0.00 sec)
 
mysql 15:39:02 [PA DEV 002] [eis] [3202]> delete from mfp;
Query OK, 5 rows affected (0.00 sec)

I then also deleted all these rows on the master simply to explore the options for repairing replication using

SET GLOBAL gtid_slave_pos = 'XXX'

However, simply stopping and starting the slave ( sometimes twice ) skipped the transactions with no other actions required:

mysql 15:39:09 [PA DEV 002] [eis] [3202]> show variables like "%gtid%";
+------------------+------------------+
| Variable_name    | Value            |
+------------------+------------------+
| gtid_binlog_pos  | 3102-223202-1412 |
| gtid_current_pos | 3102-223202-1412 |
| gtid_domain_id   | 3102             |
| gtid_seq_no      | 0                |
| gtid_slave_pos   | 3102-223102-409  |
| gtid_strict_mode | OFF              |
+------------------+------------------+
6 rows in set (0.00 sec)
 
mysql 15:40:24 [PA DEV 002] [eis] [3202]> stop slave; start slave;
Query OK, 0 rows affected (0.00 sec)
 
Query OK, 0 rows affected (0.01 sec)
 
mysql 15:45:21 [PA DEV 002] [eis] [3202]> show variables like "%gtid%";
+------------------+------------------+
| Variable_name    | Value            |
+------------------+------------------+
| gtid_binlog_pos  | 3102-223202-1413 |
| gtid_current_pos | 3102-223202-1413 |
| gtid_domain_id   | 3102             |
| gtid_seq_no      | 0                |
| gtid_slave_pos   | 3102-223102-410  |
| gtid_strict_mode | OFF              |
+------------------+------------------+
6 rows in set (0.00 sec)
 
mysql 15:45:24 [PA DEV 002] [eis] [3202]> stop slave; start slave;
Query OK, 0 rows affected (0.00 sec)
 
Query OK, 0 rows affected (0.00 sec)
 
mysql 15:45:28 [PA DEV 002] [eis] [3202]> show variables like "%gtid%";
+------------------+------------------+
| Variable_name    | Value            |
+------------------+------------------+
| gtid_binlog_pos  | 3102-223202-1413 |
| gtid_current_pos | 3102-223202-1413 |
| gtid_domain_id   | 3102             |
| gtid_seq_no      | 0                |
| gtid_slave_pos   | 3102-223102-410  |
| gtid_strict_mode | OFF              |
+------------------+------------------+
6 rows in set (0.00 sec)
 
mysql 15:45:30 [PA DEV 002] [eis] [3202]> stop slave; start slave;
Query OK, 0 rows affected (0.00 sec)
 
Query OK, 0 rows affected (0.01 sec)
 
mysql 15:45:32 [PA DEV 002] [eis] [3202]> show variables like "%gtid%";
+------------------+------------------+
| Variable_name    | Value            |
+------------------+------------------+
| gtid_binlog_pos  | 3102-223202-1413 |
| gtid_current_pos | 3102-223202-1413 |
| gtid_domain_id   | 3102             |
| gtid_seq_no      | 0                |
| gtid_slave_pos   | 3102-223102-411  |
| gtid_strict_mode | OFF              |
+------------------+------------------+
6 rows in set (0.00 sec)
 
mysql 15:45:34 [PA DEV 002] [eis] [3202]> stop slave; start slave;
Query OK, 0 rows affected (0.00 sec)
 
Query OK, 0 rows affected (0.01 sec)
 
mysql 15:45:36 [PA DEV 002] [eis] [3202]> show variables like "%gtid%";
+------------------+------------------+
| Variable_name    | Value            |
+------------------+------------------+
| gtid_binlog_pos  | 3102-223202-1413 |
| gtid_current_pos | 3102-223202-1413 |
| gtid_domain_id   | 3102             |
| gtid_seq_no      | 0                |
| gtid_slave_pos   | 3102-223102-411  |
| gtid_strict_mode | OFF              |
+------------------+------------------+
6 rows in set (0.00 sec)
 
mysql 15:45:37 [PA DEV 002] [eis] [3202]> stop slave; start slave;
Query OK, 0 rows affected (0.00 sec)
 
Query OK, 0 rows affected (0.01 sec)
 
mysql 15:45:39 [PA DEV 002] [eis] [3202]> show variables like "%gtid%";
+------------------+------------------+
| Variable_name    | Value            |
+------------------+------------------+
| gtid_binlog_pos  | 3102-223202-1413 |
| gtid_current_pos | 3102-223202-1413 |
| gtid_domain_id   | 3102             |
| gtid_seq_no      | 0                |
| gtid_slave_pos   | 3102-223102-412  |
| gtid_strict_mode | OFF              |
+------------------+------------------+
6 rows in set (0.00 sec)
 
mysql 15:45:41 [PA DEV 002] [eis] [3202]> stop slave; start slave;
Query OK, 0 rows affected (0.00 sec)
 
Query OK, 0 rows affected (0.00 sec)
 
mysql 15:45:43 [PA DEV 002] [eis] [3202]> show variables like "%gtid%";
+------------------+------------------+
| Variable_name    | Value            |
+------------------+------------------+
| gtid_binlog_pos  | 3102-223202-1413 |
| gtid_current_pos | 3102-223202-1413 |
| gtid_domain_id   | 3102             |
| gtid_seq_no      | 0                |
| gtid_slave_pos   | 3102-223102-412  |
| gtid_strict_mode | OFF              |
+------------------+------------------+
6 rows in set (0.00 sec)
 
mysql 15:45:44 [PA DEV 002] [eis] [3202]> stop slave; start slave;
Query OK, 0 rows affected (0.00 sec)
 
Query OK, 0 rows affected (0.00 sec)
 
mysql 15:45:47 [PA DEV 002] [eis] [3202]> show variables like "%gtid%";
+------------------+------------------+
| Variable_name    | Value            |
+------------------+------------------+
| gtid_binlog_pos  | 3102-223202-1413 |
| gtid_current_pos | 3102-223202-1413 |
| gtid_domain_id   | 3102             |
| gtid_seq_no      | 0                |
| gtid_slave_pos   | 3102-223102-413  |
| gtid_strict_mode | OFF              |
+------------------+------------------+
6 rows in set (0.00 sec)
 
mysql 15:46:23 [PA DEV 002] [eis] [3202]> stop slave; start slave;
Query OK, 0 rows affected (0.00 sec)
 
Query OK, 0 rows affected (0.00 sec)
 
mysql 15:46:26 [PA DEV 002] [eis] [3202]> show variables like "%gtid%";
+------------------+------------------+
| Variable_name    | Value            |
+------------------+------------------+
| gtid_binlog_pos  | 3102-223202-1413 |
| gtid_current_pos | 3102-223202-1413 |
| gtid_domain_id   | 3102             |
| gtid_seq_no      | 0                |
| gtid_slave_pos   | 3102-223102-413  |
| gtid_strict_mode | OFF              |
+------------------+------------------+
6 rows in set (0.00 sec)
 
mysql 15:46:28 [PA DEV 002] [eis] [3202]> stop slave; start slave;
Query OK, 0 rows affected (0.00 sec)
 
Query OK, 0 rows affected (0.01 sec)
 
mysql 15:46:29 [PA DEV 002] [eis] [3202]> show variables like "%gtid%";
+------------------+------------------+
| Variable_name    | Value            |
+------------------+------------------+
| gtid_binlog_pos  | 3102-223202-1413 |
| gtid_current_pos | 3102-223202-1413 |
| gtid_domain_id   | 3102             |
| gtid_seq_no      | 0                |
| gtid_slave_pos   | 3102-223102-414  |
| gtid_strict_mode | OFF              |
+------------------+------------------+
6 rows in set (0.00 sec)
 
mysql 15:46:31 [PA DEV 002] [eis] [3202]> stop slave; start slave;
Query OK, 0 rows affected (0.00 sec)
 
Query OK, 0 rows affected (0.01 sec)
 
mysql 15:46:39 [PA DEV 002] [eis] [3202]> show variables like "%gtid%";
+------------------+------------------+
| Variable_name    | Value            |
+------------------+------------------+
| gtid_binlog_pos  | 3102-223202-1413 |
| gtid_current_pos | 3102-223202-1413 |
| gtid_domain_id   | 3102             |
| gtid_seq_no      | 0                |
| gtid_slave_pos   | 3102-223102-416  |
| gtid_strict_mode | OFF              |
+------------------+------------------+
6 rows in set (0.00 sec)
 
mysql 15:46:41 [PA DEV 002] [eis] [3202]>



 Comments   
Comment by Richard Bensley [ 2013-08-16 ]

Master

MariaDB [test]> show variables like '%gtid%';
+------------------+----------------+
| Variable_name    | Value          |
+------------------+----------------+
| gtid_binlog_pos  | 0-1376396033-1 |
| gtid_current_pos | 0-1376396033-1 |
| gtid_domain_id   | 0              |
| gtid_seq_no      | 0              |
| gtid_slave_pos   |                |
| gtid_strict_mode | OFF            |
+------------------+----------------+
6 rows in set (0.00 sec)
 
MariaDB [test]> create table t (a int auto_increment primary key);
Query OK, 0 rows affected (0.32 sec)
 
rbensley@rbensley:~$ for i in {1..10}; do mysql -P3000 test -e "insert into t values ();"; done

Slave

MariaDB [test]> select count(*) from t;
+----------+
| count(*) |
+----------+
|       10 |
+----------+
1 row in set (0.00 sec)
 
MariaDB [test]> delete from t where a < 4;
Query OK, 3 rows affected (0.09 sec)
 
MariaDB [test]> select count(*) from t;
+----------+
| count(*) |
+----------+
|        7 |
+----------+
1 row in set (0.00 sec)

Master

MariaDB [test]> delete from t;
Query OK, 10 rows affected (0.10 sec)

The Slave will now show a replication failure:

Last_SQL_Error: Could not execute Delete_rows event on table test.t; Can't find record in 't', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-log-bin.000003, end_log_pos 2070
Replicate_Ignore_Server_Ids: 
Master_Server_Id: 1376396033
Master_SSL_Crl: 
Master_SSL_Crlpath: 
Using_Gtid: Slave_Pos

From the slave, I run "stop slave; start slave;"
And the GTID replication rolls forward!

MariaDB [test]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 3000
                Connect_Retry: 60
              Master_Log_File: mysql-log-bin.000003
          Read_Master_Log_Pos: 8641
               Relay_Log_File: mysql-relay-bin.000002
                Relay_Log_Pos: 781
        Relay_Master_Log_File: mysql-log-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 8641
              Relay_Log_Space: 1078
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1376396033
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: Slave_Pos
 

Comment by Elena Stepanova [ 2013-08-16 ]

Test case:

--source include/have_innodb.inc
--source include/master-slave.inc
--source include/have_binlog_format_row.inc

--connection slave
--source include/stop_slave.inc
set sql_log_bin = 0;
alter table mysql.gtid_slave_pos engine = InnoDB;
change master to master_use_gtid = current_pos;
set sql_log_bin = 1;
--source include/start_slave.inc

--connection master
--echo # [master]
create table t1 (i int) engine=innodb;
insert into t1 values (1),(2);

--sync_slave_with_master
--echo # [slave]
-echo #-----------------------------
--echo # Removing a row manually
-echo #-----------------------------
delete from t1 where i = 1;

--connection master
-echo #-----------------------------
--echo # [master]
--echo # Inserting something else, to raise the GTID
--echo # and avoid the error on the next slave connection, caused
--echo # by the non-existing seqno
-echo #-----------------------------
insert into t1 values (3),(4);
insert into t1 values (5),(6);
delete from t1;
select @@gtid_binlog_pos;

--connection slave
--echo # [slave]
-echo #-----------------------------
--echo # Waiting for the last INSERT (it has GTID 0-1-4)
--echo # and checking that replication aborted due to the missing row
-echo #-----------------------------
--let $wait_condition = SELECT @@gtid_slave_pos = '0-1-4'
--source include/wait_condition.inc

show variables like 'gtid%';
query_vertical show slave status;

-echo #-----------------------------
--echo # Stopping and starting the slave,
--echo # waiting till it advanced to the next gtid_slave_pos
--echo # and checking that it aborted again due to the missing row
-echo #-----------------------------

--source include/stop_slave.inc
start slave;
--let $wait_condition = SELECT @@gtid_slave_pos = '0-1-5'
--source include/wait_condition.inc
show variables like 'gtid%';
query_vertical show slave status;

-echo #-----------------------------
--echo # Stopping and starting the slave again,
--echo # and checking that now it did NOT abort
-echo #-----------------------------

--source include/stop_slave.inc
--source include/start_slave.inc

--let $wait_condition = SELECT @@gtid_slave_pos = '0-1-5'
--source include/wait_condition.inc

show variables like 'gtid%';
query_vertical show slave status;

--connection master
--echo # [master]
drop table t1;

--sync_slave_with_master
--echo # [slave]
-echo #-----------------------------
--echo # And now the slave is willing to replicate further
-echo #-----------------------------
--let $wait_condition = SELECT @@gtid_slave_pos = '0-1-6'
--source include/wait_condition.inc
show variables like 'gtid%';
query_vertical show slave status;

--connection master
--source include/rpl_end.inc

Comment by Kristian Nielsen [ 2013-08-20 ]

Pushed to 10.0-base:

MDEV-4906: When event apply fails, next SQL thread start errorneously commits the failing GTID to gtid_slave_pos

When a GTID event is executed, we remember the contained GTID position so that
when we have applied the entire event group we can commit it to
gtid_slave_pos.

However, if the event group fails to apply due to some error and the SQL
thread aborts, the code did not correctly clear the remembered GTID. Thus,
when SQL thread was restarted, the old GTID of the failing event group was
incorrectly updated to gtid_slave_pos when the initial rotate event was
executed, corrupting the GTID position.

Generated at Thu Feb 08 07:00:06 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.