[MDEV-13431] wsrep_gtid_mode uses wrong GTID for transaction committed by slave thread Created: 2017-08-02  Updated: 2020-08-25  Resolved: 2017-12-25

Status: Closed
Project: MariaDB Server
Component/s: Galera, Replication, Storage Engine - InnoDB
Affects Version/s: 10.1.25
Fix Version/s: 10.1.31

Type: Bug Priority: Major
Reporter: Geoff Montee (Inactive) Assignee: Sachin Setiya (Inactive)
Resolution: Fixed Votes: 2
Labels: galera, gtid, replication, wsrep

Issue Links:
Problem/Incident
causes MDEV-10227 MariaDB Galera cluster gtid's falling... Closed
is caused by MDEV-20720 Galera: Replicate MariaDB GTID to oth... Closed
Relates
relates to MDEV-8458 Galera Cluster replication stream doe... Closed

 Description   

This is similar to MDEV-8458, but this issue also affects MariaDB 10.1 with wsrep_gtid_mode=ON.

When wsrep_gtid_mode is enabled, transactions that are replicated within a cluster by Galera receive a GTID where the domain_id is specified by wsrep_gtid_domain_id, the server_id is specified by server_id, and the seq_no is incremented for each transaction that is committed in the domain.

It does not seem to work this way for transactions that are replicated by an asynchronous slave thread within a Galera cluster.
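The documented assignment rule can be sketched as a small model (illustrative Python only, not MariaDB source; the class and names are hypothetical): the domain_id comes from wsrep_gtid_domain_id, the server_id from the committing server, and the seq_no is a per-domain counter bumped on every commit.

```python
class GtidDomain:
    """Per-domain sequence counter, as wsrep_gtid_mode is documented to behave
    for Galera-replicated transactions (hypothetical model, not server code)."""

    def __init__(self, domain_id, seq_no=0):
        self.domain_id = domain_id   # from wsrep_gtid_domain_id
        self.seq_no = seq_no         # highest seq_no committed so far

    def next_gtid(self, server_id):
        # Every transaction committed in the domain bumps the shared counter.
        self.seq_no += 1
        return (self.domain_id, server_id, self.seq_no)


# Example: a cluster with wsrep_gtid_domain_id=1, already at GTID 1-1-1.
domain = GtidDomain(domain_id=1, seq_no=1)
print(domain.next_gtid(server_id=1))   # (1, 1, 2) -> GTID 1-1-2
```

Under this model, a transaction replicated into the cluster by an async slave thread should get the domain's next seq_no like any other write; the logs below show that it does not.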

For example, let's say that we have two clusters and one cluster replicates from the other using GTID replication.

On cluster1, we see the following:

MariaDB [(none)]> show global variables like '%gtid%';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| gtid_binlog_pos        | 1-1-1 |
| gtid_binlog_state      | 1-1-1 |
| gtid_current_pos       | 1-1-1 |
| gtid_domain_id         | 1     |
| gtid_ignore_duplicates | OFF   |
| gtid_slave_pos         |       |
| gtid_strict_mode       | OFF   |
| wsrep_gtid_domain_id   | 1     |
| wsrep_gtid_mode        | ON    |
+------------------------+-------+
9 rows in set (0.00 sec)

On cluster2, we see the following:

MariaDB [(none)]> show global variables like '%gtid%';
+------------------------+-------------+
| Variable_name          | Value       |
+------------------------+-------------+
| gtid_binlog_pos        | 2-2-2       |
| gtid_binlog_state      | 2-2-2       |
| gtid_current_pos       | 1-1-1,2-2-2 |
| gtid_domain_id         | 2           |
| gtid_ignore_duplicates | OFF         |
| gtid_slave_pos         | 1-1-1       |
| gtid_strict_mode       | OFF         |
| wsrep_gtid_domain_id   | 2           |
| wsrep_gtid_mode        | ON          |
+------------------------+-------------+
9 rows in set (0.00 sec)

One node in cluster2 is a slave of one node in cluster1:

MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.30.0.32
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mariadb-bin.000002
          Read_Master_Log_Pos: 428
               Relay_Log_File: ip-172-30-0-46-relay-bin.000002
                Relay_Log_Pos: 644
        Relay_Master_Log_File: mariadb-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 428
              Relay_Log_Space: 951
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: Slave_Pos
                  Gtid_IO_Pos: 1-1-1
      Replicate_Do_Domain_Ids:
  Replicate_Ignore_Domain_Ids:
                Parallel_Mode: conservative
1 row in set (0.00 sec)

If we commit a transaction on cluster1, we would expect it to have the GTID 1-1-2 on cluster1, and either 2-1-3 or 2-2-3 on cluster2, depending on whether cluster2 keeps the server_id of the originating cluster or replaces it with its own. Does that actually happen?
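The expected seq_no on cluster2 can be derived mechanically from its gtid_binlog_state shown above. A hedged helper (illustrative only, not a MariaDB API) that computes the next seq_no for a domain from a state string:

```python
def next_seq_no(binlog_state, domain_id):
    """Return the expected next seq_no in a domain, given a
    gtid_binlog_state string such as "1-1-1,2-2-2" (hypothetical helper)."""
    seq = 0
    for gtid in binlog_state.split(","):
        d, s, n = (int(x) for x in gtid.split("-"))
        if d == domain_id:
            seq = max(seq, n)   # highest seq_no seen in this domain
    return seq + 1


# cluster2's gtid_binlog_state is 2-2-2, so a transaction replicated into
# domain 2 should get seq_no 3 (i.e. GTID 2-1-3 or 2-2-3).
print(next_seq_no("2-2-2", 2))   # 3
```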

Let's say that we execute the following on cluster1:

MariaDB [(none)]> insert into db1.tab values (1, 'str1');
Query OK, 1 row affected (0.00 sec)

What GTID does this transaction have on each cluster?

Here's the binlog event on the node in cluster1 where the transaction originated:

# at 428
#170802 16:51:02 server id 1  end_log_pos 466   GTID 1-1-2 trans
/*!100001 SET @@session.gtid_seq_no=2*//*!*/;
BEGIN
/*!*/;
# at 466
# at 523
#170802 16:51:02 server id 1  end_log_pos 523   Annotate_rows:
#Q> insert into db1.tab values (1, 'str1')
#170802 16:51:02 server id 1  end_log_pos 567   Table_map: `db1`.`tab` mapped to number 18
# at 567
#170802 16:51:02 server id 1  end_log_pos 606   Write_rows: table id 18 flags: STMT_END_F
 
BINLOG '
NjuCWRMBAAAALAAAADcCAAAAABIAAAAAAAEAA2RiMQADdGFiAAIDDwKWAAI=
NjuCWRcBAAAAJwAAAF4CAAAAABIAAAAAAAEAAv/8AQAAAARzdHIx
'/*!*/;
### INSERT INTO `db1`.`tab`
### SET
###   @1=1
###   @2='str1'
# at 606
#170802 16:51:02 server id 1  end_log_pos 633   Xid = 10
COMMIT/*!*/;

And here's the binlog event on the node in cluster2 that is acting as a slave to cluster1:

# at 617
#170802 16:51:02 server id 1  end_log_pos 655   GTID 2-1-2 trans
/*!100001 SET @@session.server_id=1*//*!*/;
/*!100001 SET @@session.gtid_seq_no=2*//*!*/;
BEGIN
/*!*/;
# at 655
#170802 16:51:02 server id 1  end_log_pos 699   Table_map: `db1`.`tab` mapped to number 21
# at 699
#170802 16:51:02 server id 1  end_log_pos 738   Write_rows: table id 21 flags: STMT_END_F
 
BINLOG '
NjuCWRMBAAAALAAAALsCAAAAABUAAAAAAAEAA2RiMQADdGFiAAIDDwKWAAI=
NjuCWRcBAAAAJwAAAOICAAAAABUAAAAAAAEAAgP8AQAAAARzdHIx
'/*!*/;
### INSERT INTO `db1`.`tab`
### SET
###   @1=1
###   @2='str1'
# at 738
#170802 16:51:02 server id 1  end_log_pos 765   Xid = 3
COMMIT/*!*/;

And here's the binlog event on another node in cluster2:

# at 576
#170802 16:51:02 server id 1  end_log_pos 614   GTID 2-1-3 trans
/*!100001 SET @@session.server_id=1*//*!*/;
/*!100001 SET @@session.gtid_seq_no=3*//*!*/;
BEGIN
/*!*/;
# at 614
#170802 16:51:02 server id 1  end_log_pos 658   Table_map: `db1`.`tab` mapped to number 20
# at 658
#170802 16:51:02 server id 1  end_log_pos 697   Write_rows: table id 20 flags: STMT_END_F
 
BINLOG '
NjuCWRMBAAAALAAAAJICAAAAABQAAAAAAAEAA2RiMQADdGFiAAIDDwKWAAI=
NjuCWRcBAAAAJwAAALkCAAAAABQAAAAAAAEAAgP8AQAAAARzdHIx
'/*!*/;
### INSERT INTO `db1`.`tab`
### SET
###   @1=1
###   @2='str1'
# at 697
#170802 16:51:02 server id 1  end_log_pos 724   Xid = 3
COMMIT/*!*/;

So the transaction has the expected GTID (1-1-2) in cluster1, and the expected GTID (2-1-3) on the non-slave node in cluster2, but an unexpected GTID (2-1-2) on the slave node in cluster2.
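The three binlogs are consistent with one explanation: the slave node keeps the master's seq_no (the "SET @@session.gtid_seq_no=2" from the replicated event), while the other cluster nodes assign the domain's own next seq_no through the wsrep counter. A simulation under that assumption (a sketch, not MariaDB code; the function names are hypothetical):

```python
def slave_node_gtid(local_domain, master_gtid):
    # Assumption: the slave thread rewrites only the domain, keeping the
    # master's server_id and seq_no from the replicated GTID event.
    d, s, n = master_gtid
    return (local_domain, s, n)


def other_node_gtid(local_domain, master_server, wsrep_counter):
    # Other cluster nodes take the normal wsrep_gtid_mode path: the
    # domain's shared counter is bumped for the incoming write set.
    wsrep_counter["seq"] += 1
    return (local_domain, master_server, wsrep_counter["seq"])


counter = {"seq": 2}                      # cluster2's binlog state is at 2-2-2
print(slave_node_gtid(2, (1, 1, 2)))      # (2, 1, 2) -- the unexpected GTID
print(other_node_gtid(2, 1, counter))     # (2, 1, 3) -- the expected GTID
```

This reproduces exactly the 2-1-2 vs 2-1-3 split seen in the binlog dumps above.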



 Comments   
Comment by Andrii Nikitin (Inactive) [ 2017-08-08 ]

I confirm the problem with the script below, which sets up two local clusters on ports (3307,3308) and (3310,3311); 3310 replicates with GTID from 3307.
The node aliases are:
m1 - 3307
m2 - 3308
m4 - 3310
m5 - 3311

So topology is like below:

(m1,m2)  - cluster "c1"
 ^ gtid master
 |
 V gtid slave
(m4,m5) - cluster "c2"

The script will also download and unpack the 10.1.25 tarball into _depot/m-tar/10.1.25.

set -e
M7VER=10.1.25
 
# just use current directory if called from framework
if [ ! -f common.sh ] ; then
  [ -d mariadb-environs ] || git clone http://github.com/AndriiNikitin/mariadb-environs
  cd mariadb-environs
  ./get_plugin.sh galera
fi
 
echo CLEANING UP ...
c1/cleanup.sh || :
c2/cleanup.sh || :
rm -rf c1/*
rm -rf c2/*
 
echo GENERATE TEMPLATES ...
_template/plant_cluster.sh c1
_template/plant_cluster.sh c2
echo m1 > c1/nodes.lst
echo m2 >> c1/nodes.lst
echo m4 > c2/nodes.lst
echo m5 >> c2/nodes.lst
c1/replant.sh $M7VER
c2/replant.sh $M7VER
 
echo DOWNLOAD IF NEEDED ...
./build_or_download.sh m1
 
echo INSTALL DATA DIRECTORY ...
c1/gen_cnf.sh
c2/gen_cnf.sh
c1/install_db.sh
c2/install_db.sh
 
echo SET UP ACCOUNTS ...
c1/galera_setup_acl.sh
c2/galera_setup_acl.sh
 
echo ENABLE GTID ...
c1/gen_cnf.sh log-bin=qq log_slave_updates=1
c2/gen_cnf.sh log-bin=qq log_slave_updates=1 wsrep_gtid_domain_id=5 gtid_domain_id=5 wsrep_gtid_mode=1
 
echo START CLUSTERS ...
c1/galera_start_new.sh
c2/galera_start_new.sh
 
echo SETUP REPLICATION
m4*/replicate_gtid.sh m1
 
echo PERFORM TRANSACTION
m1*/sql.sh 'create table test.t1(i int)'
 
m1*/sql.sh 'insert into test.t1 select 1'
m2*/sql.sh 'insert into test.t1 select 2'
 
sleep 1
 
m5*/sql.sh 'create table test.t2(a int)'
m1*/sql.sh 'insert into test.t1 select 3'
 
sleep 1
 
echo CHECK DATA
c1/sql.sh 'select * from test.t1'
c2/sql.sh 'select * from test.t1'
 
set +e
 
echo GTID in m1:
_depot/m-tar/$M7VER/bin/mysqlbinlog -vvv m1*/dt/qq*.00* | grep -E 'GTID|INSERT|create'
echo GTID in m2:
_depot/m-tar/$M7VER/bin/mysqlbinlog -vvv m2*/dt/qq*.00* | grep -E 'GTID|INSERT|create'
echo GTID in m4:
_depot/m-tar/$M7VER/bin/mysqlbinlog -vvv m4*/dt/qq*.00* | grep -E 'GTID|INSERT|create'
echo GTID in m5:
_depot/m-tar/$M7VER/bin/mysqlbinlog -vvv m5*/dt/qq*.00* | grep -E 'GTID|INSERT|create'
 
c1/sql.sh 'show variables like "gtid_%_pos"'
c2/sql.sh 'show variables like "gtid_%_pos"'

Note that the GTIDs are initially correct in cluster2 for the first two inserts, one into each node of cluster1:

GTID in m1:
#170808 21:43:45 server id 1  end_log_pos 650   GTID 0-1-3 ddl
create table test.t1(i int)
#170808 21:43:45 server id 1  end_log_pos 778   GTID 0-1-4 trans
### INSERT INTO `test`.`t1`
#170808 21:43:45 server id 2  end_log_pos 918   GTID 0-2-5 trans
### INSERT INTO `test`.`t1`

Direct Slave:

GTID in m4:
#170808 22:06:30 server id 1  end_log_pos 650   GTID 5-1-3 ddl
create table test.t1(i int)
#170808 22:06:30 server id 1  end_log_pos 778   GTID 5-1-4 trans
### INSERT INTO `test`.`t1`
#170808 22:06:30 server id 2  end_log_pos 918   GTID 5-2-5 trans
### INSERT INTO `test`.`t1`

And the other node in cluster2:

GTID in m5:
#170808 22:06:30 server id 1  end_log_pos 634   GTID 5-1-3 ddl
create table test.t1(i int)
#170808 22:06:30 server id 1  end_log_pos 762   GTID 5-1-4 trans
### INSERT INTO `test`.`t1`
#170808 22:06:30 server id 2  end_log_pos 902   GTID 5-2-5 trans

So far so good, and GTID works as expected (5-1-4 and 5-2-5 on both nodes in cluster2). But once we execute a transaction directly in cluster2, the two nodes in cluster2 show different GTIDs for the next replicated transaction (5-1-6 vs 5-1-7 below):

Master from cluster 1 (m1):

#170808 22:06:31 server id 1  end_log_pos 1058  GTID 0-1-6 trans
### INSERT INTO `test`.`t1`

Direct Slave from cluster2 (m4):

#170808 22:06:31 server id 5  end_log_pos 1058  GTID 5-5-6 ddl
create table test.t2(a int)
#170808 22:06:31 server id 1  end_log_pos 1186  GTID 5-1-6 trans
### INSERT INTO `test`.`t1`

The other node from cluster2 (m5):

#170808 22:06:31 server id 5  end_log_pos 1042  GTID 5-5-6 ddl
create table test.t2(a int)
#170808 22:06:31 server id 1  end_log_pos 1170  GTID 5-1-7 trans
### INSERT INTO `test`.`t1`

And the contents of the GTID position variables at the end:

m1: gtid_binlog_pos=0-1-6  gtid_current_pos=0-1-6  gtid_slave_pos=(empty)
m2: gtid_binlog_pos=0-1-6  gtid_current_pos=(empty)  gtid_slave_pos=(empty)
m4: gtid_binlog_pos=5-1-6  gtid_current_pos=0-1-6  gtid_slave_pos=0-1-6
m5: gtid_binlog_pos=5-1-7  gtid_current_pos=(empty)  gtid_slave_pos=(empty)
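The numbers above are consistent with the slave node (m4) keeping the master's seq_no while the other node (m5) advances the domain-5 wsrep counter. Replaying the sequence under that assumption (a sketch, not server code):

```python
# Domain-5 wsrep counter after GTIDs 5-1-3, 5-1-4, 5-2-5 have been assigned:
wsrep_seq = 5

# Local DDL on m5 is replicated by Galera; both nodes assign 5-5-6.
wsrep_seq += 1
ddl_gtid = (5, 5, wsrep_seq)

# The master's insert carries GTID 0-1-6, i.e. seq_no 6.
master_seq = 6

# Assumption: m4's slave thread keeps the master's seq_no -> duplicate 6
# within domain 5 (both 5-5-6 and 5-1-6 appear in m4's binlog).
m4_gtid = (5, 1, master_seq)

# m5 applies the same write set through the wsrep counter -> seq_no 7.
wsrep_seq += 1
m5_gtid = (5, 1, wsrep_seq)

print(ddl_gtid, m4_gtid, m5_gtid)   # (5, 5, 6) (5, 1, 6) (5, 1, 7)
```

Each cluster-local write therefore adds one more unit of drift between the slave node's binlog and the rest of the cluster.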

Comment by Przemek [ 2017-08-10 ]

I can confirm this problem with MariaDB Server 10.1.25, also in a simpler case: one async master and a two-node Galera cluster, where one of the nodes is the async slave.
I was using the same gtid_domain_id and wsrep_gtid_domain_id on all nodes, and tried both wsrep_gtid_mode=ON and OFF; the problem happens in both cases.

Initial state: GTID-based async replication is up and running from the MariaDB async master to node0. node0 and node2 are part of the same Galera cluster:

MariaDB [(none)]> show global variables like 'gtid_bi%';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| gtid_binlog_pos   |       |
| gtid_binlog_state |       |
+-------------------+-------+
2 rows in set (0.00 sec)
 
node0>  show global variables like 'gtid_b%';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| gtid_binlog_pos   |       |
| gtid_binlog_state |       |
+-------------------+-------+
2 rows in set (0.00 sec)
 
node2>  show global variables like '%gtid_b%';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| gtid_binlog_pos   |       |
| gtid_binlog_state |       |
+-------------------+-------+
2 rows in set (0.01 sec)

First insert on async master:

MariaDB [(none)]> insert into test.t1 values (null);
Query OK, 1 row affected (0.00 sec)
 
MariaDB [(none)]> show global variables like 'gtid_bi%';
+-------------------+---------+
| Variable_name     | Value   |
+-------------------+---------+
| gtid_binlog_pos   | 0-100-1 |
| gtid_binlog_state | 0-100-1 |
+-------------------+---------+
2 rows in set (0.00 sec)
 
node0>  show global variables like 'gtid_b%';
+-------------------+---------+
| Variable_name     | Value   |
+-------------------+---------+
| gtid_binlog_pos   | 0-100-1 |
| gtid_binlog_state | 0-100-1 |
+-------------------+---------+
2 rows in set (0.00 sec)
 
node2> show variables like 'gtid_b%';
+-------------------+---------+
| Variable_name     | Value   |
+-------------------+---------+
| gtid_binlog_pos   | 0-100-1 |
| gtid_binlog_state | 0-100-1 |
+-------------------+---------+
2 rows in set (0.00 sec)

So far, so good - positions in sync.
Then, insert on second cluster node:

node2> insert into test.t1 values (null);
Query OK, 1 row affected (0.01 sec)
 
node2>  show global variables like '%gtid_b%';
+-------------------+----------------+
| Variable_name     | Value          |
+-------------------+----------------+
| gtid_binlog_pos   | 0-12-2         |
| gtid_binlog_state | 0-100-1,0-12-2 |
+-------------------+----------------+
2 rows in set (0.00 sec)
 
node0>  show global variables like 'gtid_b%';
+-------------------+----------------+
| Variable_name     | Value          |
+-------------------+----------------+
| gtid_binlog_pos   | 0-12-2         |
| gtid_binlog_state | 0-100-1,0-12-2 |
+-------------------+----------------+
2 rows in set (0.00 sec)

Again, so far it looks good; the GTID is consistent with the first node.
But let's do another insert on the async master:

MariaDB [(none)]> insert into test.t1 values (null);
Query OK, 1 row affected (0.00 sec)
 
MariaDB [(none)]> show global variables like 'gtid_bi%';
+-------------------+---------+
| Variable_name     | Value   |
+-------------------+---------+
| gtid_binlog_pos   | 0-100-2 |
| gtid_binlog_state | 0-100-2 |
+-------------------+---------+
2 rows in set (0.00 sec)
 
node0>  show global variables like 'gtid_b%';
+-------------------+----------------+
| Variable_name     | Value          |
+-------------------+----------------+
| gtid_binlog_pos   | 0-100-2        |
| gtid_binlog_state | 0-12-2,0-100-2 |
+-------------------+----------------+
2 rows in set (0.00 sec)
 
node2>  show global variables like '%gtid_b%';
+-------------------+----------------+
| Variable_name     | Value          |
+-------------------+----------------+
| gtid_binlog_pos   | 0-100-3        |
| gtid_binlog_state | 0-12-2,0-100-3 |
+-------------------+----------------+
2 rows in set (0.00 sec)

And now, for some reason, node2's position in the async master's domain (the 0-100-x entry) has increased by two, while the data is the same on both nodes:

node0> select * from test.t1;
+----+
| id |
+----+
|  1 |
|  2 |
|  7 |
+----+
3 rows in set (0.00 sec)
 
node2> select * from test.t1;
+----+
| id |
+----+
|  1 |
|  2 |
|  7 |
+----+
3 rows in set (0.00 sec)

And the more updates we do inside the cluster, the more inconsistencies are introduced into the GTID sequences.
For example, after a few more inserts:

node0>  show global variables like 'gtid_b%';
+-------------------+------------------------------+
| Variable_name     | Value                        |
+-------------------+------------------------------+
| gtid_binlog_pos   | 0-10-9                       |
| gtid_binlog_state | 0-100-2,0-12-2,0-11-7,0-10-9 |
+-------------------+------------------------------+
2 rows in set (0.00 sec)
 
node2>  show global variables like '%gtid_b%';
+-------------------+-------------------------------+
| Variable_name     | Value                         |
+-------------------+-------------------------------+
| gtid_binlog_pos   | 0-10-10                       |
| gtid_binlog_state | 0-100-3,0-12-2,0-11-8,0-10-10 |
+-------------------+-------------------------------+
2 rows in set (0.00 sec)

So it seems that when an async replication channel feeds into a Galera cluster, even for a brief moment, the GTID sequences inside the cluster get quite messy.
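The drift is easy to detect mechanically from the gtid_binlog_pos values the comment shows. A small diagnostic sketch (a hypothetical helper, not part of any MariaDB tooling, assuming GTID strings of the form domain-server-seqno):

```python
def gtid_map(pos):
    """Parse a gtid_binlog_pos string ("0-10-9" or "1-1-1,2-2-2")
    into {domain_id: (server_id, seq_no)}."""
    out = {}
    for g in pos.split(","):
        d, s, n = (int(x) for x in g.split("-"))
        out[d] = (s, n)
    return out


def diverged_domains(pos_a, pos_b):
    """Domains present on both nodes whose seq_no differs."""
    a, b = gtid_map(pos_a), gtid_map(pos_b)
    return sorted(d for d in a.keys() & b.keys() if a[d][1] != b[d][1])


# node0 vs node2 gtid_binlog_pos after a few more inserts (values from above):
print(diverged_domains("0-10-9", "0-10-10"))   # [0]
```

Run against the earlier, pre-divergence snapshots (0-100-1 on every node) the same check reports no drift, which matches the "so far, so good" observations above.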

Comment by Sachin Setiya (Inactive) [ 2017-12-11 ]

Hi przemek@mysqlmaniac.com!

Actually, 0-12-2,0-100-2 is also a wrong GTID, but the patch for 10715 will fix this issue.
