[MDEV-9145] Make --gtid-ignore-duplicates work for user transactions also Created: 2015-11-17  Updated: 2023-07-12

Status: Open
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.0.21
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Thomas Jassmann Assignee: Kristian Nielsen
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Centos 7.1



 Description   

I have two Mariadb 10.0.21 servers replicating from each others. log_slave_updates
is enabled everywhere to have full log information on each server.
A third Mariadb 10.0.21 server shall provide some filtered mysqlbinlog output to server1 to replicate it to server 2.
Without log_slave_updates on server 2 everything works fine.

A Table like

CREATE TABLE `Table1` (
`PKey` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`Wert1` INT(11) NOT NULL,
`Text1` VARCHAR(50) NOT NULL COLLATE 'utf8_bin',
`ConfCol` INT(11) NOT NULL DEFAULT '0',
`server_id` INT(11) NOT NULL DEFAULT 0,
`local_id` INT(11) NOT NULL DEFAULT 1,
`group_id` INT(11) NOT NULL DEFAULT 1,
PRIMARY KEY (`PKey`)
)

is identical on all servers, a dataset with PKey=104 doesn't exist on each server.

on server 1 with domainid=1 and serverid=1 before the error:

MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.146
                  Master_User: replu
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql.000056
          Read_Master_Log_Pos: 788
               Relay_Log_File: cent7_1-relay-bin.000002
                Relay_Log_Pos: 888
        Relay_Master_Log_File: mysql.000056
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 788
              Relay_Log_Space: 1187
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 2
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: Current_Pos
                  Gtid_IO_Pos: 1-1-43748,2-2-58,4-4-72
1 row in set (0.00 sec)
 

on server 2 with domainid=2 and serverid=2 before the error:

MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.145
                  Master_User: replu
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql.000083
          Read_Master_Log_Pos: 804
               Relay_Log_File: cent7_2-relay-bin.000002
                Relay_Log_Pos: 1087
        Relay_Master_Log_File: mysql.000083
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 804
              Relay_Log_Space: 1386
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: Current_Pos
                  Gtid_IO_Pos: 1-1-43748,2-2-58,4-4-72
1 row in set (0.00 sec)

Inserting a dataset to the third server with domainid 4 and serverid 4 and pointing mysqlbinlog to the correct file and pos coordinates results in the output:

/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 13442
#151117 11:47:23 server id 4  end_log_pos 0     Start: binlog v 4, server v 10.0.21-MariaDB-log created 151117 11:47:23
# at 13442
#151117 18:19:57 server id 4  end_log_pos 13480         GTID 4-4-73
/*!100001 SET @@session.gtid_domain_id=4*//*!*/;
/*!100001 SET @@session.server_id=4*//*!*/;
/*!100001 SET @@session.gtid_seq_no=73*//*!*/;
BEGIN
/*!*/;
# at 13480
# at 13534
#151117 18:19:57 server id 4  end_log_pos 13534         Table_map: `Test1`.`Table1` mapped to number 97
#151117 18:19:57 server id 4  end_log_pos 13598         Write_rows: table id 97 flags: STMT_END_F
### INSERT INTO `Test1`.`Table1`
### SET
###   @1=104
###   @2=7
###   @3='HalloDuDa'
###   @4=8
###   @5=0
###   @6=1
###   @7=1
# at 13598
#151117 18:19:57 server id 4  end_log_pos 13625         Xid = 814
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

parsed and applied on server 1 like :

/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/
/*!40019 SET @@session.max_insert_delayed_threads=0*/
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/
/*!100001 SET @@session.gtid_domain_id=4*/
/*!100001 SET @@session.server_id=4*/
/*!100001 SET @@session.gtid_seq_no=73*/
START TRANSACTION
INSERT INTO `Test1`.`Table1` VALUES (104,7,'HalloDuDa',8,2,1,1)
COMMIT
ROLLBACK /* added by mysqlbinlog */
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/

breaks replication on server 1:

MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.146
                  Master_User: replu
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql.000056
          Read_Master_Log_Pos: 971
               Relay_Log_File: cent7_1-relay-bin.000002
                Relay_Log_Pos: 888
        Relay_Master_Log_File: mysql.000056
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1062
                   Last_Error: Could not execute Write_rows_v1 event on table Test1.Table1; Duplicate entry '104' for key 'PRIMAR   Y', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql.000056, end_log_pos 944
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 788
              Relay_Log_Space: 1370
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1062
               Last_SQL_Error: Could not execute Write_rows_v1 event on table Test1.Table1; Duplicate entry '104' for key 'PRIMAR   Y', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql.000056, end_log_pos 944
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 2
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: Current_Pos
                  Gtid_IO_Pos: 1-1-43748,2-2-58,4-4-73
1 row in set (0.00 sec)

server 2 is okay:

MariaDB [(none)]> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.145
                  Master_User: replu
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql.000083
          Read_Master_Log_Pos: 987
               Relay_Log_File: cent7_2-relay-bin.000002
                Relay_Log_Pos: 1270
        Relay_Master_Log_File: mysql.000083
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 987
              Relay_Log_Space: 1569
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: Current_Pos
                  Gtid_IO_Pos: 1-1-43748,2-2-58,4-4-73

Both servers applied the Insert 4-4-73. gtid_ignore_duplicates=ON doesn't solve the problem. gtid_strict_mode=OFF.

Pointing Replicate_Ignore_Server_Ids to 4 helps but isn't an option since server 1 and
server 2 are part of a failover scenario.

Why does server 1 try to reapply the dataset from server 2 (log_slave_updates=ON)?



 Comments   
Comment by Elena Stepanova [ 2015-11-19 ]

Sorry, I don't understand from the description where the 3rd server stands in your replication topology.
server1 and server2 replicate from each other, okay. But

A third Mariadb 10.0.21 server shall provide some filtered mysqlbinlog output to server1 to replicate it to server 2.

what does it mean in terms of replication? Which replicates from which, and how? And what does

parsed and applied on server 1 like :

mean – parsed and applied by whom/what? How is it related to server3?

Please also attach your cnf files for all servers.

Comment by Thomas Jassmann [ 2015-11-19 ]

The 3rd server is an additional master, that means, it receives inserts and updates, but is not connected to server1 and server2 via mysql slave replication threads. Information from server3 is extracted from it's binary logs by mysqlbinlog wich contain valid gtid_domain_id, server_id and for all other servers new gtid_seqno. The resulting sql-statments then were injected by a C-Programm using libmysqlclient_r.so.18 in one single connection to server1.

For me, it looks like the (update of the) value, the mysql process uses to decide wether a gtid is duplicate or not is handled different in the replication slave thread on the one and in a standard user connection on the other hand.

config of server1:

# this is only for the mysqld standalone daemon
[mysqld]
 
log-error=/var/lib/mysql/mysqld.log
 
# this is only for embedded server
[embedded]
 
# This group is only read by MariaDB servers, not by MySQL.
# If you use the same .cnf file for MySQL and MariaDB,
# you can put MariaDB-only options here
[mariadb]
 
server-id=1
 
relay-log=cent7_1-relay-bin
 
log-bin=mysql.bin
sync_binlog = 1
max_allowed_packet=52m
default_storage_engine = InnoDB
innodb_flush_log_at_trx_commit = 2
innodb_flush_method=O_DIRECT
binlog-format = row
log_slave_updates=1
 
gtid_ignore_duplicates=ON
 
rpl_semi_sync_master_enabled=1
rpl_semi_sync_slave_enabled=1
 
auto-increment-offset    = 1
auto-increment-increment = 100
 
innodb_file_per_table=1
 
innodb_flush_log_at_trx_commit=2
 
gtid-domain-id=1
 

config of server 2:

# this is only for the mysqld standalone daemon
[mysqld]
 
# this is only for embedded server
[embedded]
 
# This group is only read by MariaDB servers, not by MySQL.
# If you use the same .cnf file for MySQL and MariaDB,
# you can put MariaDB-only options here
[mariadb]
 
server-id=2
 
log-bin=mysql.bin
sync_binlog = 1
max_allowed_packet=52m
default_storage_engine = InnoDB
innodb_flush_log_at_trx_commit = 2
innodb_flush_method=O_DIRECT
binlog-format = row
log_slave_updates=1
 
gtid_ignore_duplicates=ON
 
rpl_semi_sync_master_enabled=1
rpl_semi_sync_slave_enabled=1
 
auto-increment-offset    = 2
auto-increment-increment = 100
 
innodb_file_per_table=1
 
innodb_flush_log_at_trx_commit=2
 
gtid-domain-id=2
 

config of server 3:

# This group is only read by MariaDB servers, not by MySQL.
# If you use the same .cnf file for MySQL and MariaDB,
# you can put MariaDB-only options here
[mariadb]
 
server-id=4
 
log-bin=mysql.bin
sync_binlog = 1
max_allowed_packet=52m
default_storage_engine = InnoDB
innodb_flush_log_at_trx_commit = 2
innodb_flush_method=O_DIRECT
binlog-format = row
log_slave_updates=1
 
auto-increment-offset    = 4
auto-increment-increment = 100
 
innodb_file_per_table=1
 
innodb_flush_log_at_trx_commit=2
 
gtid-domain-id=4
 

Comment by Thomas Jassmann [ 2015-11-20 ]

What kind of feedback do you need ?

Comment by Kristian Nielsen [ 2016-01-15 ]

It is a bit hard to follow, but my guess is that the following is happening.

A replication ring is set up between server1 (server_id=1) and server2
(server_id=2) (each server replicates from the other).

Now a transaction is injected on server1 with server_id explicitly set to 4.

This causes the transaction to replicate to server2, and then back to
server1. When Replication tries to apply the transaction again on server1,
it fails with a duplicate key error.

The normal --replicate-ignore-server-ids=0 protection against re-apply of a
transaction on server1 does not activate here, because the transaction was
done explicitly setting server_id=4.

The --gtid-ignore-duplicate does not activate because it only protects
against replicating the same transaction twice. In this case, the
transaction is only replicated once, the first time it was executed directly
on the server.

Basically, for a ring topology to work, transactions originating on a server
needs to have the correct server_id for --replicate-same-server-id to work,
or the endless loop needs to be broken some other way like
--replicate-ignore-server-ids=0. I think (not sure) that it is possible to
configure server1 to ignore server_id=4 from server2, but not from other
servers.

Or the server_id could be set to 1 when the transaction is originally
injected on server1, perhaps.

Comment by Elena Stepanova [ 2016-01-19 ]

TomJass, does the comment above explain for you results that you are getting, and ways to avoid the problem?

Comment by Thomas Jassmann [ 2016-01-20 ]

I'm able to avoid the problem. I think this comes nearest to Kristian's last proposal. At the moment I'm injecting the mysqlbinlog output replication data without explictliy setting session.gtid_domain_id, session.server_id and session.gtid_seq_no as generated by the originating server. The disadvantage is, the binlogs of the different servers contain now different id's for the same transactions, the real gtid of the transaction is lost. Is there a plan by you and your team to treat session.gtid_domain_id, session.server_id and session.gtid_seq_no set in a user session the same like in the replication threads ?

Comment by Elena Stepanova [ 2016-01-20 ]

For discussion on further plans and such, assigning to Kristian.

Generated at Thu Feb 08 07:32:28 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.