Details
Bug
Status: Open
Critical
Resolution: Unresolved
10.5.2
Description
The XA changes made in 10.5 introduce a regression that breaks replication.
The problem is that the slave now applies an XA transaction as soon as it
replicates the XA_prepare_log_event binlogged by "XA PREPARE" on the master.
This is wrong: the transaction must not be applied on the slave until
"XA COMMIT", as was done correctly in 10.4.
Applying the XA PREPARE on the slave leaves dangling InnoDB row locks that
can conflict with the replication of later transactions and cause
replication to break. The test case below (also attached) demonstrates one
simple instance of this.
Another problem is that splitting a transaction in this way in the binlog
means there is no longer a unique binlog position corresponding to the
database state. This is demonstrated by the attached test case
rpl_xa_provision.test.
This test case takes a mysqldump while an XA PREPARED transaction is active
on the master, and uses it to provision a new slave. The new slave's GTID
position cannot be set correctly. Setting it after the XA PREPARE means the
XA COMMIT will fail, because the dump does not contain the prepared
transaction. But setting it before the XA PREPARE would also not be correct,
as it would duplicate transactions binlogged after the XA PREPARE. Thus, in
10.5, replication on the provisioned slave breaks.
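
The position problem can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `Event` class, the `replay` function and the three-transaction binlog are hypothetical, and the 10.4-style layout is modelled simply as one event written at commit time. It does not use any real MariaDB API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str  # "trx", "xa_prepare" or "xa_commit"
    name: str  # transaction name, or XID for the XA events


def replay(snapshot, binlog, start):
    """Replay binlog[start:] on top of a snapshot and return the problems hit."""
    applied = set(snapshot)  # transactions already reflected in the data
    prepared = set()         # XIDs this slave has seen an XA PREPARE for
    problems = []
    for ev in binlog[start:]:
        if ev.kind == "trx":
            if ev.name in applied:
                problems.append(f"duplicate {ev.name}")
            applied.add(ev.name)
        elif ev.kind == "xa_prepare":
            prepared.add(ev.name)
        elif ev.kind == "xa_commit":
            if ev.name in prepared:
                prepared.remove(ev.name)
                applied.add(ev.name)
            else:
                problems.append(f"unknown XID {ev.name}")
    return problems


# 10.5-style binlog: the XA transaction is split into PREPARE and COMMIT events.
at_dump = [Event("trx", "A"), Event("xa_prepare", "x"), Event("trx", "B")]
binlog_105 = at_dump + [Event("xa_commit", "x")]
snapshot = {"A", "B"}  # the dump sees A and B, but not the prepared-only x

# Try every start position that existed when the dump was taken.
results_105 = {s: replay(snapshot, binlog_105, s) for s in range(len(at_dump) + 1)}

# Modelled 10.4-style binlog: x appears once, as a whole, at commit time.
binlog_104 = [Event("trx", "A"), Event("trx", "B"), Event("trx", "x")]
results_104 = replay(snapshot, binlog_104, 2)

for start, problems in results_105.items():
    print(start, problems)
print("10.4-style:", results_104)
```

In this model every candidate start position for the split binlog reports either a duplicate transaction or an unknown XID, while the unsplit layout has a position (right after B) from which replay is clean.
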
The fix is to revert the change so that XA transactions are applied on the
slave only as part of the XA COMMIT event. When the slave receives the XA
PREPARE event, it must not apply it. Instead, the event can be saved
somewhere (there are several possible designs). If the master crashes and
the slave is promoted to be the new master, the saved XA PREPARE events can
then be used to recover the XA transaction into the prepared state, so that
the application can XA COMMIT or XA ROLLBACK it.
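
One of the several possible designs could look roughly like the following toy model. It is a sketch only, not the server's implementation: the class `SlaveSketch`, its method names, and the dictionary-based row store are all hypothetical, chosen to show the ordering of events rather than any real storage or locking code.

```python
class SlaveSketch:
    """Toy slave that saves XA PREPARE events instead of applying them."""

    def __init__(self):
        self.rows = {}      # committed data: row key -> value
        self.saved = {}     # xid -> changes saved from XA PREPARE, not applied
        self.prepared = {}  # xid -> changes recovered into prepared state

    def on_xa_prepare(self, xid, changes):
        # Save only: no row is touched here, so no row lock can be left behind.
        self.saved[xid] = dict(changes)

    def on_xa_commit(self, xid):
        # The changes become visible only now, as part of XA COMMIT.
        source = self.prepared if xid in self.prepared else self.saved
        self.rows.update(source.pop(xid))

    def on_xa_rollback(self, xid):
        self.saved.pop(xid, None)
        self.prepared.pop(xid, None)

    def promote_to_master(self):
        # After a master crash, turn every saved event into a prepared
        # transaction that the application can commit or roll back.
        self.prepared.update(self.saved)
        self.saved.clear()
        return sorted(self.prepared)


slave = SlaveSketch()
slave.rows = {(1, 1): 0, (1, 2): 0}
slave.on_xa_prepare("t1", {(1, 1): 1})
slave.on_xa_prepare("t2", {(1, 2): 1})
rows_after_prepare = dict(slave.rows)  # still the original values

slave.on_xa_commit("t1")               # t1 is applied at commit time
recovered = slave.promote_to_master()  # t2 is recovered as prepared
slave.on_xa_rollback("t2")             # the application decides its outcome
```

The point of the sketch is the ordering: nothing reaches `rows` before XA COMMIT, and the saved events are still enough to rebuild the prepared state on promotion.
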
--source include/have_innodb.inc
--source include/have_binlog_format_row.inc
--source include/master-slave.inc

--connection master

CREATE TABLE t1 (a int, b int, c int,
  INDEX i1(a),
  INDEX i2(b))
  ENGINE=InnoDB;

INSERT INTO t1 VALUES
  (1,1,0), (1,2,0),
  (2,1,0), (2,2,0);
--sync_slave_with_master

# Use a short lock wait timeout and few retries on the slave.
--source include/stop_slave.inc
SET @old_timeout= @@GLOBAL.innodb_lock_wait_timeout;
SET @old_retries= @@GLOBAL.slave_transaction_retries;
SET GLOBAL innodb_lock_wait_timeout= 2;
SET GLOBAL slave_transaction_retries= 3;
--source include/start_slave.inc

# Prepare two XA transactions on separate master connections,
# each updating a different row, before committing either of them.
--connection master
XA START "t1";
UPDATE t1 FORCE INDEX (i2) SET c=c+1 WHERE a=1 AND b=1;
XA END "t1";
XA PREPARE "t1";

--connection master1
XA START "t2";
UPDATE t1 FORCE INDEX (i2) SET c=c+1 WHERE a=1 AND b=2;
XA END "t2";
XA PREPARE "t2";

--connection master
XA COMMIT "t1";

--connection master1
XA COMMIT "t2";

--connection master
SELECT * FROM t1 ORDER BY a,b,c;

--sync_slave_with_master
SELECT * FROM t1 ORDER BY a,b,c;

# Cleanup
--connection master
DROP TABLE t1;

--connection slave
SET GLOBAL innodb_lock_wait_timeout= @old_timeout;
SET GLOBAL slave_transaction_retries= @old_retries;

--source include/rpl_end.inc
Issue Links
- blocks
  - MDEV-32014 When binlogging enabled, committing a large transaction will freeze all other transactions until completed (Open)
- relates to
  - MDEV-742 LP:803649 - Xa recovery failed on client disconnection (Closed)