[MDEV-31998] XA PREPARE should do binlog_prepare last Created: 2023-08-24  Updated: 2023-08-25

Status: Open
Project: MariaDB Server
Component/s: XA
Affects Version/s: 10.11.4
Fix Version/s: 10.11

Type: Bug Priority: Major
Reporter: David Zhao Assignee: Andrei Elkin
Resolution: Unresolved Votes: 0
Labels: None
Environment:

linux; generic


Issue Links:
Relates
relates to MDEV-21469 Implement crash-safe logging of the u... Stalled

 Description   

During XA PREPARE execution, each storage engine involved in the XA transaction is prepared. The order in which they are prepared is crucial, and the current order is wrong: binlog_prepare() is called before any other storage engine, but it should be called after all other storage engines (InnoDB, RocksDB, etc.) have been prepared.

The reason is that:

Suppose there is a replication group in which the master node replicates its binlog to a slave node. If the master goes down after the binlog events of such an XA transaction have been written to the master's binlog file and transmitted to the slave, but before InnoDB has completed innobase_xa_prepare(), then at recovery InnoDB will abort the XA transaction. The slave, however, has already replayed the transaction's binlog events, so the master and the slave end up inconsistent.
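The scenario above can be illustrated with a small toy simulation. This is only a sketch of the reporter's argument, not MariaDB code: the `Node` class, the step names "binlog" and "engine", and the assumption that the slave applies an event as soon as the master writes it are all hypothetical simplifications. It also does not model the MDEV-21469 recovery, in which the master would re-apply the event from its binlog.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # XIDs the storage engine has durably prepared (hypothetical model).
    engine_prepared: set = field(default_factory=set)
    # XIDs whose XA PREPARE event has been written to the binlog.
    binlog: list = field(default_factory=list)


def xa_prepare(master, slave, xid, order, crash_before=None):
    """Run the prepare steps in `order`; the master crashes just before
    the step named by `crash_before`. Returns True if all steps ran."""
    for step in order:
        if step == crash_before:
            return False  # master goes down here
        if step == "binlog":
            master.binlog.append(xid)
            # Assumption: the event is shipped and replayed on the slave
            # as soon as it is written on the master.
            slave.binlog.append(xid)
            slave.engine_prepared.add(xid)
        elif step == "engine":
            master.engine_prepared.add(xid)
    return True


def simulate(order, crash_before):
    """Return True if, after the crash, the slave holds a prepared XA
    transaction that the master's engine never prepared."""
    master, slave = Node(), Node()
    xa_prepare(master, slave, "xid1", order, crash_before)
    # Recovery as described in the report: an engine with no prepared
    # record for the XID aborts the transaction, so nothing is added.
    return "xid1" in slave.engine_prepared and "xid1" not in master.engine_prepared


# Current order: binlog first, crash before the engine prepares.
current = simulate(["binlog", "engine"], crash_before="engine")
# Proposed order: engine first, crash before the binlog is written.
proposed = simulate(["engine", "binlog"], crash_before="binlog")
print(current, proposed)  # True False
```

In the proposed order a crash between the two steps leaves the master with a prepared transaction that the slave has not seen yet, which this sketch does not count as divergence because the transaction is still undecided on the master.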



 Comments   
Comment by David Zhao [ 2023-08-24 ]

Also note that in XA COMMIT and XA ROLLBACK execution, the binlog should be written before committing or aborting in any other storage engine, so the current implementation is correct for those statements.

Comment by Andrei Elkin [ 2023-08-24 ]

julien.fritsch, yes it relates. But we don't have to go along that design. I am replying to the reporter.

Comment by Andrei Elkin [ 2023-08-24 ]

DZW, thanks for your analysis! Actually, the reason the binlogging of XA PREPARE is done first is the MDEV-21469 recovery, which is
being completed. In your scenario the crashed master would re-apply the prepared XA's replication events at its recovery.

Generated at Thu Feb 08 10:28:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.