Details
- New Feature
- Status: In Progress
- Major
- Resolution: Unresolved
- None
Description
This ticket covers binlog-based crash recovery of XA transactions, building on and complementing the MDEV-32830/MDEV-31949 patch.
MDEV-32830 refines XA PREPARE binlogging so that the XA engine branches are prepared first.
The recovery decision follows the flow of the normal transaction case in points N1 and N2:
- N1. when, at server recovery, an xid exists in both the engine(s) and the binlog, and the binlog has recorded an XA completion operation for it, the XA transaction is completed;
- N2. when an xid exists only in the persistent memory of the engine(s), the XA transaction is rolled back.
An XA-specific rule is added:
- X3. when both the engine(s) and the binlog contain the xid in the prepared state, nothing is done to the transaction; it remains prepared.
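The three rules above can be sketched as a small decision function. This is only an illustration, not the server's code: the `Action` enum, the `decide` function, and the string states `"prepared"` and `"completed"` are all hypothetical names, and the function assumes the xid was found in the prepared state in at least one engine.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    COMPLETE = "complete"            # N1: apply the completion recorded in the binlog
    ROLLBACK = "rollback"            # N2: the xid never reached the binlog
    KEEP_PREPARED = "keep prepared"  # X3: leave the transaction untouched


def decide(binlog_state: Optional[str]) -> Action:
    """Decide what to do with an xid that an engine reports as prepared.

    binlog_state is a hypothetical summary of what the binlog knows:
      None        -> the xid is not known to the binlog at all
      "prepared"  -> the binlog recorded the prepare but no completion
      "completed" -> the binlog recorded an XA completion operation
    """
    if binlog_state == "completed":
        return Action.COMPLETE
    if binlog_state is None:
        return Action.ROLLBACK
    if binlog_state == "prepared":
        return Action.KEEP_PREPARED
    raise ValueError(f"unknown binlog state: {binlog_state!r}")
```

The sketch makes the asymmetry explicit: only an xid that is entirely absent from the binlog is rolled back, which is why the binlog side has to know about every prepared xid.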
To resolve the dilemma of whether such an "orphan" (engine-only) xid did indeed miss binlogging just before the crash, or was prepared some time ago (maybe in a previous server incarnation), a Xid_log_list_event is introduced. It contains the xids of the prepared-and-binlogged user XA transactions at the time of binlog rotation (including a rotation caused by RESET MASTER).
The collection of binlogged-prepared xids into Xid_log_list_event must guarantee that the event contains all such xids, and that if any of them gets committed or rolled back, the corresponding binlog event is logged after the Xid_log_list_event.
The new event therefore helps to maintain rule X3 when the binlog files that would otherwise contain the XA_prepare_log_event of that xid have been purged.
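A minimal sketch of how a recovery scan could use such a list event follows. It is an assumption-laden model, not the actual event format or scan logic: binlog events are represented as hypothetical `(kind, payload)` tuples, and the kind names `"xid_list"`, `"xa_prepare"`, `"xa_commit"` and `"xa_rollback"` are invented for illustration.

```python
from typing import Dict, Iterable, List, Tuple, Union

# Hypothetical event model: ("xid_list", [xids]) or (kind, xid) for the others.
Event = Tuple[str, Union[str, List[str]]]


def scan_binlog(events: Iterable[Event]) -> Dict[str, str]:
    """Return the binlog's view of each xid: "prepared" or "completed".

    Events are processed in binlog order. The list event seeds the set of
    prepared xids, so prepares recorded in purged files are still known.
    """
    state: Dict[str, str] = {}
    for kind, payload in events:
        if kind == "xid_list":
            # All xids that were prepared and binlogged at rotation time.
            for xid in payload:
                state[xid] = "prepared"
        elif kind == "xa_prepare":
            state[payload] = "prepared"
        elif kind in ("xa_commit", "xa_rollback"):
            # A completion is always logged after the list event.
            state[payload] = "completed"
    return state


# Example: "a" was prepared in an older, purged file; "b" in the current one.
events = [
    ("xid_list", ["a"]),
    ("xa_prepare", "b"),
    ("xa_commit", "a"),
]
binlog_view = scan_binlog(events)
```

In this model an engine-only xid such as `"c"` is simply absent from `binlog_view`, so it can be treated as having missed binlogging, while `"b"` stays prepared even if no completion follows.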
This algorithm must comply with the MDEV-21117 semisync slave recovery option.
Issue Links
- relates to MDEV-31949 slow parallel replication of user xa (Stalled)