[MDEV-21777] Implement crash-safe execution of the user XA on binlog-less slave Created: 2020-02-19  Updated: 2023-10-18

Status: Open
Project: MariaDB Server
Component/s: None
Fix Version/s: None

Type: Epic Priority: Major
Reporter: Andrei Elkin Assignee: Sergei Golubchik
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Text File MDEV-21777.xa_slave_recovery.txt    
Issue Links:
Issue split
split from MDEV-21469 Implement crash-safe logging of the u... Stalled
Relates
relates to MDEV-742 LP:803649 - Xa recovery failed on cli... Closed
relates to MDEV-31038 Parallel Replication Breaks if XA PRE... Closed

 Description   

This task follows from MDEV-742 and implements the 2nd part of the upstream
Bug#76233; see MDEV-21469 for the 1st part.

Executing XA COMMIT, XA ROLLBACK and XA PREPARE on the slave involves a mysql.gtid_slave_pos update to record the slave execution status.
Unlike with a regular transaction, that update can't be executed within the replicated XA.

Regardless of whether the binlog is ON or OFF, the update has to be processed as a separate transaction that is two-phase-committed together with the replicated one.
Specifically, in the XA PREPARE case the mysql.gtid_slave_pos transaction is given a special xid that 'matches' the xid of the user one. It prepares "in parallel" with the replicated XA and commits once the latter's prepare succeeds.
If a crash happens in between, recovery searches for all prepared user trx:s and tries to match
their xid:s. A mysql.gtid_slave_pos trx that matches a user XA allows the user XA prepare to be accounted as successful, and the mysql.gtid_slave_pos trx is then committed.

A similar method is applied to XA COMMIT and XA ROLLBACK.
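
The recovery-time matching described above for the XA PREPARE case can be sketched roughly as follows. This is only an illustration under stated assumptions, not the server's implementation: the description does not say how the special xid 'matches' the user xid, so the `gtid_pos:` prefix, the function name `plan_recovery`, and the string representation of xids are all hypothetical. Rolling back a mysql.gtid_slave_pos trx that has no matching prepared user XA is also an assumption; the description only covers the matching case.

```python
# Hypothetical marker: the real matching rule between the two xids is not
# specified in this ticket; a fixed prefix is assumed for illustration.
GTID_POS_PREFIX = "gtid_pos:"


def plan_recovery(prepared_xids):
    """Decide the fate of each prepared mysql.gtid_slave_pos trx.

    prepared_xids: xids (as strings) of all trx:s found in the prepared
    state after a crash. Returns {gtid_slave_pos xid: "commit" | "rollback"}.
    Prepared user XA trx:s are not part of the plan; they stay prepared.
    """
    prepared = set(prepared_xids)
    # Everything without the assumed prefix is treated as a user XA.
    user_xids = {x for x in prepared if not x.startswith(GTID_POS_PREFIX)}

    plan = {}
    for xid in sorted(prepared):
        if not xid.startswith(GTID_POS_PREFIX):
            continue
        user_xid = xid[len(GTID_POS_PREFIX):]
        if user_xid in user_xids:
            # Matching user XA is prepared: count its prepare as successful
            # and commit the position update.
            plan[xid] = "commit"
        else:
            # Assumption: without a prepared counterpart the position
            # update must not become visible.
            plan[xid] = "rollback"
    return plan


if __name__ == "__main__":
    print(plan_recovery(["trx1", "gtid_pos:trx1", "gtid_pos:trx2"]))
```

In this sketch the user XA itself is left untouched by recovery, since a prepared XA normally waits for an explicit XA COMMIT or XA ROLLBACK.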



 Comments   
Comment by Andrei Elkin [ 2020-02-24 ]

Updated the design sketch after a deeper source code analysis.

Generated at Thu Feb 08 09:09:42 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.