[MDEV-30794] port --rollback-xa of MDEV-21168 fixes to 10.5+ Created: 2023-03-06  Updated: 2023-05-30  Resolved: 2023-05-30

Status: Closed
Project: MariaDB Server
Component/s: Backup
Affects Version/s: 10.5, 10.6, 10.8, 10.9, 10.10, 10.11
Fix Version/s: N/A

Type: Bug
Priority: Critical
Reporter: Andrei Elkin
Assignee: Valerii Kravchuk
Resolution: Incomplete
Votes: 1
Labels: None

Issue Links:
Issue split
split from MDEV-21168 Active XA transactions stop slave fro... Closed
Relates

 Description   

The --rollback-xa option of MDEV-21168 may be needed when
the backup image is to be restored on a general, non-slave server. Such a server will not communicate with
the backup donor, so it cannot automatically find out the commit-or-rollback decisions for prepared user XA transactions.
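
For illustration only (this is not part of MDEV-21168): on a restored non-slave server, an operator who decides to roll back every prepared user XA transaction could do so by hand with XA RECOVER and XA ROLLBACK. The minimal Python sketch below turns rows shaped like XA RECOVER output (formatID, gtrid_length, bqual_length, data) into XA ROLLBACK statements. The names XaRecoverRow and xa_rollback_statements are hypothetical, and the sketch assumes that data holds the raw gtrid bytes immediately followed by the bqual bytes.

```python
from typing import Iterable, List, NamedTuple


class XaRecoverRow(NamedTuple):
    """One row shaped like XA RECOVER output (hypothetical helper type)."""
    format_id: int
    gtrid_length: int
    bqual_length: int
    data: bytes  # assumption: gtrid bytes immediately followed by bqual bytes


def xa_rollback_statements(rows: Iterable[XaRecoverRow]) -> List[str]:
    """Build one XA ROLLBACK statement per prepared XA transaction."""
    statements = []
    for row in rows:
        # Split the concatenated xid data into its two parts.
        gtrid = row.data[:row.gtrid_length]
        bqual = row.data[row.gtrid_length:row.gtrid_length + row.bqual_length]
        # Hex literals avoid quoting problems with arbitrary xid bytes.
        bqual_sql = f"X'{bqual.hex()}'" if bqual else "''"
        statements.append(
            f"XA ROLLBACK X'{gtrid.hex()}', {bqual_sql}, {row.format_id}"
        )
    return statements
```

The generated statements still have to be reviewed and executed by someone who knows that rollback is the right decision for each transaction; the sketch does not make that decision.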



 Comments   
Comment by Vladislav Lesin [ 2023-03-15 ]

As I remember from our long discussions during the MDEV-742 and MDEV-21168 implementation, the overall logic is the following. If the restored backup is used to start a slave, then the MDEV-742 fix should solve the issue for 10.5+, because the binlog received from the master should contain the information about what to do with prepared XAs. If we restore the backup for a purpose other than starting a slave, then we use --tc-heuristic-recover. This follows the logic that we don't roll back transactions during "mariabackup --prepare" execution. Maybe we should mention this somewhere in "mariabackup --help" to avoid confusing users. So I think there is no need to implement MDEV-30794.

Comment by Andrei Elkin [ 2023-03-15 ]

valerii, after more elaboration on slack, it looks like this ticket is not about the customer's trouble. I suggest we review their backup procedure and maybe analyze the backup image itself to explain the possible non-user-XA prepared trx in there.

Comment by Vladislav Lesin [ 2023-04-21 ]

As we discussed in the slack thread I referred to above, non-explicit (non-user, "normal", whatever; i.e., the XAs that were not started explicitly with an 'XA START' statement) XA prepare and commit must be protected with the MDL_BACKUP_COMMIT lock in ha_commit_trans(). This means there must be no prepared and non-committed XAs of this kind in the InnoDB redo log. If we see such XAs there, something went wrong during the backup process, and we need to find out what exactly went wrong. It's expected that after we find and fix it, there will be no prepared and uncommitted non-explicit XAs in the InnoDB redo log for, at least, 10.6+, and --rollback-xa will not be needed.

Currently, if such a non-explicit XA prevents the server from starting normally, there is the server option --tc-heuristic-recover=rollback, which rolls back prepared XAs on server start. Our customers could use this option. To make it more automation-friendly, we could parse out such XAs from the InnoDB redo log during --prepare (and/or even during --backup) and notify mariabackup users with special messages in the backup log.
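
Until such messages exist, a rough approximation of the notification idea is to scan existing log output for prepared transactions. The Python sketch below is an illustration under stated assumptions, not an existing mariabackup feature: it assumes the log contains lines of the form "Transaction N was in the XA prepared state", and the name prepared_trx_ids is hypothetical.

```python
import re
from typing import List

# Assumed message shape; adjust it to what your server or --prepare log prints.
PREPARED_TRX = re.compile(r"Transaction (\d+) was in the XA prepared state")


def prepared_trx_ids(log_text: str) -> List[int]:
    """Return the ids of transactions the log reports as XA prepared."""
    return [int(match.group(1)) for match in PREPARED_TRX.finditer(log_text)]
```

A non-empty result would be the cue for an operator or an automation script to consider the --tc-heuristic-recover=rollback workaround.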

The question to the support team is the following. pandi.gurusamy, could you please clarify whether our customers insist on implementing this issue, or whether a workaround with --tc-heuristic-recover=rollback is fine for them?

Generated at Thu Feb 08 10:18:56 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.