[MDEV-31038] Parallel Replication Breaks if XA PREPARE Fails Updating Slave GTID State Created: 2023-04-11 Updated: 2024-01-17 Resolved: 2023-06-12 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Replication |
| Affects Version/s: | 10.5, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0, 11.1 |
| Fix Version/s: | 11.1.1, 10.11.3, 11.0.2, 10.5.20, 10.6.13, 10.8.8, 10.9.6, 10.10.4 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Brandon Nesterenko | Assignee: | Vladislav Lesin |
| Resolution: | Fixed | Votes: | 2 |
| Labels: | None | ||
| Attachments: | |
| Issue Links: | |
| Description |
|
Similar to a related issue: this is because the XA PREPARE had already committed successfully in the binlog and storage engines, and only the slave GTID state update failed. Then on retry, the transaction has already been seen, so we error out. Filed as a separate ticket rather than extending the related one. |
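The failure mode in the description can be sketched as a toy model (plain Python, not server code; all names here are hypothetical stand-ins for the binlog, the engine's prepared-transaction set, and mysql.gtid_slave_pos):

```python
# Toy model of the bug: an XA PREPARE is recorded in the binlog and
# prepared in the engine, but the slave GTID state update is a
# separate, non-atomic step that can still fail. All names hypothetical.

class DuplicateXidError(Exception):
    """Stands in for the 'transaction already seen' replication error."""

binlog = []              # toy binary log
engine_prepared = set()  # toy set of XA-prepared transactions
gtid_state = set()       # toy mysql.gtid_slave_pos

def apply_xa_prepare(gtid, xid, gtid_update_fails=False):
    if xid in engine_prepared:
        # On retry the event is already binlogged and prepared, so the
        # applier refuses it -- the error this ticket is about.
        raise DuplicateXidError(xid)
    binlog.append(("XA PREPARE", xid, gtid))
    engine_prepared.add(xid)
    # The GTID state update happens after the prepare and can fail
    # independently (e.g. deadlock or lock wait timeout).
    if gtid_update_fails:
        raise RuntimeError("simulated gtid_slave_pos update failure")
    gtid_state.add(gtid)

# First attempt: prepare succeeds, GTID state update fails.
try:
    apply_xa_prepare("0-1-100", "xid1", gtid_update_fails=True)
except RuntimeError:
    pass

# A naive retry now fails, because the prepare itself was not undone.
try:
    apply_xa_prepare("0-1-100", "xid1")
    retry_failed = False
except DuplicateXidError:
    retry_failed = True
assert retry_failed and "0-1-100" not in gtid_state
```

The sketch only models the ordering of the steps; the real fix concerns how the applier handles the retry, not this toy control flow.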
| Comments |
| Comment by Brandon Nesterenko [ 2023-04-19 ] |
|
Hi Andrei, this is ready for review as PR-2610. |
| Comment by Andrei Elkin [ 2023-04-20 ] |
|
A review round suggested a couple of improvements. |
| Comment by Brandon Nesterenko [ 2023-04-25 ] |
|
Pushed into 10.5 as commit 31f09e3. |
| Comment by Vladislav Lesin [ 2023-04-25 ] |
|
Guys, I think the fix does not actually fix the root cause. From the user's perspective it just replaces one error with another. The initial problem can be approached from two sides:

1. Replication side. "XA PREPARE" binlogging and inserting the record into gtid_slave_pos are not atomic. If "XA PREPARE" is binlogged and the insert into gtid_slave_pos fails for some reason, the binlog and gtid_slave_pos become inconsistent. It would be great if the binlog and gtid_slave_pos updates were part of the same transaction, which could then be committed or rolled back as a whole.

2. InnoDB side. If the bug is caused by a transaction retry, and there was an error during the gtid_slave_pos insert, the error can be either a deadlock or a timeout (see the code around the slave_transaction_retry_errors array and the is_parallel_retry_error() call in Xid_apply_log_event::do_record_gtid() for details). Both errors can be caused by InnoDB locks. When a record is inserted into gtid_slave_pos, a persistent cursor is opened with the PAGE_CUR_LE search mode, see row_ins_clust_index_entry_low() for details. That is, it tries to find the record less than or equal to the inserted gtid. During the search, btr_pcur_open_low() fills the cursor->up_match and cursor->low_match fields. Here are the source comments about those fields:
gtid_slave_pos has its primary key on (`domain_id`, `sub_id`),
i.e. if we have the following records in gtid_slave_pos: (0, 1, ...), (0, 4, ...), and want to insert (0, 3, ...), both cursor->low_match and cursor->up_match will be equal to 1, as both (0, 1, ...) and (0, 4, ...) have the same first field as (0, 3, ...). If gtid_slave_pos had also contained a (0, 3, ...) record, cursor->low_match would be equal to 1 and cursor->up_match would be equal to 2, which would make the (cursor->up_match >= n_uniq || cursor->low_match >= n_uniq) condition true in row_ins_clust_index_entry_low(), and that would cause a non-gap shared lock to be acquired on (0, 3, ...) to compare all fields (see row_ins_duplicate_error_in_clust()) and to check the 'deleted' flag.

Non-gap locks can be converted to gap locks only on record deletion, or on page splitting/merging or page discarding; see lock_rec_inherit_to_gap() for details. So if (0, 3, ...) is delete-marked and purged (or the page is split/merged or discarded) after some thread has checked it for duplicates and committed its mini-transaction, but has not yet released its locks, an insert of (0, 2, ...) could be blocked. In other words, a non-gap S-lock on a gtid_slave_pos record can happen only if the table contains a duplicate (even a delete-marked one), i.e. a record with the same (`domain_id`, `sub_id`) pair as the inserted record. If I understand correctly (Elkin, correct me if I am wrong), that is impossible, as (`domain_id`, `sub_id`) is supposed to be unique.

When no duplicate (even a delete-marked one) is present, only insert-intention locks are acquired, and they do not conflict with each other. Insert-intention locks are not converted (see lock_rec_inherit_to_gap(), lock_rec_inherit_to_gap_if_gap_lock()). Insert-intention locks conflict only with gap locks; but for a gap lock to exist here, there must be duplicates in gtid_slave_pos, which is supposed to be impossible.

I would add printing of the transaction lock information (i.e. what locks the transaction holds, on what lock it is deadlocked or timed out, and the locks of the conflicting transaction) to understand whether the insert failure was caused by locks, and by which locks exactly. Maybe it also makes sense to print the statement of the conflicting transaction, as, theoretically, the locks can be caused not only by slave workers but also by some third-party queries. |
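The duplicate-check condition discussed above can be illustrated with a toy sketch (plain Python, not InnoDB code; the helper names are hypothetical). It models a PAGE_CUR_LE-positioned cursor over (`domain_id`, `sub_id`) tuples and shows when the matched-field counts reach n_uniq and force the duplicate check:

```python
# Toy model of cursor->low_match / cursor->up_match for an insert into
# an index whose unique key is (domain_id, sub_id), i.e. n_uniq = 2.
# All helper names are hypothetical; this is not InnoDB code.

def matched_fields(a, b):
    """Number of equal leading fields of two index tuples."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def le_neighbors(records, tup):
    """PAGE_CUR_LE-style positioning: the last record <= tup,
    and its successor (toy stand-ins for the cursor neighbors)."""
    low = max((r for r in records if r <= tup), default=None)
    up = min((r for r in records if r > tup), default=None)
    return low, up

N_UNIQ = 2  # (domain_id, sub_id)

def duplicate_check_needed(records, tup):
    low, up = le_neighbors(records, tup)
    low_match = matched_fields(low, tup) if low else 0
    up_match = matched_fields(up, tup) if up else 0
    # The condition from row_ins_clust_index_entry_low() that triggers
    # the non-gap S-lock duplicate check:
    return up_match >= N_UNIQ or low_match >= N_UNIQ

# No duplicate: inserting (0,3) between (0,1) and (0,4) matches only
# the first field on either side -> only an insert-intention lock.
assert duplicate_check_needed([(0, 1), (0, 4)], (0, 3)) is False

# With a (possibly delete-marked) (0,3) present, a neighbor matches on
# both unique fields -> the duplicate check and its S-lock kick in.
assert duplicate_check_needed([(0, 1), (0, 3), (0, 4)], (0, 3)) is True
```

The sketch only captures the matched-prefix arithmetic; lock inheritance on purge and page reorganization, which the comment above describes, is outside this toy model.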
| Comment by Vladislav Lesin [ 2023-04-25 ] |
|
Here is my try to cause the "out-of-order" error: rpl_parallel_optimistic.test. I launched it with 30 instances; it ran for about 8 hours and did not catch the bug. |
| Comment by Marko Mäkelä [ 2023-04-28 ] |
|
The rpl_parallel_optimistic.test |
| Comment by Brandon Nesterenko [ 2023-06-12 ] |
|
Closing the issue, as the patch went out in a release. Any further work on the corresponding crash-safety concern should be tracked in MDEV-21777. |
| Comment by Andrei Elkin [ 2023-09-21 ] |
|
Correcting the issue-link relation after a discussion with vlad.lesin. |