Guys, I think this fix does not actually address the root cause. From the user's perspective it just replaces one error with another.
The initial problem can be attacked from two sides:
1. Replication side. Binlogging of "XA PREPARE" and inserting the record into gtid_slave_pos are not atomic. If "XA PREPARE" is binlogged but the insert into gtid_slave_pos fails for some reason, the binlog and gtid_slave_pos become inconsistent. It would be great if the binlog and gtid_slave_pos updates happened in the same transaction, which could then be committed or rolled back as a whole.
2. InnoDB side.
If the bug is caused by a transaction retry after an error during the gtid_slave_pos insert, the error can only be a deadlock or a lock wait timeout (see the code around the slave_transaction_retry_errors array and the is_parallel_retry_error() call in Xid_apply_log_event::do_record_gtid() for details). Both errors can be caused by InnoDB locks.
When a record is inserted into gtid_slave_pos, a persistent cursor is opened with the PAGE_CUR_LE search mode, see row_ins_clust_index_entry_low() for details. I.e. it tries to find the record less than or equal to the inserted gtid. During the search, btr_pcur_open_low() fills the cursor->up_match and cursor->low_match fields. Here are the comments describing these fields:
ulint up_match;  /*!< If the search mode was PAGE_CUR_LE,
                 the number of matched fields to the
                 first user record to the right of
                 the cursor record after
                 btr_cur_search_to_nth_level;
                 ... */
ulint low_match; /*!< if search mode was PAGE_CUR_LE,
                 the number of matched fields to the
                 first user record AT THE CURSOR or
                 to the left of it after
                 btr_cur_search_to_nth_level;
                 ...*/
gtid_slave_pos has the following pk:

*************************** 1. row ***************************
       Table: gtid_slave_pos
Create Table: CREATE TABLE `gtid_slave_pos` (
  `domain_id` int(10) unsigned NOT NULL,
  `sub_id` bigint(20) unsigned NOT NULL,
  `server_id` int(10) unsigned NOT NULL,
  `seq_no` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`domain_id`,`sub_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 PAGE_CHECKSUM=1 TRANSACTIONAL=0 COMMENT='Replication slave GTID position'
I.e. if gtid_slave_pos contains the records (0, 1, ...), (0, 4, ...) and we want to insert (0, 3, ...), both cursor->low_match and cursor->up_match will be equal to 1, as both (0, 1, ...) and (0, 4, ...) share only the first field with (0, 3, ...). If gtid_slave_pos also contained a (0, 3, ...) record, cursor->low_match would be equal to 1 and cursor->up_match to 2, which would make the (cursor->up_match >= n_uniq || cursor->low_match >= n_uniq) condition in row_ins_clust_index_entry_low() true. That in turn causes a non-gap shared lock to be acquired on (0, 3, ...) to compare all fields (see row_ins_duplicate_error_in_clust()) and to check the 'deleted' flag. Non-gap locks are converted to gap locks only on record deletion, or on page split/merge, or on page discard; see lock_rec_inherit_to_gap() for details. So if (0, 3, ...) is delete-marked and purged (or its page is split, merged or discarded) after some thread checked it for duplicates and committed its mini-transaction, but before that thread released its locks, an insert of (0, 2, ...) could be blocked.
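The matched-fields arithmetic above can be sketched as follows. This is a deliberately simplified model, not InnoDB code: keys are plain tuples, and page_cur_le() only mimics how up_match/low_match count equal leading fields around a PAGE_CUR_LE cursor; the real logic lives in btr_cur_search_to_nth_level().

```python
import bisect

# Simplified model of PAGE_CUR_LE positioning on a sorted list of PK tuples.
# Illustrative only; not the actual InnoDB algorithm.

def matched_fields(a, b):
    """Number of equal leading fields of two key tuples."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def page_cur_le(records, key):
    """Position on the last record <= key; return (low_match, up_match)."""
    i = bisect.bisect_right(records, key) - 1   # cursor record index
    low = matched_fields(records[i], key) if i >= 0 else 0
    up = matched_fields(records[i + 1], key) if i + 1 < len(records) else 0
    return low, up

n_uniq = 2  # (domain_id, sub_id)

# No duplicate of (0, 3): both matches stop after domain_id, so the
# duplicate-check path (and its non-gap S lock) is not taken.
print(page_cur_le([(0, 1), (0, 4)], (0, 3)))            # (1, 1)

# A (possibly delete-marked) duplicate (0, 3) is present: one of the two
# matches reaches n_uniq, so the duplicate-check condition becomes true.
low, up = page_cur_le([(0, 1), (0, 3), (0, 4)], (0, 3))
print(low >= n_uniq or up >= n_uniq)                    # True
```

The takeaway is that the duplicate-check branch, and with it the non-gap S lock, is reachable only when a record with the same (`domain_id`,`sub_id`) prefix already exists in the page.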
In other words, a non-gap S-lock on a gtid_slave_pos record can appear only if the table contains a duplicate (even a delete-marked one), i.e. a record with the same (`domain_id`,`sub_id`) pair as the record being inserted. If I understand correctly (Elkin, correct me if I am wrong), that is impossible, as (`domain_id`,`sub_id`) is supposed to be unique.
In the absence of duplicates (even delete-marked ones), only insert-intention locks are acquired, and they don't conflict with each other. Insert-intention locks are not inherited as gap locks (see lock_rec_inherit_to_gap() and lock_rec_inherit_to_gap_if_gap_lock()). Insert-intention locks conflict only with gap locks. But for that to happen, there must be duplicates in gtid_slave_pos, which is supposed to be impossible.
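The lock-compatibility reasoning in the last two paragraphs can be summarized with a toy conflict check. This is a simplification of what InnoDB's lock_rec_has_to_wait() decides; the mode names are mine, and I ignore exclusive locks, waiting queues, and the way gap and record components are packed into one lock word.

```python
# Toy record-lock compatibility check for the modes discussed above.
# Illustrative only; the real rules are in InnoDB's lock_rec_has_to_wait().

GAP_S = "gap S"                      # shared gap lock (from a duplicate check)
REC_S = "non-gap S"                  # shared lock on the record itself
INSERT_INTENTION = "insert intention"

def conflicts(requested, held):
    """Does a requested lock have to wait for a lock held by another trx?"""
    # An insert-intention lock waits only for gap locks held by others.
    if requested == INSERT_INTENTION:
        return held == GAP_S
    # Held insert-intention locks never block later requests by themselves,
    # and shared locks are compatible with shared locks.
    return False

# Concurrent inserts don't block each other via insert-intention locks:
print(conflicts(INSERT_INTENTION, INSERT_INTENTION))  # False
# ...but an insert does block on a gap lock, e.g. one inherited from a
# purged duplicate after a duplicate check:
print(conflicts(INSERT_INTENTION, GAP_S))             # True
# Non-gap S locks taken by duplicate checks don't block each other:
print(conflicts(REC_S, REC_S))                        # False
```

This matches the argument above: with no duplicates there are only insert-intention locks, which are mutually compatible, so a lock wait requires a gap or non-gap lock that only the duplicate-check path creates.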
I would add printing of transaction lock info (i.e. what locks the transaction holds, on which lock it deadlocked or timed out, and the locks of the conflicting transaction) to understand whether the insert failure was caused by locks, and by exactly which locks. Maybe it also makes sense to print the statement of the conflicting transaction, as, theoretically, the locks can be caused not only by slave workers but also by some third-party queries.
Hi Andrei,
This is ready for review as PR-2610.