Details
-
Bug
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
10.1(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5
Description
When a transaction run with a side effect, like creating a temporary table, is chosen
as the deadlock victim it must be binlogged with ROLLBACK as the terminal event.
While this indeed takes place its logging may be done after a transaction that was
waiting for the victim's locks.
When the two such transactions are replicated the first to run on slave would be the victor,
and it may commit to cause various errors at the victim's statements replay.
For example, here is a deadlock in two transactions running updates on
[test]> show create table t2\g
|
+-------+------------------------------------------------------------------------------------------------------------------------------------+
|
| Table | Create Table |
|
+-------+------------------------------------------------------------------------------------------------------------------------------------+
|
| t2 | CREATE TABLE `t2` (
|
`a` int(11) NOT NULL,
|
`b` int(11) DEFAULT NULL,
|
PRIMARY KEY (`a`)
|
) ENGINE=InnoDB
|
--transaction GTID 0-1-2
|
update t2 set b=111 where a=1; |
--transaction GTID 0-1-3
|
update t2 set b=22 where a=2; |
--transaction GTID 0-1-2
|
update t2 set b=111 where a=2; |
--transaction GTID 0-1-3
|
#--error *deadlock* |
--send update t2 set b=22 where a=1;
|
|
# and while the deadlock is being handled the victor can commit |
--transaction GTID 0-1-2
|
COMMIT; |
# => Committed trx is binlogged |
|
--reap GTID 0-1-3
|
# => ROLLBACKed transaction is binlogged |
|
That's how binlog can look:
21:33:05 [test]> show binlog events;
|
+-------------------+------+-------------------+-----------+-------------+-------------------------------------------------------------------+
|
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
|
+-------------------+------+-------------------+-----------+-------------+-------------------------------------------------------------------+
|
| master-bin.000001 | 4 | Format_desc | 1 | 249 | Server ver: 10.1.42-MariaDB-debug, Binlog ver: 4 |
|
| master-bin.000001 | 249 | Gtid_list | 1 | 274 | [] |
|
| master-bin.000001 | 274 | Binlog_checkpoint | 1 | 314 | master-bin.000001 |
|
...
|
| master-bin.000001 | 587 | Gtid | 1 | 625 | BEGIN GTID 0-1-2 |
|
| master-bin.000001 | 625 | Query | 1 | 733 | use `test`; insert into t1 values (10,10),(11,11),(12,12) |
|
| master-bin.000001 | 733 | Query | 1 | 824 | use `test`; update t2 set b=11 where a=1 |
|
| master-bin.000001 | 824 | Query | 1 | 916 | use `test`; update t2 set b=111 where a=1 |
|
| master-bin.000001 | 916 | Query | 1 | 1008 | use `test`; update t2 set b=111 where a=2 |
|
| master-bin.000001 | 1008 | Xid | 1 | 1035 | COMMIT /* xid=148 */ |
|
| master-bin.000001 | 1035 | Gtid | 1 | 1073 | BEGIN GTID 0-1-3 |
|
| master-bin.000001 | 1073 | Query | 1 | 1164 | use `test`; update t2 set b=22 where a=2 |
|
| master-bin.000001 | 1164 | Query | 1 | 1276 | use `test`; create temporary table tt_1 (a int) engine=innodb |
|
| master-bin.000001 | 1276 | Query | 1 | 1347 | ROLLBACK
|
It's clear that the GTID 0-1-3's
| master-bin.000001 | 1073 | Query | 1 | 1164 | use `test`; update t2 set b=22 where a=2
|
will face the duplicate key error on slave.
While the reason of the error is incorrect order of the transactions in binlog, which must be caused by eager release of the victim's locks to the victor transaction (before the victim actually has passed through innobase_rollback()), the replication issue alone can be fixed by marking the rollback transaction and with the following its relaxed execution on the slave to ignore idempotent class of errors, in both binlog formats.
Attachments
Issue Links
- relates to
-
MDEV-20478 interupted transactions in mixed mode with transactional and non-transactional tables break replication
- Stalled