Details
-
Bug
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
11.8
-
None
-
Can result in unexpected behaviour
Description
The problem was seen in an RQG run, with a transaction containing these statements:
XA BEGIN 'xid35';
|
USE dbt3_db;
|
SELECT MAX( l_receiptdate ) FROM lineitem WHERE ( l_shipdate != '1992-09-07' OR l_partkey BETWEEN 55 AND 6 + 209 ) AND ( l_orderkey BETWEEN 860 AND 1203 + 83 OR l_suppkey BETWEEN 0 AND 10 + 4 ) AND ( l_shipdate IN ( '1998-07-07', '1994-09-15' ) OR l_receiptDATE BETWEEN '1998-05-05' AND '1998-03-12' ) AND ( l_linenumber = 0 );
|
...
|
/* WRK-6 QNO 2875 */ PREPARE /* TRANSFORM_SETUP */ stmt_ExecuteAsPS_3346080 FROM ' SELECT alias1.`col_varchar_1024_utf8` AS field1, alias2.`col_date` AS field2 FROM F AS alias1 LEFT JOIN A AS alias2 ON alias1.`col_varchar_10_latin1` = alias2.`col_varchar_10_utf8_key` RIGHT OUTER JOIN A AS alias3 ON alias1.`col_varchar_1024_utf8_key` = alias3.`col_varchar_1024_latin1_key` WHERE alias2.`pk` >= ? AND alias2.pk < ( ? + ? ) OR alias1.`col_int_key` > ? AND alias1.col_int_key < ( ? + ? ) AND alias3.`col_int_key` NOT BETWEEN ? AND (? + ? ) AND alias3.`col_int` > ? AND alias3.col_int <= ( ? + ? ) AND alias2.`col_int_key` IS NULL AND alias3.`col_varchar_1024_utf8` > ? AND alias3.col_varchar_1024_utf8 < ? OR alias3.`col_int` != ? OR alias3.`col_int_key` >= ? AND alias3.col_int_key <= ( ? + ? ) OR alias3.`col_int_key` > ? AND alias3.col_int_key <= ( ? + ? ) AND alias1.`col_varchar_10_utf8_key` < alias2.`col_varchar_10_latin1_key` ORDER BY field1 ASC, field1, field2 ASC, field1, field2 ';
|
...
|
UPDATE IGNORE `dbt3_db`.`lineitem` SET `l_shipinstruct` = DEFAULT ORDER BY `l_receiptDATE` LIMIT 4;
|
The PREPARE stmt_ExecuteAsPS_3346080 query fails with an InnoDB deadlock error:
# 2025-10-24T20:27:56 [3346080] WRK-6: STATUS_RUNTIME_ERROR: 1213 "Deadlock found when trying to get lock; try restarting transaction". Further errors 1213 will be suppressed. [/* WRK-6 QNO 2875 */ PREPARE /* TRANSFORM_SETUP */ stmt_ExecuteAsPS_3346080 FROM ' SELECT alias1.`col_varchar_1024_utf8` AS field1, alias2.`col_date` AS field2 FROM F AS alias1 LEFT JOIN A AS alias2 ON alias1.`col_varchar_10_latin1` = alias2.`col_varchar_10_utf8_key` RIGHT OUTER JOIN A AS alias3 ON alias1.`col_varchar_1024_utf8_key` = alias3.`col_varchar_1024_latin1_key` WHERE alias2.`pk` >= ? AND alias2.pk < ( ? + ? ) OR alias1.`col_int_key` > ? AND alias1.col_int_key < ( ? + ? ) AND alias3.`col_int_key` NOT BETWEEN ? AND (? + ? ) AND alias3.`col_int` > ? AND alias3.col_int <= ( ? + ? ) AND alias2.`col_int_key` IS NULL AND alias3.`col_varchar_1024_utf8` > ? AND alias3.col_varchar_1024_utf8 < ? OR alias3.`col_int` != ? OR alias3.`col_int_key` >= ? AND alias3.col_int_key <= ( ? + ? ) OR alias3.`col_int_key` > ? AND alias3.col_int_key <= ( ? + ? ) AND alias1.`col_varchar_10_utf8_key` < alias2.`col_varchar_10_latin1_key` ORDER BY field1 ASC, field1, field2 ASC, field1, field2 ']
|
And then later the UPDATE IGNORE asserts "!thd->transaction->xid_state.is_explicit_XA() || thd->lex->xa_opt == XA_ONE_PHASE":
#9 0x00007f7801653eb2 in __GI___assert_fail (assertion=0x558fa3adbf40 "!thd->transaction->xid_state.is_explicit_XA() || thd->lex->xa_opt == XA_ONE_PHASE", file=0x558fa3ad8740 "/data/bld/preview-12.3/knielsen_binlog_in_engine_11.8-asan/sql/handler.cc", line=2038, function=0x558fa3adbb40 "int ha_commit_trans(THD*, bool)") at ./assert/assert.c:101
|
#10 0x0000558fa1a8ebb3 in ha_commit_trans (thd=0x62c000320218, all=false) at /data/bld/preview-12.3/knielsen_binlog_in_engine_11.8-asan/sql/handler.cc:2038
|
#11 0x0000558fa161c2bb in trans_commit_stmt (thd=0x62c000320218) at /data/bld/preview-12.3/knielsen_binlog_in_engine_11.8-asan/sql/transaction.cc:496
|
#12 0x0000558fa10ecc95 in mysql_execute_command (thd=0x62c000320218, is_called_from_prepared_stmt=false) at /data/bld/preview-12.3/knielsen_binlog_in_engine_11.8-asan/sql/sql_parse.cc:5975
|
#13 0x0000558fa10f9722 in mysql_parse (thd=0x62c000320218, rawbuf=0x62d002738438 "/* WRK-6 QNO 2886 */ UPDATE IGNORE `dbt3_db`.`lineitem` SET `l_shipinstruct` = DEFAULT ORDER BY `l_receiptDATE` LIMIT 4", length=120, parser_state=0x7f77ce630a30) at /data/bld/preview-12.3/knielsen_binlog_in_engine_11.8-asan/sql/sql_parse.cc:7914
|
#14 0x0000558fa10d043f in dispatch_command (command=COM_QUERY, thd=0x62c000320218, packet=0x7f77cd3bf819 "/* WRK-6 QNO 2886 */ UPDATE IGNORE `dbt3_db`.`lineitem` SET `l_shipinstruct` = DEFAULT ORDER BY `l_receiptDATE` LIMIT 4 ", packet_length=121, blocking=true) at /data/bld/preview-12.3/knielsen_binlog_in_engine_11.8-asan/sql/sql_parse.cc:1903
|
When InnoDB finds a deadlock error it rolls back the transaction. Debugging on the core and datadir of the crashed server shows that the XA transaction gets partially rolled back (thd->transaction->all.ha_list == 0, so ha_commit_trans thinks this is a single-statement transaction), but the XA state is not cleared (thd->transaction->xid_state.is_explicit_XA() is true because thd->transaction->xid_state.xid_cache_element is non-NULL).
The bug is that the XA state is not cleared. This causes the XA transaction to become prepared inside InnoDB (as seen when starting a mariadb server on the crashed datadir of the RQG run), which is very wrong; and it also causes the assertion mentioned in a debug build.
It is not clear how the PREPARE stmt_ExecuteAsPS_3346080 can cause a deadlock error, possibly related to updating statistics tables or something like that.
It seems however very likely that the problem is the incorrect handling of the XA state when the PREPARE stmt_ExecuteAsPS_3346080 causes a deadlock and a transaction rollback as described; this is a corner case that was likely overlooked in the code.