[MDEV-25297] InnoDB: Failing assertion: trx->roll_limit <= trx->undo_no in trx_rollback_start Created: 2021-03-30  Updated: 2021-04-20  Resolved: 2021-04-09

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6
Fix Version/s: 10.6.0

Type: Bug Priority: Blocker
Reporter: Roel Van de Paar Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: not-10.2, not-10.3, not-10.4, not-10.5, regression

Attachments: HTML File error_log     HTML File gdb_full_trace    
Issue Links:
Problem/Incident
causes MDEV-25401 Assertion `!is_set() || (m_status == ... Closed
is caused by MDEV-515 innodb bulk insert Closed
is caused by MDEV-24818 Concurrent use of InnoDB table is imp... Closed
Relates
relates to MDEV-24859 Assertion `static_cast<ins_node_t*>(t... Closed
relates to MDEV-25315 Assertion `ptr' failed in ut_align_do... Closed

 Description   

Note tables are InnoDB (CLI created)

SET sql_mode='';
SET unique_checks=0;
SET foreign_key_checks=0;
CREATE TABLE ti (b INT,c INT,e INT,id INT,KEY (b),KEY (e),PRIMARY KEY(id));
INSERT INTO ti VALUES(0,0,0,0);
ALTER TABLE ti CHANGE COLUMN c c BINARY (1);
XA START 'a';
CREATE TEMPORARY TABLE t(a INT);
INSERT INTO t VALUES(1);
SAVEPOINT a3;
CREATE OR REPLACE TEMPORARY TABLE t (a INT,b INT);
INSERT INTO t VALUES(0,0);
INSERT INTO ti VALUES(0,0,0,0);
ROLLBACK TO SAVEPOINT a3;

Leads to:

10.6.0 356c149603285086d964c8a51107be97b981c15c (Debug)

InnoDB: Failing assertion: trx->roll_limit <= trx->undo_no

10.6.0 356c149603285086d964c8a51107be97b981c15c (Debug)

Core was generated by `/test/MD270321-mariadb-10.6.0-linux-x86_64-dbg/bin/mysqld --no-defaults --core-'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
    at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
[Current thread is 1 (Thread 0x14d5607c4700 (LWP 3602388))]
(gdb) bt
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
#1  0x000055969b0bd343 in my_write_core (sig=sig@entry=6) at /test/10.6_dbg/mysys/stacktrace.c:424
#2  0x000055969a85dd1d in handle_fatal_signal (sig=6) at /test/10.6_dbg/sql/signal_handler.cc:331
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x000014d569c29859 in __GI_abort () at abort.c:79
#6  0x000055969aea265c in ut_dbg_assertion_failed (expr=expr@entry=0x55969b4cff50 "trx->roll_limit <= trx->undo_no", file=file@entry=0x55969b4cfeb8 "/test/10.6_dbg/storage/innobase/trx/trx0roll.cc", line=line@entry=869) at /test/10.6_dbg/storage/innobase/ut/ut0dbg.cc:60
#7  0x000055969ae79acf in trx_rollback_start (roll_limit=1, trx=0x14d560a7e168) at /test/10.6_dbg/storage/innobase/trx/trx0roll.cc:869
#8  trx_rollback_step (thr=thr@entry=0x14d50c024678) at /test/10.6_dbg/storage/innobase/trx/trx0roll.cc:935
#9  0x000055969ad70653 in que_thr_step (thr=0x14d50c024678) at /test/10.6_dbg/storage/innobase/que/que0que.cc:659
#10 que_run_threads_low (thr=0x14d50c024678) at /test/10.6_dbg/storage/innobase/que/que0que.cc:709
#11 que_run_threads (thr=thr@entry=0x14d50c024678) at /test/10.6_dbg/storage/innobase/que/que0que.cc:729
#12 0x000055969ae7d34d in trx_t::rollback_low (this=this@entry=0x14d560a7e168, savept=savept@entry=0x14d50c0080a0) at /test/10.6_dbg/storage/innobase/trx/trx0roll.cc:116
#13 0x000055969ae79ca0 in trx_t::rollback (this=this@entry=0x14d560a7e168, savept=savept@entry=0x14d50c0080a0) at /test/10.6_dbg/storage/innobase/trx/trx0roll.cc:164
#14 0x000055969ae7ba4d in trx_rollback_to_savepoint_for_mysql_low (mysql_binlog_cache_pos=0x14d5607c2cd8, savep=0x14d50c008098, trx=0x14d560a7e168) at /test/10.6_dbg/storage/innobase/trx/trx0roll.cc:419
#15 trx_rollback_to_savepoint_for_mysql (trx=trx@entry=0x14d560a7e168, savepoint_name=savepoint_name@entry=0x14d5607c2d00 "84AJ5M0EG", mysql_binlog_cache_pos=mysql_binlog_cache_pos@entry=0x14d5607c2cd8) at /test/10.6_dbg/storage/innobase/trx/trx0roll.cc:478
#16 0x000055969ac4745e in innobase_rollback_to_savepoint (hton=<optimized out>, thd=0x14d50c000db8, savepoint=0x14d50c019c88) at /test/10.6_dbg/storage/innobase/handler/ha_innodb.cc:4387
#17 0x000055969a863fa0 in ha_rollback_to_savepoint (thd=thd@entry=0x14d50c000db8, sv=sv@entry=0x14d50c019c50) at /test/10.6_dbg/sql/handler.cc:2502
#18 0x000055969a70b78c in trans_rollback_to_savepoint (thd=thd@entry=0x14d50c000db8, name=<optimized out>) at /test/10.6_dbg/sql/transaction.cc:697
#19 0x000055969a5a2b88 in mysql_execute_command (thd=thd@entry=0x14d50c000db8) at /test/10.6_dbg/sql/sql_parse.cc:5670
#20 0x000055969a58a264 in mysql_parse (thd=thd@entry=0x14d50c000db8, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x14d5607c3410) at /test/10.6_dbg/sql/sql_parse.cc:8004
#21 0x000055969a598e6a in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x14d50c000db8, packet=packet@entry=0x14d50c00b359 "ROLLBACK TO SAVEPOINT a3", packet_length=packet_length@entry=24, blocking=blocking@entry=true) at /test/10.6_dbg/sql/sql_class.h:1331
#22 0x000055969a59c245 in do_command (thd=0x14d50c000db8, blocking=blocking@entry=true) at /test/10.6_dbg/sql/sql_parse.cc:1399
#23 0x000055969a6f5452 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x55969cc4c8a8, put_in_cache=put_in_cache@entry=true) at /test/10.6_dbg/sql/sql_connect.cc:1410
#24 0x000055969a6f5a57 in handle_one_connection (arg=arg@entry=0x55969cc4c8a8) at /test/10.6_dbg/sql/sql_connect.cc:1312
#25 0x000055969ab9fea5 in pfs_spawn_thread (arg=0x55969cb70f48) at /test/10.6_dbg/storage/perfschema/pfs.cc:2201
#26 0x000014d56a137609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#27 0x000014d569d26293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Bug confirmed present in:
MariaDB: 10.6.0 (dbg), 10.6.0 (opt)

Bug (or feature/syntax) confirmed not present in:
MariaDB: 10.2.38 (dbg), 10.2.38 (opt), 10.3.29 (dbg), 10.3.29 (opt), 10.4.19 (dbg), 10.4.19 (opt), 10.5.10 (dbg), 10.5.10 (opt)
MySQL: 5.5.62 (dbg), 5.5.62 (opt), 5.6.51 (dbg), 5.6.51 (opt), 5.7.33 (dbg), 5.7.33 (opt), 8.0.23 (dbg), 8.0.23 (opt)



 Comments   
Comment by Roel Van de Paar [ 2021-03-30 ]

Improved testcase

CREATE TABLE t1 (c INT KEY) ENGINE=InnoDB;
SET SESSION foreign_key_checks=0;
SET SESSION unique_checks=0;
XA START 'a';
CREATE TEMPORARY TABLE t2 (c INT) ENGINE=InnoDB;
INSERT INTO t1 VALUES(0);
SAVEPOINT x;
INSERT INTO t2 VALUES(0);
INSERT INTO t1 VALUES(0);
ROLLBACK TO SAVEPOINT x;

Comment by Sergei Golubchik [ 2021-03-30 ]

Roel, why would it need to disable FK and unique checks, if your tables use neither?

Comment by Roel Van de Paar [ 2021-03-31 ]

serg No idea. But it requires both statements. Removing one, or the other, or both, stops the server from crashing with the following result instead:

10.6.0>ROLLBACK TO SAVEPOINT x;
Query OK, 0 rows affected, 1 warning (0.000 sec)
 
10.6.0>SHOW WARNINGS;
+---------+------+---------------------------------------------------------------+
| Level   | Code | Message                                                       |
+---------+------+---------------------------------------------------------------+
| Warning | 1196 | Some non-transactional changed tables couldn't be rolled back |
+---------+------+---------------------------------------------------------------+
1 row in set (0.000 sec)

Comment by Roel Van de Paar [ 2021-03-31 ]

10.5 Outcome is the same as the one given in the last comment.

Comment by Roel Van de Paar [ 2021-03-31 ]

There is something else interesting here. A GDB trace provides this:

10.6.0 356c149603285086d964c8a51107be97b981c15c (Debug)

#6  0x0000558d42efc65c in ut_dbg_assertion_failed (expr=expr@entry=0x558d43529f50 "trx->roll_limit <= trx->undo_no", file=file@entry=0x558d43529eb8 "/test/10.6_dbg/storage/innobase/trx/trx0roll.cc", line=line@entry=869) at /test/10.6_dbg/storage/innobase/ut/ut0dbg.cc:60
#7  0x0000558d42ed3acf in trx_rollback_start (roll_limit=1, trx=0x1519af97f168) at /test/10.6_dbg/storage/innobase/trx/trx0roll.cc:869
#8  trx_rollback_step (thr=thr@entry=0x151968021198) at /test/10.6_dbg/storage/innobase/trx/trx0roll.cc:935
#9  0x0000558d42dca653 in que_thr_step (thr=0x151968021198) at /test/10.6_dbg/storage/innobase/que/que0que.cc:659

Versus the error log (on/for same crashed instance):

10.6.0 356c149603285086d964c8a51107be97b981c15c (Debug)

stdlib/abort.c:81(__GI_abort)[0x1519c453a859]
ut/ut0new.cc:84(ut_new_boot())[0x558d42efc65c]
trx/trx0roll.cc:835(trx_rollback_step(que_thr_t*))[0x558d42ed3acf]
que/que0que.cc:659(que_run_threads(que_thr_t*))[0x558d42dca653]

Why the difference? Perhaps two threads crashing at same time or similar?

Comment by Roel Van de Paar [ 2021-03-31 ]

Given the findings in the last comment, I also added:

gdb_full_trace: a comprehensive trace of all threads
error_log: a copy of the error log

Both for the same crashed instance as the one in the last comment.

Comment by Roel Van de Paar [ 2021-04-01 ]

Furthermore, in new testing I am seeing more issues with FK and unique checks disabled. I suspect Marko knows the cause as he mentioned disabling them in another bug.

Comment by Roel Van de Paar [ 2021-04-01 ]

MDEV-25315 May end up being a duplicate of this one. It has many new stacks, and all testcase disable both unique & FK checks.

Comment by Sergei Golubchik [ 2021-04-01 ]

Roel, ok, it seems they're needed because after MDEV-24818 they enable new 10.6 feature MDEV-515, InnoDB bulk insert

Comment by Roel Van de Paar [ 2021-04-02 ]

serg ack, thanks!

Comment by Marko Mäkelä [ 2021-04-08 ]

Here is a simpler test case. The tables may be temporary or persistent ones; it does not matter.

--source include/have_innodb.inc
CREATE TEMPORARY TABLE t1 (c INT KEY) ENGINE=InnoDB;
CREATE TEMPORARY TABLE t2 (c INT) ENGINE=InnoDB;
SET SESSION foreign_key_checks=0, unique_checks=0;
BEGIN;
INSERT INTO t1 VALUES(0);
SAVEPOINT x;
INSERT INTO t2 VALUES(0);
--error ER_DUP_ENTRY
INSERT INTO t1 VALUES(0);
ROLLBACK TO SAVEPOINT x;
COMMIT;
SELECT * FROM t1;
SELECT * FROM t2;
DROP TEMPORARY TABLE t1,t2;

The problem is that with the MDEV-24818 optimization, on any error we will intentionally roll back to the start of the transaction. We should probably reset all ‘promised’ savepoints to point to the start of the transaction at that point. The assertion fails, because we are attempting to roll back to a ‘future’ savepoint.

Comment by Marko Mäkelä [ 2021-04-09 ]

When the MDEV-24818 multi-statement bulk insert fails, we will roll back to the start of the transaction.

To fix this bug, we will discard any savepoints, so that ROLLBACK TO SAVEPOINT will report an error instead of attempting to roll back to a savepoint that is no longer valid.

Generated at Thu Feb 08 09:36:38 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.