Details
Type: Bug
Status: Open
Priority: Critical
Resolution: Unresolved
Affects Version/s: 11.8
Environment: Ubuntu 24.04.02 LTS running kernel 6.8.0-55-generic
Description
CS 11.8.1 6f1161aa34cbb178b00fc24cbc46e2e0e2af767a (Debug) Build 05/03/2025
Core was generated by `/test/MD050325-mariadb-11.8.1-linux-x86_64-dbg/bin/mariadbd --no-defaults --max'.
Program terminated with signal SIGABRT, Aborted.
#0 0x0000556cd0ecc351 in page_rec_check (rec=0x53853879e71f "") at include/page0page.inl:315
[Current thread is 1 (Thread 0x7c1d22df86c0 (LWP 1498795))]
(gdb) bt
#0 0x0000556cd0ecc351 in page_rec_check (rec=0x53853879e71f "") at include/page0page.inl:315
#1 0x0000556cd0ede1a5 in page_rec_is_supremum (rec=0x53853879e71f "") at include/page0page.inl:167
#2 0x0000556cd100eb95 in page_simple_validate_new (page=0x53853879c000 "") at /test/11.8_dbg/storage/innobase/page/page0page.cc:1875
#3 0x0000556cd0ff6ba7 in page_cur_delete_rec (cursor=0x478154351508, offsets=0x7c1d22df6a00, mtr=0x7c1d22df6e48) at /test/11.8_dbg/storage/innobase/page/page0cur.cc:2566
#4 0x0000556cd11ca517 in btr_cur_optimistic_delete (cursor=0x478154351508, flags=0, mtr=0x7c1d22df6e48) at /test/11.8_dbg/storage/innobase/btr/btr0cur.cc:4444
#5 0x0000556cd1324931 in row_undo_ins_remove_clust_rec (node=0x478154351498) at /test/11.8_dbg/storage/innobase/row/row0uins.cc:195
#6 0x0000556cd132284d in row_undo_ins (node=0x478154351498, thr=0x4781540aa518) at /test/11.8_dbg/storage/innobase/row/row0uins.cc:597
#7 0x0000556cd11072c0 in row_undo (node=0x478154351498, thr=0x4781540aa518) at /test/11.8_dbg/storage/innobase/row/row0undo.cc:401
#8 0x0000556cd1106f9c in row_undo_step (thr=0x4781540aa518) at /test/11.8_dbg/storage/innobase/row/row0undo.cc:442
#9 0x0000556cd1031fed in que_thr_step (thr=0x4781540aa518) at /test/11.8_dbg/storage/innobase/que/que0que.cc:551
#10 0x0000556cd10315f3 in que_run_threads_low (thr=0x4781540aa518) at /test/11.8_dbg/storage/innobase/que/que0que.cc:609
#11 0x0000556cd10313a4 in que_run_threads (thr=0x4781540aa518) at /test/11.8_dbg/storage/innobase/que/que0que.cc:629
#12 0x0000556cd115cf5f in trx_t::rollback_low (this=0x542a5b0a0680, savept=0x0) at /test/11.8_dbg/storage/innobase/trx/trx0roll.cc:121
#13 0x0000556cd115ddd1 in trx_rollback_for_mysql (trx=0x542a5b0a0680) at /test/11.8_dbg/storage/innobase/trx/trx0roll.cc:218
#14 0x0000556cd0ec6d17 in innobase_rollback (thd=0x478154000d58, rollback_trx=true) at /test/11.8_dbg/storage/innobase/handler/ha_innodb.cc:4765
#15 0x0000556cd0ac46f2 in ha_rollback_trans (thd=0x478154000d58, all=true) at /test/11.8_dbg/sql/handler.cc:2344
#16 0x0000556cd09dd88e in xa_trans_force_rollback (thd=0x478154000d58) at /test/11.8_dbg/sql/xa.cc:412
#17 0x0000556cd09df591 in trans_xa_detach (thd=0x478154000d58) at /test/11.8_dbg/sql/xa.cc:898
#18 0x0000556cd061309a in THD::cleanup (this=0x478154000d58) at /test/11.8_dbg/sql/sql_class.cc:1673
#19 0x0000556cd04f937a in unlink_thd (thd=0x478154000d58) at /test/11.8_dbg/sql/mysqld.cc:2865
#20 0x0000556cd088ba65 in do_handle_one_connection (connect=0x500687ce948, put_in_cache=true) at /test/11.8_dbg/sql/sql_connect.cc:1426
#21 0x0000556cd088b79e in handle_one_connection (arg=0x500687b6818) at /test/11.8_dbg/sql/sql_connect.cc:1327
#22 0x00002262041fbaa4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#23 0x0000226204288a34 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100
rr trace:
rr:/data/MDEV-36228/rr$ rr replay ./latest-trace
Roel, I spent quite a bit of time on this, assuming that the SIGABRT was genuine. But I don’t see anything actually wrong in the execution, and it seems to me that an external process invoked killall -ABRT mariadbd or something similar. This was more obvious in MDEV-36231.
Can you please double-check how the test harness works? I would suggest enabling more optimization and also setting cmake -DPLUGIN_PERFSCHEMA=NO -DWITH_DBUG_TRACE=OFF to avoid a potential extreme slowdown due to an excessive number of conditional branches. The disassembly of page_rec_check() suggests to me that not much optimization was enabled.
The current instruction at the time of the SIGABRT is only dereferencing the frame pointer register (rbp), and that memory address is valid. If it weren’t, I would expect a SIGSEGV rather than SIGABRT to be triggered.
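One way to test the "external kill" theory is to look at the siginfo the kernel records for the signal. The following is a minimal sketch in Python, not taken from this report, that assumes Linux si_code values (0 is SI_USER, -6 is SI_TKILL); the helper name describe_sender is hypothetical. It sends SIGABRT to itself with kill(), the way killall would, and classifies the result. If the gdb build and the core file support it, the same fields should be visible in a core or rr replay session through gdb's $_siginfo convenience variable.

```python
import os
import signal


def describe_sender(info):
    """Classify a siginfo record by how the signal was generated.

    Assumes Linux si_code values: 0 is SI_USER (sent with kill(2)),
    -6 is SI_TKILL (tkill/tgkill, which raise() and abort() use),
    and positive codes are generated by the kernel, e.g. on a fault.
    """
    if info.si_code == 0:
        return f"sent by kill() from pid {info.si_pid}"
    if info.si_code == -6:
        return "raised by a thread of the process itself (raise/abort)"
    if info.si_code > 0:
        return "generated by the kernel"
    return f"other user-space source (si_code={info.si_code})"


# Block SIGABRT so it stays pending instead of terminating this process.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGABRT})

# Simulate what `killall -ABRT` does: a process-directed kill().
os.kill(os.getpid(), signal.SIGABRT)

# Collect the pending signal together with its siginfo.
info = signal.sigwaitinfo({signal.SIGABRT})
result = describe_sender(info)
print(result)
```

A SIGABRT that comes from abort() inside the server, for example after a failed debug assertion, would be expected to carry the SI_TKILL code instead, so the two cases can be told apart.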
As you can see from the stack trace, the current thread is rolling back a transaction, apparently after a client disconnect. That transaction had written 131,102 undo log records. By the time the process is forcibly killed by SIGABRT, it has rolled back most of them, but 43,797 are still awaiting rollback.
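For a sense of how far the rollback had progressed, the two counts above can be combined. This is only arithmetic on the numbers quoted in this comment; the variable names are illustrative.

```python
# Undo log record counts quoted in the comment above.
total_undo_records = 131_102
pending_undo_records = 43_797

# Records already rolled back when the SIGABRT arrived.
rolled_back = total_undo_records - pending_undo_records
fraction_done = rolled_back / total_undo_records

print(f"{rolled_back} of {total_undo_records} rolled back ({fraction_done:.1%})")
```

That is 87,305 records, or roughly two thirds of the transaction, already undone at the time of the signal.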