MariaDB Server / MDEV-39477

Assertion: tail.trx_no <= last_trx_no under HammerDB TPROC‑C with binlog=OFF at 100–200 users


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 13.0, 11.8.6, 12.3.1
    • CPU: 13th Gen Intel(R) Core(TM) i5-13500
      CPU Count: 20
      Socket Count: 1
      OS: AlmaLinux 9.7 (Moss Jungle Cat)
      RAM: 62.33 GB

    Description

      Running HammerDB TPROC-C with 100 warehouses (small enough to be cached), the binary log disabled, and virtual users scaling up to 400 triggers the failing assertion: tail.trx_no <= last_trx_no.

      The cores were produced from the following binaries:

      • mariadb-11.8.6-linux-systemd-x86_64
        version_source_revision = 9bfea48ce1214cc4470f6f6f8a4e30352cef84e7
      • mariadb-12.3.1-linux-systemd-x86_64
        version_source_revision = 21a0714a118614982d20bfa504763d7247800091
      • mariadb-13.0.0-linux-systemd-x86_64
        version_source_revision = c5f6fd3e7c8a430f8d27a505bb8d2ae00f6396a6

      Marko Mäkelä referenced the following MDEVs as context for purge subsystem behavior:

      • MDEV-31355 – purge correctness fix
      • MDEV-32050 – purge subsystem architectural changes
      • MDEV-34515 – additional purge subsystem improvements
      • MDEV-36845 – assertion fix (reported as present in 11.8.6, 12.2.2, 12.3.1, 10.11.16, 11.4.10)

      Notes:

      • Datadir is initialized from scratch for every test case run.
      • Nothing in the datadir or environment is reused or modified between iterations.
      • The same assertion occurs in 11.8.6, 12.3.1, and 13.0.0.
      • Earlier runs with binlog ON were clean.
      • The same versions, same environment, same workload crash when:
        binlog = OFF
        user count = 100 or 200
      • Each user count runs 3 iterations; the crash occurs during the 100 vu or 200 vu iterations, so the 400 vu stage is never reached.
      • Therefore this is not inherited corruption; it is runtime behavior.
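      The failing check guards a monotonicity invariant: the purge coordinator is expected to consume committed-transaction undo logs in nondecreasing commit order (trx_no), so the purge tail must never move backwards. A simplified Python model of that expectation (an illustration only, not the InnoDB implementation; names such as PurgeModel are invented):

      ```python
      import heapq

      # Simplified model (not InnoDB source): each rollback segment holds a
      # FIFO history list of committed-transaction undo logs, identified here
      # only by their commit sequence number (trx_no).
      class PurgeModel:
          def __init__(self, rseg_histories):
              # One min-heap entry per rollback segment: (oldest trx_no, rseg id)
              self.histories = {r: list(h) for r, h in rseg_histories.items()}
              self.heap = [(h[0], r) for r, h in self.histories.items() if h]
              heapq.heapify(self.heap)
              self.tail_trx_no = 0  # trx_no of the last log handed to purge

          def choose_next_log(self):
              trx_no, rseg = heapq.heappop(self.heap)
              # The invariant this report trips: logs must arrive in
              # nondecreasing commit order (tail.trx_no <= last_trx_no).
              assert self.tail_trx_no <= trx_no, "tail.trx_no <= last_trx_no violated"
              self.tail_trx_no = trx_no
              h = self.histories[rseg]
              h.pop(0)  # consume the oldest log of this rollback segment
              if h:
                  heapq.heappush(self.heap, (h[0], rseg))
              return trx_no
      ```

      Under this model, draining the heap always yields trx_no values in sorted order; the reported crash means InnoDB's real counterpart of this ordering was violated at runtime.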

      Workload:

      • HammerDB TPROC-C
      • Warehouses: 100 (cached on host)
      • VU/threads: 100 and 200 (crashes occur here)
      • Checksums enabled after setup and after each run
      • JSON and HTML reporting enabled
      • timeprofile disabled
      • vuset delay 0

      Attempts to reduce the workload:

      • Tried many reduced configurations; the assertion reproduced in some of them, but not consistently.
      • The lowest warehouse count that still reproduced the assertion was 40.
      • The smallest InnoDB buffer pool size that still reproduced it was 8 GB.

      (gdb) bt
      #0 0x00007f268ca8d02c in __pthread_kill_implementation () from /lib64/libc.so.6
      #1 0x00007f268ca3fb86 in raise () from /lib64/libc.so.6
      #2 0x00007f268ca29905 in abort () from /lib64/libc.so.6
      #3 0x000055e2ec9f82a8 in ut_dbg_assertion_failed (expr=expr@entry=0x55e2ed98d07f "tail.trx_no <= last_trx_no",
      file=file@entry=0x55e2ed98ce08 "/home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc", line=line@entry=879)
      at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/ut/ut0dbg.cc:60
      #4 0x000055e2ec9f1e22 in purge_sys_t::choose_next_log (this=this@entry=0x55e2eec56c00 <purge_sys>, trx=trx@entry=0x7f268912d180)
      at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc:879
      #5 0x000055e2ed39b5e7 in purge_sys_t::rseg_get_next_history_log (this=this@entry=0x55e2eec56c00 <purge_sys>, trx=trx@entry=0x7f268912d180)
      at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc:841
      #6 0x000055e2ed39cbb7 in purge_sys_t::get_next_rec (roll_ptr=9851624219089486, trx=0x7f268912d180, this=0x55e2eec56c00 <purge_sys>)
      at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc:997
      #7 purge_sys_t::fetch_next_rec (trx=0x7f268912d180, this=0x55e2eec56c00 <purge_sys>) at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc:1037
      #8 trx_purge_attach_undo_recs (n_work_items=<synthetic pointer>, trx=0x7f268912d180) at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc:1233
      #9 trx_purge (trx=trx@entry=0x7f268912d180, n_tasks=n_tasks@entry=4, history_size=<optimized out>) at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc:1364
      #10 0x000055e2ed38c4d8 in purge_coordinator_state::do_purge (trx=0x7f268912d180, this=0x55e2eec55fa0 <purge_state>) at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/srv/srv0srv.cc:1431
      #11 purge_coordinator_callback () at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/srv/srv0srv.cc:1525
      #12 0x000055e2ed4b116c in tpool::task_group::execute (this=0x55e2eec55e00 <purge_coordinator_task_group>, t=0x55e2eec55d60 <purge_coordinator_task>)
      at /home/buildbot/amd64-almalinux-8-bintar/build/tpool/task_group.cc:73
      #13 0x000055e2ed4aef7f in tpool::thread_pool_generic::worker_main (this=0x55e2f08eb7a0, thread_var=0x55e2f0c014a0) at /home/buildbot/amd64-almalinux-8-bintar/build/tpool/tpool_generic.cc:531
      #14 0x00007f268cedbae4 in execute_native_thread_routine () from /lib64/libstdc++.so.6
      #15 0x00007f268ca8b2ea in start_thread () from /lib64/libc.so.6
      #16 0x00007f268cb103d0 in clone3 () from /lib64/libc.so.6
      (gdb) frame 4
      #4 0x000055e2ec9f1e22 in purge_sys_t::choose_next_log (this=this@entry=0x55e2eec56c00 <purge_sys>, trx=trx@entry=0x7f268912d180)
      at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc:879
      warning: 879 /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc: No such file or directory
      (gdb) info locals
      last_trx_no = <optimized out>
      (gdb) info args
      this = 0x55e2eec56c00 <purge_sys>
      trx = 0x7f268912d180

      13.0.0 error log (the 11.8.6 and 12.3.1 logs show the exact same assertion):

      2026-04-16 02:07:10 0x7775877fe640 InnoDB: Assertion failure in file /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc line 879
      InnoDB: Failing assertion: tail.trx_no <= last_trx_no
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mariadbd startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
      InnoDB: about forcing recovery.
      260416 2:07:10 [ERROR] /home/jeb/taf-perl/database_software_installs/mariadb-13.0.0-linux-systemd-x86_64/bin/mariadbd got signal 6 ;
      Sorry, we probably made a mistake, and this is a bug.

      Your assistance in bug reporting will enable us to fix this for the next release.
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs about how to report
      a bug on https://jira.mariadb.org/.

      Please include the information from the server start above, to the end of the
      information below.

      Server version: 13.0.0-MariaDB source revision: c5f6fd3e7c8a430f8d27a505bb8d2ae00f6396a6

      The information page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mariadbd/
      contains instructions to obtain a better version of the backtrace below.
      Following these instructions will help MariaDB developers provide a fix quicker.

      Attempting backtrace. Include this in the bug report.
      (note: Retrieving this information may fail)

      Thread pointer: 0x56480b461c48
      stack_bottom = 0x7775877ff000 thread_stack 0x49000
      mysys/stacktrace.c:216(my_print_stacktrace)[0x56480871de4e]
      sql/signal_handler.cc:230(handle_fatal_signal)[0x5648081515fd]
      /lib64/libc.so.6(+0x3fc30)[0x7f760e83fc30]
      /lib64/libc.so.6(+0x8d02c)[0x7f760e88d02c]
      /lib64/libc.so.6(raise+0x16)[0x7f760e83fb86]
      /lib64/libc.so.6(abort+0xd3)[0x7f760e829873]
      ut/ut0rbt.cc:460(rbt_eject_node(ib_rbt_node_t*, ib_rbt_node_t*) [clone .part.7])[0x564807bf82a8]
      trx/trx0purge.cc:878(purge_sys_t::choose_next_log(trx_t*))[0x564807bf1e22]
      trx/trx0purge.cc:842(purge_sys_t::rseg_get_next_history_log(trx_t*))[0x56480859b5e7]
      trx/trx0purge.cc:1000(purge_sys_t::get_next_rec(trx_t*, unsigned long))[0x56480859cbb7]
      srv/srv0srv.cc:1432(purge_coordinator_state::do_purge(trx_t*))[0x56480858c4d8]
      tpool/task_group.cc:74(tpool::task_group::execute(tpool::task*))[0x5648086b116c]
      tpool/tpool_generic.cc:529(tpool::thread_pool_generic::worker_main(tpool::worker_data*))[0x5648086aef7f]
      /lib64/libstdc++.so.6(+0xdbae4)[0x7f760ecdbae4]
      /lib64/libc.so.6(+0x8b2ea)[0x7f760e88b2ea]
      /lib64/libc.so.6(+0x1103d0)[0x7f760e9103d0]

      Connection ID (thread ID): 0
      Status: NOT_KILLED
      Query (0x0): (null)
      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,duplicateweedout=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off,hash_join_cardinality=on,cset_narrowing=on,sargable_casefold=on,reorder_outer_joins=off

      Writing a core file...
      Working directory at /data/data
      Resource Limits (excludes unlimited resources):
      Limit Soft Limit Hard Limit Units
      Max stack size 8388608 unlimited bytes
      Max processes 254911 254911 processes
      Max open files 65535 65535 files
      Max locked memory 8388608 8388608 bytes
      Max pending signals 254911 254911 signals
      Max msgqueue size 819200 819200 bytes
      Max nice priority 0 0
      Max realtime priority 0 0
      Core pattern: |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %d %F

      Kernel version: Linux version 5.14.0-611.35.1.el9_7.x86_64 (mockbuild@x64-builder02.almalinux.org) (gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-11), GNU ld version 2.35.2-67.el9_7.1) #1 SMP PREEMPT_DYNAMIC Wed Feb 25 03:46:09 EST 2026

      Attached are the exact TAF hammerdb_tprocc user properties file and the MariaDB configuration used to produce the cores.

      TAF lives at: https://github.com/MariaDB/TAF

      Command used to reproduce:
      perl ./taf.pl \
      --prop=./properties/mariadb/mariadb_tprocc.properties \
      --threads=100,200 \
      --iter=6 \
      --db-software-install-dir=/path/to/mariadb-13.0.0-linux-systemd-x86_64
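
      Since the assertion aborts the server mid-iteration, it can help to scan each iteration's error log for InnoDB assertion lines automatically. A minimal helper for that (hypothetical, not part of TAF or HammerDB):

      ```python
      import re

      # Hypothetical helper (not part of TAF): scan mariadbd error-log text
      # for InnoDB failing-assertion lines, e.g. to flag crashed iterations.
      ASSERT_RE = re.compile(r"InnoDB: Failing assertion: (?P<expr>.+)")

      def find_assertions(error_log_text: str) -> list[str]:
          """Return every failing-assertion expression found in the log text."""
          return [m.group("expr").strip() for m in ASSERT_RE.finditer(error_log_text)]
      ```

      Feeding it the log excerpt above returns ["tail.trx_no <= last_trx_no"].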

      Attachments

        1. mariadb_cache.cnf
          1 kB
          Jonathan Jeb Miller
        2. mariadb_tprocc.properties
          5 kB
          Jonathan Jeb Miller

            People

              Assignee: Unassigned
              Reporter: Jonathan Jeb Miller (jeb)
