MariaDB Server / MDEV-39477

Assertion: tail.trx_no <= last_trx_no under HammerDB TPROC‑C with binlog=OFF at 100–200 users


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 13.0, 11.8.6, 12.3.1
    • CPU: 13th Gen Intel(R) Core(TM) i5-13500
      CPU Count: 20
      Socket Count: 1
      OS: AlmaLinux 9.7 (Moss Jungle Cat)
      RAM: 62.33 GB

    Description

      Running HammerDB TPROC-C with 100 warehouses (small enough to be cached), the binary log disabled, and virtual users scaling up to 400 triggers the failing assertion: tail.trx_no <= last_trx_no.

      The cores were produced from the following binaries:

      • mariadb-11.8.6-linux-systemd-x86_64
        version_source_revision = 9bfea48ce1214cc4470f6f6f8a4e30352cef84e7
      • mariadb-12.3.1-linux-systemd-x86_64
        version_source_revision = 21a0714a118614982d20bfa504763d7247800091
      • mariadb-13.0.0-linux-systemd-x86_64
        version_source_revision = c5f6fd3e7c8a430f8d27a505bb8d2ae00f6396a6

      Marko Mäkelä referenced the following MDEVs as context for purge subsystem behavior:

      • MDEV-31355 – purge correctness fix
      • MDEV-32050 – purge subsystem architectural changes
      • MDEV-34515 – additional purge subsystem improvements
      • MDEV-36845 – assertion fix (reported as present in 11.8.6, 12.2.2, 12.3.1, 10.11.16, 11.4.10)

      Notes:

      • Datadir is initialized from scratch for every test case run.
      • Nothing in the datadir or environment is reused or modified between iterations.
      • The same assertion occurs in 11.8.6, 12.3.1, and 13.0.0.
      • Earlier runs with binlog ON were clean.
      • The same versions, same environment, same workload crash when:
        binlog = OFF
        user count = 100 or 200
      • Each user count runs 3 iterations; the crash occurs during the 100 vu or 200 vu iterations, so the 400 vu stage is never reached.
      • Therefore this is not inherited corruption; it is runtime behavior.
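      The failing check guards a monotonicity invariant: the purge coordinator is expected to consume committed-transaction undo logs in nondecreasing commit order (trx_no), so the purge tail must never move backwards. A simplified Python model of that expectation (an illustration only, not the InnoDB implementation; names such as PurgeModel are invented):

      ```python
      import heapq

      # Simplified model (not InnoDB source): each rollback segment holds a
      # FIFO history list of committed-transaction undo logs, identified here
      # only by their commit sequence number (trx_no).
      class PurgeModel:
          def __init__(self, rseg_histories):
              # One min-heap entry per rollback segment: (oldest trx_no, rseg id)
              self.histories = {r: list(h) for r, h in rseg_histories.items()}
              self.heap = [(h[0], r) for r, h in self.histories.items() if h]
              heapq.heapify(self.heap)
              self.tail_trx_no = 0  # trx_no of the last log handed to purge

          def choose_next_log(self):
              trx_no, rseg = heapq.heappop(self.heap)
              # The invariant this report trips: logs must arrive in
              # nondecreasing commit order (tail.trx_no <= last_trx_no).
              assert self.tail_trx_no <= trx_no, "tail.trx_no <= last_trx_no violated"
              self.tail_trx_no = trx_no
              h = self.histories[rseg]
              h.pop(0)  # consume the oldest log of this rollback segment
              if h:
                  heapq.heappush(self.heap, (h[0], rseg))
              return trx_no
      ```

      Under this model, draining the heap always yields trx_no values in sorted order; the reported crash means InnoDB's real counterpart of this ordering was violated at runtime.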

      Workload:

      • HammerDB TPROC-C
      • Warehouses: 100 (cached on host)
      • VU/threads: 100 and 200 (crashes occur here)
      • Checksums enabled after setup and after each run
      • JSON and HTML reporting enabled
      • timeprofile disabled
      • vuset delay 0

      Attempts to reduce the workload:

      • Tried many reduced configurations; the assertion reproduced in some of them, but not consistently.
      • The lowest warehouse count that still reproduced the assertion was 40.
      • The smallest InnoDB buffer pool size that still reproduced it was 8 GB.

      (gdb) bt
      #0 0x00007f268ca8d02c in __pthread_kill_implementation () from /lib64/libc.so.6
      #1 0x00007f268ca3fb86 in raise () from /lib64/libc.so.6
      #2 0x00007f268ca29905 in abort () from /lib64/libc.so.6
      #3 0x000055e2ec9f82a8 in ut_dbg_assertion_failed (expr=expr@entry=0x55e2ed98d07f "tail.trx_no <= last_trx_no",
      file=file@entry=0x55e2ed98ce08 "/home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc", line=line@entry=879)
      at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/ut/ut0dbg.cc:60
      #4 0x000055e2ec9f1e22 in purge_sys_t::choose_next_log (this=this@entry=0x55e2eec56c00 <purge_sys>, trx=trx@entry=0x7f268912d180)
      at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc:879
      #5 0x000055e2ed39b5e7 in purge_sys_t::rseg_get_next_history_log (this=this@entry=0x55e2eec56c00 <purge_sys>, trx=trx@entry=0x7f268912d180)
      at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc:841
      #6 0x000055e2ed39cbb7 in purge_sys_t::get_next_rec (roll_ptr=9851624219089486, trx=0x7f268912d180, this=0x55e2eec56c00 <purge_sys>)
      at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc:997
      #7 purge_sys_t::fetch_next_rec (trx=0x7f268912d180, this=0x55e2eec56c00 <purge_sys>) at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc:1037
      #8 trx_purge_attach_undo_recs (n_work_items=<synthetic pointer>, trx=0x7f268912d180) at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc:1233
      #9 trx_purge (trx=trx@entry=0x7f268912d180, n_tasks=n_tasks@entry=4, history_size=<optimized out>) at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc:1364
      #10 0x000055e2ed38c4d8 in purge_coordinator_state::do_purge (trx=0x7f268912d180, this=0x55e2eec55fa0 <purge_state>) at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/srv/srv0srv.cc:1431
      #11 purge_coordinator_callback () at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/srv/srv0srv.cc:1525
      #12 0x000055e2ed4b116c in tpool::task_group::execute (this=0x55e2eec55e00 <purge_coordinator_task_group>, t=0x55e2eec55d60 <purge_coordinator_task>)
      at /home/buildbot/amd64-almalinux-8-bintar/build/tpool/task_group.cc:73
      #13 0x000055e2ed4aef7f in tpool::thread_pool_generic::worker_main (this=0x55e2f08eb7a0, thread_var=0x55e2f0c014a0) at /home/buildbot/amd64-almalinux-8-bintar/build/tpool/tpool_generic.cc:531
      #14 0x00007f268cedbae4 in execute_native_thread_routine () from /lib64/libstdc++.so.6
      #15 0x00007f268ca8b2ea in start_thread () from /lib64/libc.so.6
      #16 0x00007f268cb103d0 in clone3 () from /lib64/libc.so.6
      (gdb) frame 4
      #4 0x000055e2ec9f1e22 in purge_sys_t::choose_next_log (this=this@entry=0x55e2eec56c00 <purge_sys>, trx=trx@entry=0x7f268912d180)
      at /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc:879
      warning: 879 /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc: No such file or directory
      (gdb) info locals
      last_trx_no = <optimized out>
      (gdb) info args
      this = 0x55e2eec56c00 <purge_sys>
      trx = 0x7f268912d180

      13.0.0 error log (the 11.8.6 and 12.3.1 logs show the exact same assertion):

      2026-04-16 02:07:10 0x7775877fe640 InnoDB: Assertion failure in file /home/buildbot/amd64-almalinux-8-bintar/build/storage/innobase/trx/trx0purge.cc line 879
      InnoDB: Failing assertion: tail.trx_no <= last_trx_no
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mariadbd startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
      InnoDB: about forcing recovery.
      260416 2:07:10 [ERROR] /home/jeb/taf-perl/database_software_installs/mariadb-13.0.0-linux-systemd-x86_64/bin/mariadbd got signal 6 ;
      Sorry, we probably made a mistake, and this is a bug.

      Your assistance in bug reporting will enable us to fix this for the next release.
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs about how to report
      a bug on https://jira.mariadb.org/.

      Please include the information from the server start above, to the end of the
      information below.

      Server version: 13.0.0-MariaDB source revision: c5f6fd3e7c8a430f8d27a505bb8d2ae00f6396a6

      The information page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mariadbd/
      contains instructions to obtain a better version of the backtrace below.
      Following these instructions will help MariaDB developers provide a fix quicker.

      Attempting backtrace. Include this in the bug report.
      (note: Retrieving this information may fail)

      Thread pointer: 0x56480b461c48
      stack_bottom = 0x7775877ff000 thread_stack 0x49000
      mysys/stacktrace.c:216(my_print_stacktrace)[0x56480871de4e]
      sql/signal_handler.cc:230(handle_fatal_signal)[0x5648081515fd]
      /lib64/libc.so.6(+0x3fc30)[0x7f760e83fc30]
      /lib64/libc.so.6(+0x8d02c)[0x7f760e88d02c]
      /lib64/libc.so.6(raise+0x16)[0x7f760e83fb86]
      /lib64/libc.so.6(abort+0xd3)[0x7f760e829873]
      ut/ut0rbt.cc:460(rbt_eject_node(ib_rbt_node_t*, ib_rbt_node_t*) [clone .part.7])[0x564807bf82a8]
      trx/trx0purge.cc:878(purge_sys_t::choose_next_log(trx_t*))[0x564807bf1e22]
      trx/trx0purge.cc:842(purge_sys_t::rseg_get_next_history_log(trx_t*))[0x56480859b5e7]
      trx/trx0purge.cc:1000(purge_sys_t::get_next_rec(trx_t*, unsigned long))[0x56480859cbb7]
      srv/srv0srv.cc:1432(purge_coordinator_state::do_purge(trx_t*))[0x56480858c4d8]
      tpool/task_group.cc:74(tpool::task_group::execute(tpool::task*))[0x5648086b116c]
      tpool/tpool_generic.cc:529(tpool::thread_pool_generic::worker_main(tpool::worker_data*))[0x5648086aef7f]
      /lib64/libstdc++.so.6(+0xdbae4)[0x7f760ecdbae4]
      /lib64/libc.so.6(+0x8b2ea)[0x7f760e88b2ea]
      /lib64/libc.so.6(+0x1103d0)[0x7f760e9103d0]

      Connection ID (thread ID): 0
      Status: NOT_KILLED
      Query (0x0): (null)
      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,duplicateweedout=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off,hash_join_cardinality=on,cset_narrowing=on,sargable_casefold=on,reorder_outer_joins=off

      Writing a core file...
      Working directory at /data/data
      Resource Limits (excludes unlimited resources):
      Limit Soft Limit Hard Limit Units
      Max stack size 8388608 unlimited bytes
      Max processes 254911 254911 processes
      Max open files 65535 65535 files
      Max locked memory 8388608 8388608 bytes
      Max pending signals 254911 254911 signals
      Max msgqueue size 819200 819200 bytes
      Max nice priority 0 0
      Max realtime priority 0 0
      Core pattern: |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %d %F

      Kernel version: Linux version 5.14.0-611.35.1.el9_7.x86_64 (mockbuild@x64-builder02.almalinux.org) (gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-11), GNU ld version 2.35.2-67.el9_7.1) #1 SMP PREEMPT_DYNAMIC Wed Feb 25 03:46:09 EST 2026

      Attached are the exact TAF hammerdb_tprocc user properties file and the MariaDB configuration used to produce the cores.

      TAF lives at: https://github.com/MariaDB/TAF

      Command used to reproduce:
      perl ./taf.pl \
      --prop=./properties/mariadb/mariadb_tprocc.properties \
      --threads=100,200 \
      --iter=6 \
      --db-software-install-dir=/path/to/mariadb-13.0.0-linux-systemd-x86_64
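
      Since the assertion aborts the server mid-iteration, it can help to scan each iteration's error log for InnoDB assertion lines automatically. A minimal helper for that (hypothetical, not part of TAF or HammerDB):

      ```python
      import re

      # Hypothetical helper (not part of TAF): scan mariadbd error-log text
      # for InnoDB failing-assertion lines, e.g. to flag crashed iterations.
      ASSERT_RE = re.compile(r"InnoDB: Failing assertion: (?P<expr>.+)")

      def find_assertions(error_log_text: str) -> list[str]:
          """Return every failing-assertion expression found in the log text."""
          return [m.group("expr").strip() for m in ASSERT_RE.finditer(error_log_text)]
      ```

      Feeding it the log excerpt above returns ["tail.trx_no <= last_trx_no"].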

      Attachments

        1. mariadb_cache.cnf
          1 kB
          Jonathan Jeb Miller
        2. mariadb_tprocc.properties
          5 kB
          Jonathan Jeb Miller

            People

              Assignee: Unassigned
              Reporter: Jonathan Jeb Miller (jeb)
