Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25072

Hang + Heap Corruption: SIGABRT in __libc_message from _int_free on DBUGCloseFile

    XMLWordPrintable

Details

    Description

      Keep repeating the following testcase repeatedly (using the CLI) until the server crashes:

      DROP DATABASE test;
      CREATE DATABASE test;
      USE test;
      SET GLOBAL innodb_limit_optimistic_insert_debug=7;
      CREATE TABLE t1 (c1 INT) PARTITION BY HASH (c1) PARTITIONS 15;
      SET GLOBAL innodb_change_buffering_debug=1;
      DROP TABLE t1;
      SELECT SUBSTRING ('00', 1, 1);
      

      Note that this can easily take 20+ minutes as the server gets very slow when executing this. Using a separate connection, you can see things like:

      10.6.0 27d66d644cf2ebe9201e0362f2050036cce2908a (Debug)

      10.6.0>show processlist;
      +----+------+-----------+------+---------+------+----------------+------------------+----------+
      | Id | User | Host      | db   | Command | Time | State          | Info             | Progress |
      +----+------+-----------+------+---------+------+----------------+------------------+----------+
      |  4 | root | localhost | test | Query   |  562 | closing tables | DROP TABLE t1    |    0.000 |
      |  5 | root | localhost | test | Query   |    0 | starting       | show processlist |    0.000 |
      +----+------+-----------+------+---------+------+----------------+------------------+----------+
      2 rows in set (0.000 sec)
      

      And the eventual outcome is:

      10.6.0 27d66d644cf2ebe9201e0362f2050036cce2908a (Debug)

      Core was generated by `/test/MD260221-mariadb-10.6.0-linux-x86_64-dbg/bin/mysqld --no-defaults --core-'.
      Program terminated with signal SIGABRT, Aborted.
      #0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
          at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
      [Current thread is 1 (Thread 0x1500804a6700 (LWP 213801))]
      (gdb) bt
      #0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
      #1  0x0000560223b5eab8 in my_write_core (sig=sig@entry=6) at /test/10.6_dbg/mysys/stacktrace.c:424
      #2  0x00005602232f3317 in handle_fatal_signal (sig=6) at /test/10.6_dbg/sql/signal_handler.cc:331
      #3  <signal handler called>
      #4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      #5  0x0000150083064859 in __GI_abort () at abort.c:79
      #6  0x00001500830cf3ee in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x1500831f9285 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
      #7  0x00001500830d747c in malloc_printerr (str=str@entry=0x1500831f74ae "free(): invalid pointer") at malloc.c:5347
      #8  0x00001500830d8cac in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:4173
      #9  0x0000560223b7a814 in DBUGCloseFile (cs=cs@entry=0x150034000ce0, new_value=new_value@entry=0x0) at /test/10.6_dbg/dbug/dbug.c:2012
      #10 0x0000560223b7a90c in FreeState (cs=cs@entry=0x150034000ce0, free_state=free_state@entry=1) at /test/10.6_dbg/dbug/dbug.c:1634
      #11 0x0000560223b7b1c9 in _db_pop_ () at /test/10.6_dbg/dbug/dbug.c:932
      #12 0x0000560223b63856 in safe_mutex_lock (mp=mp@entry=0x56022457acc0 <buf_pool>, my_flags=my_flags@entry=0, file=file@entry=0x560223f819c8 "/test/10.6_dbg/storage/innobase/buf/buf0buf.cc", line=line@entry=3171) at /test/10.6_dbg/mysys/thr_mutex.c:398
      #13 0x00005602239c3ab9 in inline_mysql_mutex_lock (src_line=3171, src_file=0x560223f819c8 "/test/10.6_dbg/storage/innobase/buf/buf0buf.cc", that=0x56022457acc0 <buf_pool>) at /test/10.6_dbg/include/mysql/psi/mysql_thread.h:765
      #14 buf_page_get_low (page_id=page_id@entry={m_id = 315}, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=<optimized out>, guess@entry=0x0, mode=mode@entry=11, mtr=mtr@entry=0x1500804a1ca0, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /test/10.6_dbg/storage/innobase/buf/buf0buf.cc:3171
      #15 0x00005602239c3d21 in buf_page_get_gen (page_id={m_id = 315}, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=guess@entry=0x0, mode=mode@entry=11, mtr=mtr@entry=0x1500804a1ca0, err=0x1500804a0b8c, allow_ibuf_merge=true) at /test/10.6_dbg/storage/innobase/buf/buf0buf.cc:3341
      #16 0x0000560223984fdf in btr_cur_search_to_nth_level_func (index=index@entry=0x560226fbbbf8, level=<optimized out>, tuple=tuple@entry=0x1500340ba758, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=4098, cursor=cursor@entry=0x1500804a1a20, ahi_latch=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>) at /test/10.6_dbg/storage/innobase/include/dict0mem.h:1165
      #17 0x00005602238afb68 in btr_pcur_open_low (index=index@entry=0x560226fbbbf8, level=<optimized out>, level@entry=0, tuple=tuple@entry=0x1500340ba758, mode=mode@entry=PAGE_CUR_LE, latch_mode=latch_mode@entry=4098, cursor=cursor@entry=0x1500804a1a20, autoinc=0, mtr=0x1500804a1ca0) at /test/10.6_dbg/storage/innobase/include/btr0pcur.ic:439
      #18 0x00005602238b055e in row_search_index_entry (index=index@entry=0x560226fbbbf8, entry=entry@entry=0x1500340ba758, mode=4098, pcur=pcur@entry=0x1500804a1a20, mtr=mtr@entry=0x1500804a1ca0) at /test/10.6_dbg/storage/innobase/row/row0row.cc:1301
      #19 0x00005602238d8448 in row_upd_sec_index_entry (node=node@entry=0x1500340f9e98, thr=thr@entry=0x1500340fa6e0) at /test/10.6_dbg/storage/innobase/row/row0upd.cc:2068
      #20 0x00005602238d9638 in row_upd_sec_step (thr=0x1500340fa6e0, node=0x1500340f9e98) at /test/10.6_dbg/storage/innobase/row/row0upd.cc:2267
      #21 row_upd (thr=0x1500340fa6e0, node=0x1500340f9e98) at /test/10.6_dbg/storage/innobase/row/row0upd.cc:3050
      #22 row_upd_step (thr=thr@entry=0x1500340fa6e0) at /test/10.6_dbg/storage/innobase/row/row0upd.cc:3165
      #23 0x000056022380e994 in que_thr_step (thr=0x1500340fa6e0) at /test/10.6_dbg/storage/innobase/que/que0que.cc:635
      #24 que_run_threads_low (thr=0x1500340fa6e0) at /test/10.6_dbg/storage/innobase/que/que0que.cc:709
      #25 que_run_threads (thr=0x1500340fa6e0) at /test/10.6_dbg/storage/innobase/que/que0que.cc:729
      #26 0x000056022380f021 in que_eval_sql (info=info@entry=0x1500340413f8, sql=sql@entry=0x560223f5d100 "PROCEDURE DROP_TABLE_PROC () IS\ntid CHAR;\niid CHAR;\nDECLARE CURSOR cur_idx IS\nSELECT ID FROM SYS_INDEXES\nWHERE TABLE_ID = tid FOR UPDATE;\nBEGIN\nSELECT ID INTO tid FROM SYS_TABLES\nWHERE NAME = :name FO"..., reserve_dict_mutex=reserve_dict_mutex@entry=false, trx=trx@entry=0x1500807612a0) at /test/10.6_dbg/storage/innobase/que/que0que.cc:779
      #27 0x000056022387e3cb in row_drop_table_for_mysql (name=<optimized out>, name@entry=0x1500804a2c40 "test/t1#P#p5", trx=trx@entry=0x1500807612a0, sqlcom=sqlcom@entry=SQLCOM_DROP_TABLE, create_failed=create_failed@entry=false, nonatomic=<optimized out>, nonatomic@entry=true) at /test/10.6_dbg/storage/innobase/row/row0mysql.cc:3593
      #28 0x00005602237120bb in ha_innobase::delete_table (this=this@entry=0x150034017428, name=<optimized out>, name@entry=0x1500804a3550 "./test/t1#P#p5", sqlcom=sqlcom@entry=SQLCOM_DROP_TABLE) at /test/10.6_dbg/storage/innobase/handler/ha_innodb.cc:13032
      #29 0x00005602236fab58 in ha_innobase::delete_table (this=0x150034017428, name=0x1500804a3550 "./test/t1#P#p5") at /test/10.6_dbg/storage/innobase/handler/ha_innodb.cc:13159
      #30 0x0000560223626742 in ha_partition::del_ren_table (this=this@entry=0x150034014528, from=from@entry=0x1500804a48a0 "./test/t1", to=to@entry=0x0) at /test/10.6_dbg/sql/ha_partition.cc:2495
      #31 0x00005602236269c4 in ha_partition::delete_table (this=0x150034014528, name=0x1500804a48a0 "./test/t1") at /test/10.6_dbg/sql/ha_partition.cc:631
      #32 0x00005602232fa284 in hton_drop_table (hton=<optimized out>, path=<optimized out>) at /test/10.6_dbg/sql/handler.cc:564
      #33 0x0000560223303f01 in ha_delete_table (thd=thd@entry=0x150034000db8, hton=hton@entry=0x560226d00128, path=path@entry=0x1500804a48a0 "./test/t1", db=db@entry=0x1500804a46a0, alias=alias@entry=0x1500804a46b0, generate_warning=<optimized out>) at /test/10.6_dbg/sql/handler.cc:2771
      #34 0x00005602230f22ef in mysql_rm_table_no_locks (thd=thd@entry=0x150034000db8, tables=tables@entry=0x150034013c10, if_exists=if_exists@entry=false, drop_temporary=drop_temporary@entry=false, drop_view=drop_view@entry=false, drop_sequence=drop_sequence@entry=false, dont_log_query=false, dont_free_locks=false) at /test/10.6_dbg/sql/sql_table.cc:2508
      #35 0x00005602230f405b in mysql_rm_table (thd=thd@entry=0x150034000db8, tables=tables@entry=0x150034013c10, if_exists=<optimized out>, drop_temporary=<optimized out>, drop_sequence=<optimized out>, dont_log_query=dont_log_query@entry=false) at /test/10.6_dbg/sql/sql_table.cc:2137
      #36 0x0000560223031310 in mysql_execute_command (thd=thd@entry=0x150034000db8) at /test/10.6_dbg/sql/structs.h:559
      #37 0x000056022301acdc in mysql_parse (thd=thd@entry=0x150034000db8, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x1500804a53d0) at /test/10.6_dbg/sql/sql_parse.cc:7972
      #38 0x0000560223028b3b in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x150034000db8, packet=packet@entry=0x15003400b319 "DROP TABLE t1", packet_length=packet_length@entry=13, blocking=blocking@entry=true) at /test/10.6_dbg/sql/sql_class.h:1295
      #39 0x000056022302bfbe in do_command (thd=0x150034000db8, blocking=blocking@entry=true) at /test/10.6_dbg/sql/sql_parse.cc:1397
      #40 0x0000560223187c2e in do_handle_one_connection (connect=<optimized out>, connect@entry=0x5602271c9468, put_in_cache=put_in_cache@entry=true) at /test/10.6_dbg/sql/sql_connect.cc:1410
      #41 0x0000560223188331 in handle_one_connection (arg=arg@entry=0x5602271c9468) at /test/10.6_dbg/sql/sql_connect.cc:1312
      #42 0x000056022363c9b9 in pfs_spawn_thread (arg=0x5602270edc98) at /test/10.6_dbg/storage/perfschema/pfs.cc:2201
      #43 0x0000150083572609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #44 0x0000150083161293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      

      This bug seems to show an underlying issue, and is closely related to another sporadic and hard to define bug (logged in MDEV-25071) I have been chasing for a long time, which happens when creating many partitions and shows the same "closing tables" as follows:

      10.6.0>show processlist;
      +----+------+-----------+------+---------+------+----------------+--------------------+----------+
      | Id | User | Host      | db   | Command | Time | State          | Info               | Progress |
      +----+------+-----------+------+---------+------+----------------+--------------------+----------+
      |  5 | root | localhost | test | Query   |   23 | closing tables | DROP DATABASE test |    0.000 |
      |  7 | root | localhost | test | Query   |    0 | starting       | show processlist   |    0.000 |
      +----+------+-----------+------+---------+------+----------------+--------------------+----------+
      2 rows in set (0.000 sec)
      

      I marked this particular bug as critical as it has cost many hours in test result debugging and is only present in 10.6 from what I can tell.

      Attachments

        1. mdev-25072_asan.err
          15 kB
          Roel Van de Paar
        2. mdev-25072_tsan.err
          1.42 MB
          Roel Van de Paar

        Issue Links

          Activity

            People

              marko Marko Mäkelä
              Roel Roel Van de Paar
              Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.