Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25396

InnoDB: Failing assertion: m_prebuilt->trx == thd_to_trx(m_user_thd) in ha_innobase::change_active_index

    XMLWordPrintable

Details

    Description

      10.5 58f184a4c

      (rr) bt
      #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
      #1  0x00007f8417e27921 in __GI_abort () at abort.c:79
      #2  0x0000555b0783ae03 in ut_dbg_assertion_failed (expr=0x555b07ea7bc8 "m_prebuilt->trx == thd_to_trx(m_user_thd)", file=0x555b07ea61a8 "/home/mdbe/MDEV-25396/10.5-debug/storage/innobase/handler/ha_innodb.cc", line=8987)
          at /home/mdbe/MDEV-25396/10.5-debug/storage/innobase/ut/ut0dbg.cc:60
      #3  0x0000555b075a8d4b in ha_innobase::change_active_index (this=0x7f83c419ceb0, keynr=64) at /home/mdbe/MDEV-25396/10.5-debug/storage/innobase/handler/ha_innodb.cc:8987
      #4  0x0000555b075a98f8 in ha_innobase::rnd_init (this=0x7f83c419ceb0, scan=true) at /home/mdbe/MDEV-25396/10.5-debug/storage/innobase/handler/ha_innodb.cc:9270
      #5  0x0000555b06d826c3 in handler::ha_rnd_init (this=0x7f83c419ceb0, scan=true) at /home/mdbe/MDEV-25396/10.5-debug/sql/handler.h:3387
      #6  0x0000555b0714b313 in handler::update_first_row (this=0x7f83c419ceb0, new_data=0x7f83c451b818 "\377\001") at /home/mdbe/MDEV-25396/10.5-debug/sql/handler.cc:7234
      #7  0x0000555b07548a06 in ha_sequence::write_row (this=0x7f83c419d710, buf=0x7f83c451b818 "\377\001") at /home/mdbe/MDEV-25396/10.5-debug/sql/ha_sequence.cc:268
      #8  0x0000555b0714aad8 in handler::ha_write_row (this=0x7f83c419d710, buf=0x7f83c451b818 "\377\001") at /home/mdbe/MDEV-25396/10.5-debug/sql/handler.cc:7152
      #9  0x0000555b07051bbc in sequence_definition::write (this=0x7f83c4220560, table=0x7f83c4287df8, all_fields=false) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_sequence.cc:657
      #10 0x0000555b070522f7 in SEQUENCE::set_value (this=0x7f83c4220560, table=0x7f83c4287df8, next_val=3242028781753335809, next_round=0, is_used=true) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_sequence.cc:870
      #11 0x0000555b071d820d in Item_func_setval::val_int (this=0x7f83c4014b58) at /home/mdbe/MDEV-25396/10.5-debug/sql/item_func.cc:7167
      #12 0x0000555b07023d79 in Type_handler::Item_send_longlong (this=0x555b089c6060 <type_handler_slonglong>, item=0x7f83c4014b58, protocol=0x7f83c4001350, buf=0x7f83f00ac280) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_type.cc:7476
      #13 0x0000555b07032216 in Type_handler_longlong::Item_send (this=0x555b089c6060 <type_handler_slonglong>, item=0x7f83c4014b58, protocol=0x7f83c4001350, buf=0x7f83f00ac280) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_type.h:5739
      #14 0x0000555b06ca92e2 in Item::send (this=0x7f83c4014b58, protocol=0x7f83c4001350, buffer=0x7f83f00ac280) at /home/mdbe/MDEV-25396/10.5-debug/sql/item.h:1066
      #15 0x0000555b06ca2ab7 in Protocol::send_result_set_row (this=0x7f83c4001350, row_items=0x7f83c40140f8) at /home/mdbe/MDEV-25396/10.5-debug/sql/protocol.cc:1085
      #16 0x0000555b06d5c2f7 in select_send::send_data (this=0x7f83c4015578, items=...) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_class.cc:3018
      #17 0x0000555b06e7518d in select_result_sink::send_data_with_check (this=0x7f83c4015578, items=..., u=0x7f83c4004f28, sent=0) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_class.h:5341
      #18 0x0000555b06e2afd3 in JOIN::exec_inner (this=0x7f83c40155a0) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_select.cc:4337
      #19 0x0000555b06e2a86d in JOIN::exec (this=0x7f83c40155a0) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_select.cc:4249
      #20 0x0000555b06e2c0a9 in mysql_select (thd=0x7f83c4000d78, tables=0x0, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2150369792, result=0x7f83c4015578, unit=0x7f83c4004f28, 
          select_lex=0x7f83c4013fa8) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_select.cc:4725
      #21 0x0000555b06e1bacf in handle_select (thd=0x7f83c4000d78, lex=0x7f83c4004e60, result=0x7f83c4015578, setup_tables_done_option=0) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_select.cc:419
      #22 0x0000555b06ddef11 in execute_sqlcom_select (thd=0x7f83c4000d78, all_tables=0x7f83c4014480) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_parse.cc:6307
      #23 0x0000555b06dd615f in mysql_execute_command (thd=0x7f83c4000d78) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_parse.cc:4003
      #24 0x0000555b06de3d35 in mysql_parse (thd=0x7f83c4000d78, rawbuf=0x7f83c4013ec0 "SELECT /* QNO 57963 CON_ID 14 */ SETVAL(seq3, 3242028781753335808)", length=66, parser_state=0x7f83f00ad510, is_com_multi=false, is_next_command=false)
          at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_parse.cc:8093
      #25 0x0000555b06dcfcb9 in dispatch_command (command=COM_QUERY, thd=0x7f83c4000d78, packet=0x7f83c40bca39 "", packet_length=66, is_com_multi=false, is_next_command=false) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_parse.cc:1889
      #26 0x0000555b06dce3f6 in do_command (thd=0x7f83c4000d78) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_parse.cc:1370
      #27 0x0000555b06f7869f in do_handle_one_connection (connect=0x555b0a9c5888, put_in_cache=true) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_connect.cc:1410
      #28 0x0000555b06f78408 in handle_one_connection (arg=0x555b0a9d9f68) at /home/mdbe/MDEV-25396/10.5-debug/sql/sql_connect.cc:1312
      #29 0x0000555b074c37c5 in pfs_spawn_thread (arg=0x555b0a9c47d8) at /home/mdbe/MDEV-25396/10.5-debug/storage/perfschema/pfs.cc:2201
      #30 0x00007f8418d226db in start_thread (arg=0x7f83f00ae700) at pthread_create.c:463
      #31 0x00007f8417f0871f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      

      I have several test cases which are barely good enough to reproduce the failure occasionally, on some machines/builds better than on others; but none is reliable nearly enough to be usable for debugging, and the shorter the tests get, the less reliable they become. Because all test cases originate from the same concurrent test run, they have a lot in common, but some factors can be a pure happenstance. What seems to be always present is that

      • there is an XA transaction within which some DML on a table is performed;
      • after the transaction is ended but before it was committed, the table is dropped;
      • this happens while SESSION PSEUDO_SLAVE_MODE is set to ON;
      • it leads to the appearance of the note in the error log:

        2021-04-13 16:34:54 7 [Note] InnoDB: Deferring DROP TABLE `transforms`.`insert_select_898`; renaming to transforms/#sql-ib46
        

      • soon after, a seemingly unrelated temporary sequence is created and updated by SETVAL, at which point the crash happens – sometimes right away, and sometimes upon KILL_TIMEOUT;
      • every time when the crash happens, there is another thread performing the previously deferred drop in the background:

        Thread 11 (Thread 31893.31941):
        #0  __GI___pthread_getspecific (key=11) at pthread_getspecific.c:31
        #1  0x000055b903a24959 in _my_thread_var () at /home/mdbe/MDEV-25396/10.6-debug/mysys/my_thr_init.c:387
        #2  0x000055b903a24a89 in my_thread_var_dbug () at /home/mdbe/MDEV-25396/10.6-debug/mysys/my_thr_init.c:437
        #3  0x000055b903a511f4 in code_state () at /home/mdbe/MDEV-25396/10.6-debug/dbug/dbug.c:379
        #4  0x000055b903a53a71 in _db_pargs_ (_line_=400, keyword=0x55b903fcb8a0 "mutex") at /home/mdbe/MDEV-25396/10.6-debug/dbug/dbug.c:1264
        #5  0x000055b903a2ba2d in safe_mutex_lock (mp=0x55b904755498 <dict_sys+88>, my_flags=1, file=0x55b903f68b00 "/home/mdbe/MDEV-25396/10.6-debug/storage/innobase/dict/dict0dict.cc", line=1102)
            at /home/mdbe/MDEV-25396/10.6-debug/mysys/thr_mutex.c:400
        #6  0x000055b9038a1928 in inline_mysql_mutex_trylock (that=0x55b904755498 <dict_sys+88>, src_file=0x55b903f68b00 "/home/mdbe/MDEV-25396/10.6-debug/storage/innobase/dict/dict0dict.cc", src_line=1102)
            at /home/mdbe/MDEV-25396/10.6-debug/include/mysql/psi/mysql_thread.h:769
        #7  0x000055b9038a7dee in dict_sys_t::mutex_lock (this=0x55b904755440 <dict_sys>) at /home/mdbe/MDEV-25396/10.6-debug/storage/innobase/dict/dict0dict.cc:1102
        #8  0x000055b9038a7fc5 in dict_sys_t::lock (this=0x55b904755440 <dict_sys>, file=0x55b903ed90c8 "/home/mdbe/MDEV-25396/10.6-debug/storage/innobase/row/row0mysql.cc", line=3291)
            at /home/mdbe/MDEV-25396/10.6-debug/storage/innobase/dict/dict0dict.cc:1118
        #9  0x000055b90372fe4f in row_mysql_lock_data_dictionary_func (file=0x55b903ed90c8 "/home/mdbe/MDEV-25396/10.6-debug/storage/innobase/row/row0mysql.cc", line=3291, trx=0x7f317fa013d8)
            at /home/mdbe/MDEV-25396/10.6-debug/storage/innobase/row/row0mysql.cc:2278
        #10 0x000055b9037328e5 in row_drop_table_for_mysql (name=0x7f31540103a8 "transforms/#sql-ib46", trx=0x7f317fa013d8, sqlcom=SQLCOM_TRUNCATE, create_failed=false, nonatomic=true)
            at /home/mdbe/MDEV-25396/10.6-debug/storage/innobase/row/row0mysql.cc:3291
        #11 0x000055b903730bb6 in row_drop_table_for_mysql_in_background (name=0x7f31540103a8 "transforms/#sql-ib46") at /home/mdbe/MDEV-25396/10.6-debug/storage/innobase/row/row0mysql.cc:2553
        #12 0x000055b903730f37 in row_drop_tables_for_mysql_in_background () at /home/mdbe/MDEV-25396/10.6-debug/storage/innobase/row/row0mysql.cc:2640
        #13 0x000055b90379caf2 in srv_master_do_idle_tasks () at /home/mdbe/MDEV-25396/10.6-debug/storage/innobase/srv/srv0srv.cc:1651
        #14 0x000055b90379cfb3 in srv_master_callback () at /home/mdbe/MDEV-25396/10.6-debug/storage/innobase/srv/srv0srv.cc:1735
        #15 0x000055b903983d96 in tpool::thread_pool_generic::timer_generic::run (this=0x55b90733d070) at /home/mdbe/MDEV-25396/10.6-debug/tpool/tpool_generic.cc:313
        #16 0x000055b903983eb1 in tpool::thread_pool_generic::timer_generic::execute (arg=0x55b90733d070) at /home/mdbe/MDEV-25396/10.6-debug/tpool/tpool_generic.cc:333
        #17 0x000055b903988f6b in tpool::task::execute (this=0x55b90733d0b0) at /home/mdbe/MDEV-25396/10.6-debug/tpool/task.cc:52
        #18 0x000055b903982560 in tpool::thread_pool_generic::worker_main (this=0x55b906fb1030, thread_var=0x55b906fc0ad0) at /home/mdbe/MDEV-25396/10.6-debug/tpool/tpool_generic.cc:550
        #19 0x000055b903985c7f in std::__invoke_impl<void, void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__f=
            @0x55b907300f28: (void (tpool::thread_pool_generic::*)(tpool::thread_pool_generic * const, tpool::worker_data *)) 0x55b9039824c8 <tpool::thread_pool_generic::worker_main(tpool::worker_data*)>, __t=@0x55b907300f20: 0x55b906fb1030)
            at /usr/include/c++/7/bits/invoke.h:73
        #20 0x000055b903984e19 in std::__invoke<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__fn=
            @0x55b907300f28: (void (tpool::thread_pool_generic::*)(tpool::thread_pool_generic * const, tpool::worker_data *)) 0x55b9039824c8 <tpool::thread_pool_generic::worker_main(tpool::worker_data*)>) at /usr/include/c++/7/bits/invoke.h:95
        #21 0x000055b903988873 in std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::_M_invoke<0ul, 1ul, 2ul> (this=0x55b907300f18)
            at /usr/include/c++/7/thread:234
        #22 0x000055b90398880d in std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::operator() (this=0x55b907300f18) at /usr/include/c++/7/thread:243
        #23 0x000055b9039887ec in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::_M_run (this=0x55b907300f10)
            at /usr/include/c++/7/thread:186
        #24 0x00007f318e20a6df in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
        #25 0x00007f318e6e16db in start_thread (arg=0x7f315f7fe700) at pthread_create.c:463
        #26 0x00007f318d8c771f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
        

      I don't know if the use of SEQUENCEs is necessary, that's just what the test does. I tried to replace it with a temporary table, didn't get any luck, but due to the poor quality of the test case it is no guarantee.
      PSEUDO_SLAVE_MODE seems necessary because without it the "deferring" doesn't happen (assuming that the deferred drop is important for the outcome).

      Reproduced on 10.5 and 10.6, different builds.
      Couldn't reproduce on 10.4, but it doesn't necessarily mean that the problem isn't there.

      rr profiles for 10.5 and 10.6 are available.

      Attachments

        Activity

          People

            marko Marko Mäkelä
            elenst Elena Stepanova
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.