Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-23885

Crash in ha_delete_hash_node() in btr_search_drop_page_hash_index()

Details

    Description

      The following crash happened:

      200930 15:45:14 [ERROR] mysqld got signal 11 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.
       
      Server version: 10.4.13-MariaDB-log
      key_buffer_size=10485760
      read_buffer_size=262144
      max_used_connections=595
      max_threads=65537
      thread_count=579
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 151056887 K bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x56277b0e6008
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f053cc25be0 thread_stack 0x49000
      /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x5625762666ce]
      /usr/sbin/mysqld(handle_fatal_signal+0x30f)[0x562575cfcf7f]
      /lib64/libpthread.so.0(+0xf6d0)[0x7f055846b6d0]
      /usr/sbin/mysqld(+0xc6f0f3)[0x5625760bc0f3]
      /usr/sbin/mysqld(+0xc6f2a0)[0x5625760bc2a0]
      /usr/sbin/mysqld(+0xbd0fae)[0x56257601dfae]
      /usr/sbin/mysqld(+0xbaba2d)[0x562575ff8a2d]
      /usr/sbin/mysqld(+0xbb8e32)[0x562576005e32]
      /usr/sbin/mysqld(+0xbbf11d)[0x56257600c11d]
      /usr/sbin/mysqld(+0xb46163)[0x562575f93163]
      /usr/sbin/mysqld(+0xb4b2ae)[0x562575f982ae]
      /usr/sbin/mysqld(+0xb4c084)[0x562575f99084]
      /usr/sbin/mysqld(+0xb22c5b)[0x562575f6fc5b]
      /usr/sbin/mysqld(+0xa66bbc)[0x562575eb3bbc]
      /usr/sbin/mysqld(_ZN7handler13ha_update_rowEPKhS1_+0xc3)[0x562575d08723]
      /usr/sbin/mysqld(_Z12mysql_updateP3THDP10TABLE_LISTR4ListI4ItemES6_PS4_jP8st_orderybPySA_+0x21a2)[0x562575ba1cf2]
      /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x2ff5)[0x562575af8d15]
      /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x1fb)[0x562575afe8ab]
      /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjbb+0x1957)[0x562575b00a37]
      /usr/sbin/mysqld(_Z10do_commandP3THD+0x109)[0x562575b02049]
      /usr/sbin/mysqld(_Z11tp_callbackP13TP_connection+0x1d8)[0x562575ccf4c8]
      /usr/sbin/mysqld(+0xa3b190)[0x562575e88190]
      /usr/sbin/mysqld(+0xdcb78d)[0x56257621878d]
      /lib64/libpthread.so.0(+0x7e25)[0x7f0558463e25]
      /lib64/libc.so.6(clone+0x6d)[0x7f0556806bad]
       
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x56277b0ec820): update db.USER_LOCK set SESSION_KEY=_binary 'm ^ \
      Connection ID (thread ID): 18613
      Status: NOT_KILLED
      

      In the gdb backtrace from the core we see the following crashing thread:

      Thread 1 (Thread 0x7f053cc26700 (LWP 36992)):
      #0 0x00007f0558468a01 in pthread_sigmask () from /lib64/libpthread.so.0
      #1 0x0000562575cfcffe in handle_fatal_signal (sig=11) at /usr/src/debug/MariaDB-10.4.13/src_0/sql/signal_handler.cc:343
      #2 0x00007f055846b6d0 in __libc_sigaction () from /lib64/libpthread.so.0
      #3 0x0000000000087007 in ?? ()
      #4 0x0000000000000008 in ?? ()
      #5 0x00007f053cc212c0 in ?? ()
      #6 0x0000562575ee98db in pfs_end (this=0x2aca40d34000, ret=0, locker=0xb751230ca3b2a847) at /usr/src/debug/MariaDB-10.4.13/src_0/storage/innobase/include/ib0mutex.h:742
      #7 PolicyMutex<TTASEventMutex<GenericPolicy> >::enter (this=0x2aca40d34000, n_spins=17, n_delay=10954, line=1993551968,
      name=0x56257646c480 "/home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.4.13/storage/innobase/lock/lock0lock.cc")
      at /usr/src/debug/MariaDB-10.4.13/src_0/storage/innobase/include/ib0mutex.h:598
      #8 0x000056257601dfae in btr_search_drop_page_hash_index (block=<optimized out>) at /usr/src/debug/MariaDB-10.4.13/src_0/storage/innobase/btr/btr0sea.cc:1248
      #9 0x0000562575ff8a2d in btr_page_reorganize_low (recovery=recovery@entry=false, z_level=6, cursor=cursor@entry=0x562751d1b908, index=0x562750233668,
      mtr=mtr@entry=0x7f053cc226e0) at /usr/src/debug/MariaDB-10.4.13/src_0/storage/innobase/btr/btr0btr.cc:1472
      #10 0x0000562575ff8d7e in btr_page_reorganize (cursor=cursor@entry=0x562751d1b908, index=<optimized out>, mtr=mtr@entry=0x7f053cc226e0)
      at /usr/src/debug/MariaDB-10.4.13/src_0/storage/innobase/btr/btr0btr.cc:1715
      #11 0x0000562576005e32 in btr_cur_insert_if_possible (cursor=cursor@entry=0x562751d1b900, tuple=tuple@entry=0x56276224d4f8, offsets=offsets@entry=0x7f053cc21bb8,
      heap=heap@entry=0x7f053cc21c80, n_ext=n_ext@entry=0, mtr=mtr@entry=0x7f053cc226e0) at /usr/src/debug/MariaDB-10.4.13/src_0/storage/innobase/btr/btr0cur.cc:3246
      #12 0x000056257600c11d in btr_cur_optimistic_update (flags=flags@entry=2, cursor=cursor@entry=0x562751d1b900, offsets=offsets@entry=0x7f053cc21bb8,
      heap=heap@entry=0x7f053cc21c80, update=0x56276cb78b90, cmpl_info=0, thr=thr@entry=0x56276cb78fd8, trx_id=1257564648, mtr=mtr@entry=0x7f053cc226e0)
      at /usr/src/debug/MariaDB-10.4.13/src_0/storage/innobase/btr/btr0cur.cc:4740
      #13 0x0000562575f93163 in row_upd_clust_rec (flags=flags@entry=0, node=node@entry=0x56276cb78a70, index=index@entry=0x562750233668, offsets=offsets@entry=0x7f053cc22220,
      offsets_heap=offsets_heap@entry=0x7f053cc21c80, thr=thr@entry=0x56276cb78fd8, mtr=mtr@entry=0x7f053cc226e0)
      at /usr/src/debug/MariaDB-10.4.13/src_0/storage/innobase/row/row0upd.cc:2868
      #14 0x0000562575f982ae in row_upd_clust_step (node=node@entry=0x56276cb78a70, thr=thr@entry=0x56276cb78fd8)
      at /usr/src/debug/MariaDB-10.4.13/src_0/storage/innobase/row/row0upd.cc:3233
      #15 0x0000562575f99084 in row_upd (thr=0x56276cb78fd8, node=0x56276cb78a70) at /usr/src/debug/MariaDB-10.4.13/src_0/storage/innobase/row/row0upd.cc:3298
      #16 row_upd_step (thr=thr@entry=0x56276cb78fd8) at /usr/src/debug/MariaDB-10.4.13/src_0/storage/innobase/row/row0upd.cc:3442
      #17 0x0000562575f6fc5b in row_update_for_mysql (prebuilt=0x56276cb78068) at /usr/src/debug/MariaDB-10.4.13/src_0/storage/innobase/row/row0mysql.cc:1889
      #18 0x0000562575eb3bbc in ha_innobase::update_row (this=0x562769e6c020, old_row=0x562765916220 "\036\377n\216\317^\a", new_row=0x562765916020 "\002\377n\216\317^\a")
      at /usr/src/debug/MariaDB-10.4.13/src_0/storage/innobase/handler/ha_innodb.cc:8823
      #19 0x0000562575d08723 in handler::ha_update_row (this=0x562769e6c020, old_data=0x562765916220 "\036\377n\216\317^\a", new_data=0x562765916020 "\002\377n\216\317^\a")
      at /usr/src/debug/MariaDB-10.4.13/src_0/sql/handler.cc:6751
      #20 0x0000562575ba1cf2 in mysql_update (thd=thd@entry=0x56277b0e6008, table_list=0x56277b0ecac8, fields=..., values=..., conds=0x56277b0edbf8, order_num=<optimized out>,
      order=<optimized out>, limit=18446744073709551615, ignore=false, found_return=found_return@entry=0x7f053cc23500, updated_return=updated_return@entry=0x7f053cc235c0)
      at /usr/src/debug/MariaDB-10.4.13/src_0/sql/sql_update.cc:1057
      #21 0x0000562575af8d15 in mysql_execute_command (thd=thd@entry=0x56277b0e6008) at /usr/src/debug/MariaDB-10.4.13/src_0/sql/sql_parse.cc:4376
      #22 0x0000562575afe8ab in mysql_parse (thd=thd@entry=0x56277b0e6008, rawbuf=<optimized out>, length=213, parser_state=parser_state@entry=0x7f053cc25260,
      is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /usr/src/debug/MariaDB-10.4.13/src_0/sql/sql_parse.cc:7900
      #23 0x0000562575b00a37 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x56277b0e6008,
      packet=packet@entry=0x562779568009 "update db.USER_LOCK set SESSION_KEY=_binary 'm\216\317^\a\\", packet_length=packet_length@entry=213,
      is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /usr/src/debug/MariaDB-10.4.13/src_0/sql/sql_parse.cc:1842
      #24 0x0000562575b02049 in do_command (thd=thd@entry=0x56277b0e6008) at /usr/src/debug/MariaDB-10.4.13/src_0/sql/sql_parse.cc:1360
      #25 0x0000562575ccf4c8 in threadpool_process_request (thd=0x56277b0e6008) at /usr/src/debug/MariaDB-10.4.13/src_0/sql/threadpool_common.cc:365
      #26 tp_callback (c=0x5627606b5f80) at /usr/src/debug/MariaDB-10.4.13/src_0/sql/threadpool_common.cc:192
      #27 0x0000562575e88190 in worker_main (param=param@entry=0x56274db6e3c8) at /usr/src/debug/MariaDB-10.4.13/src_0/sql/threadpool_generic.cc:1597
      #28 0x000056257621878d in pfs_spawn_thread (arg=0x5627905c4a08) at /usr/src/debug/MariaDB-10.4.13/src_0/storage/perfschema/pfs.cc:1869
      #29 0x00007f0558463e25 in start_thread () from /lib64/libpthread.so.0
      #30 0x00007f0556806bad in __opensock () from /lib64/libc.so.6
      #31 0x0000000000000000 in ?? ()
      
      

      Attachments

        Issue Links

          Activity

            In 10.4.13, the identified source code line in btr_search_drop_page_hash_index() appears to be as follows:

            	rw_lock_x_lock(ahi_latch);
             
            	if (!btr_search_enabled) {
            

            I do not think that we should use ib_mutex_t for any rw_lock_t operations in release builds. thiru, please work with the customer and try to identify the mutex and also get a disassembly of both btr_search_drop_page_hash_index() and the code that it is calling.

            Since 10.4.13, we have made several changes to the code, including MDEV-22456. We do not yet have enough information about this crash to say whether it has been fixed.

            marko Marko Mäkelä added a comment - In 10.4.13, the identified source code line in btr_search_drop_page_hash_index() appears to be as follows: rw_lock_x_lock(ahi_latch);   if (!btr_search_enabled) { I do not think that we should use ib_mutex_t for any rw_lock_t operations in release builds. thiru , please work with the customer and try to identify the mutex and also get a disassembly of both btr_search_drop_page_hash_index() and the code that it is calling. Since 10.4.13, we have made several changes to the code, including MDEV-22456 . We do not yet have enough information about this crash to say whether it has been fixed.

            Sorry, I was mistakenly looking at btr0sea.cc:1472 (the caller was btr0btr.cc:1472). The line btr0sea.cc:1248 is the last line of the following statement:

                            ha_remove_all_nodes_to_page(
                                    btr_search_sys->hash_tables[ahi_slot],
                                    folds[i], page);
            

            It still remains to be resolved which mutex we would be acquiring here. In 10.5, this code was simplified as part of MDEV-22871.

            marko Marko Mäkelä added a comment - Sorry, I was mistakenly looking at btr0sea.cc:1472 (the caller was btr0btr.cc:1472 ). The line btr0sea.cc:1248 is the last line of the following statement: ha_remove_all_nodes_to_page( btr_search_sys->hash_tables[ahi_slot], folds[i], page); It still remains to be resolved which mutex we would be acquiring here. In 10.5, this code was simplified as part of MDEV-22871 .

            From stack trace, it looks like the first 2 frames are corrupted because line number is so big.
            I would like to verify it. Can we get the following details:

            p *(PolicyMutex<TTASEventMutex<GenericPolicy> > * const) 0x2aca40d34000
            p &lock_sys
            

            thiru Thirunarayanan Balathandayuthapani added a comment - - edited From stack trace, it looks like the first 2 frames are corrupted because line number is so big. I would like to verify it. Can we get the following details: p *(PolicyMutex<TTASEventMutex<GenericPolicy> > * const) 0x2aca40d34000 p &lock_sys

            The crash occurs in ha_delete_hash_node() when the adaptive hash index entries for a page are being freed.

            Without installing the debug symbols (I’m not using a rpm-based GNU/Linux distribution), it seems that the crash occurs in the final mem_heap_free_top() call in the macro HASH_DELETE_AND_COMPACT():

               0x0000000000c6f0df <+159>:	cmp    %r8,%rcx
               0x0000000000c6f0e2 <+162>:	jne    0xc6f0f3 <_Z19ha_delete_hash_nodeP12hash_table_tP9ha_node_t+179>
               0x0000000000c6f0e4 <+164>:	jmpq   0xc6f18b <_Z19ha_delete_hash_nodeP12hash_table_tP9ha_node_t+331>
               0x0000000000c6f0e9 <+169>:	nopl   0x0(%rax)
               0x0000000000c6f0f0 <+176>:	mov    %rdx,%rcx
               0x0000000000c6f0f3 <+179>:	mov    0x8(%rcx),%rdx 
            

            Without having the debugging symbols, I am guessing that the inlined invocation of mem_heap_free_top() is attempting to compute the following:

            	block = UT_LIST_GET_LAST(heap->base);
            

            The byte offset 8 could be heap->base.end. A few instructions later we are applying other offsets, such as 0x30 or 0x48, which probably correspond to mem_block_info_t::len and mem_block_info_t::free, which ought to be 0x18 bytes apart.

            The cause could be a hardware failure (the memory is spontaneously corrupting data) or a rogue write somewhere in the server process, possibly directly related to the adaptive hash index. We have a rather similar crash report in MDEV-18815, and we have an open bug MDEV-20203, which to my knowledge has not been repeated lately but might still be repeatable. In MariaDB Server 10.5, MDEV-20487 disabled the adaptive hash index by default due to performance fluctuations and the difficulty of reproducing this kind of failures. But, since then we did fix some race conditions between buffer pool resizing and the adaptive hash index.

            marko Marko Mäkelä added a comment - The crash occurs in ha_delete_hash_node() when the adaptive hash index entries for a page are being freed. Without installing the debug symbols (I’m not using a rpm -based GNU/Linux distribution), it seems that the crash occurs in the final mem_heap_free_top() call in the macro HASH_DELETE_AND_COMPACT() : 0x0000000000c6f0df <+159>: cmp %r8,%rcx 0x0000000000c6f0e2 <+162>: jne 0xc6f0f3 <_Z19ha_delete_hash_nodeP12hash_table_tP9ha_node_t+179> 0x0000000000c6f0e4 <+164>: jmpq 0xc6f18b <_Z19ha_delete_hash_nodeP12hash_table_tP9ha_node_t+331> 0x0000000000c6f0e9 <+169>: nopl 0x0(%rax) 0x0000000000c6f0f0 <+176>: mov %rdx,%rcx 0x0000000000c6f0f3 <+179>: mov 0x8(%rcx),%rdx Without having the debugging symbols, I am guessing that the inlined invocation of mem_heap_free_top() is attempting to compute the following: block = UT_LIST_GET_LAST(heap->base); The byte offset 8 could be heap->base.end . A few instructions later we are applying other offsets, such as 0x30 or 0x48, which probably correspond to mem_block_info_t::len and mem_block_info_t::free , which ought to be 0x18 bytes apart. The cause could be a hardware failure (the memory is spontaneously corrupting data) or a rogue write somewhere in the server process, possibly directly related to the adaptive hash index. We have a rather similar crash report in MDEV-18815 , and we have an open bug MDEV-20203 , which to my knowledge has not been repeated lately but might still be repeatable. In MariaDB Server 10.5, MDEV-20487 disabled the adaptive hash index by default due to performance fluctuations and the difficulty of reproducing this kind of failures. But, since then we did fix some race conditions between buffer pool resizing and the adaptive hash index.
            juan.vera Juan added a comment -

            Here is the correct gdb output from the customer:

            # gdb /sbin/mysqld /mariadb/data/core.5309
            ....
            (gdb) p ((mem_heap_t*)0)->free
             
            Cannot access memory at address 0x48
             
            (gdb) p ((mem_heap_t*)0)->base.end
             
            Cannot access memory at address 0x10
            

            juan.vera Juan added a comment - Here is the correct gdb output from the customer: # gdb /sbin/mysqld /mariadb/data/core.5309 .... (gdb) p ((mem_heap_t*)0)->free   Cannot access memory at address 0x48   (gdb) p ((mem_heap_t*)0)->base.end   Cannot access memory at address 0x10

            The offset 0x10 does not match my guess about the crash being located in the inlined mem_heap_free_top() invocation. My revised guess would be the following assignment in the macro HASH_DELETE_AND_COMPACT:

            do {\
            	TYPE*		node111;\
            …
            			node111->NAME = NODE;\
            

            For the only expansion of this macro in ha_delete_hash_node(), this should correspond to the following:

            do {
            	ha_node_t*		node111;
            …
            			node111->next = del_node;
            

            The offset of 8 bytes matches the type declaration, because sizeof(ulint)==8 on this instruction set architecture:

            struct ha_node_t {
            	ulint		fold;	/*!< fold value for the data */
            	ha_node_t*	next;	/*!< next chain node or NULL if none */
            

            My conclusion from my previous comment remains valid: such memory corruption could be caused by hardware failure or a hard-to-reproduce bug, possibly in the adaptive hash index. Some race conditions between buffer pool resizing and the adaptive hash index have been fixed since the 10.4.13 release.

            I would advise to test the memory subsystem (for example, sudo memtester … does not even require a reboot) or to disable the subsystem where the crash occurred:

            SET GLOBAL innodb_adaptive_hash_index=OFF;
            

            marko Marko Mäkelä added a comment - The offset 0x10 does not match my guess about the crash being located in the inlined mem_heap_free_top() invocation. My revised guess would be the following assignment in the macro HASH_DELETE_AND_COMPACT : do {\ TYPE* node111;\ … node111->NAME = NODE;\ For the only expansion of this macro in ha_delete_hash_node() , this should correspond to the following: do { ha_node_t* node111; … node111->next = del_node; The offset of 8 bytes matches the type declaration, because sizeof(ulint)==8 on this instruction set architecture: struct ha_node_t { ulint fold; /*!< fold value for the data */ ha_node_t* next; /*!< next chain node or NULL if none */ My conclusion from my previous comment remains valid: such memory corruption could be caused by hardware failure or a hard-to-reproduce bug, possibly in the adaptive hash index. Some race conditions between buffer pool resizing and the adaptive hash index have been fixed since the 10.4.13 release. I would advise to test the memory subsystem (for example, sudo memtester … does not even require a reboot) or to disable the subsystem where the crash occurred: SET GLOBAL innodb_adaptive_hash_index= OFF ;

            Because MDEV-10814 causes the buffer pool to be omitted from the core dump by default, I do not think that we can get much more information from the core dump file.

            I do not even know whether it would be possible to extract the value of the problematic pointer. It would seem to depend on whether the signal handler is saving the values of all registers, and whether the debugger would ‘restore’ the register values when traversing the stack frames. I know that GDB does that for the stack pointer and the frame pointer, so it might be possible. In any case, because the hash table for the adaptive hash index is likely allocated from the buffer pool, we would not be much wiser. We might know whether the cause was a null pointer, but not much more. We do know that the SIGSEGV was triggered at the following instruction:

            0xc6f0f3 <+179>:	mov    0x8(%rcx),%rdx 
            

            Can we get the following output from GDB?

            faas p (void*)$rcx
            

            I am asking to dump this register in several stack frames, so that we can assess whether the register value in the interesting stack frame could have been properly saved and restored by the debugger.

            marko Marko Mäkelä added a comment - Because MDEV-10814 causes the buffer pool to be omitted from the core dump by default, I do not think that we can get much more information from the core dump file. I do not even know whether it would be possible to extract the value of the problematic pointer. It would seem to depend on whether the signal handler is saving the values of all registers, and whether the debugger would ‘restore’ the register values when traversing the stack frames. I know that GDB does that for the stack pointer and the frame pointer, so it might be possible. In any case, because the hash table for the adaptive hash index is likely allocated from the buffer pool, we would not be much wiser. We might know whether the cause was a null pointer, but not much more. We do know that the SIGSEGV was triggered at the following instruction: 0xc6f0f3 <+179>: mov 0x8(%rcx),%rdx Can we get the following output from GDB? faas p (void*)$rcx I am asking to dump this register in several stack frames, so that we can assess whether the register value in the interesting stack frame could have been properly saved and restored by the debugger.

            People

              juan.vera Juan
              valerii Valerii Kravchuk
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.