MariaDB Server: MDEV-36195

MariaDB 10.6.18/11.4.4 signal 11 on include/rem0rec.h:605

Details

    Description

      We have not seen this in the wild; the backtrace is the same for 11.4.4. We suspect it is related to the addition of a POINT column and secondary index.

      After rebuilding the table with the POINT column/index, the crashes went away.

      250211 22:01:27 [ERROR] mysqld got signal 11 ;
      Sorry, we probably made a mistake, and this is a bug.

      Your assistance in bug reporting will enable us to fix this for the next release.
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs

      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.

      Server version: 10.6.18-MariaDB-log source revision:
      key_buffer_size=16777216
      read_buffer_size=262144
      max_used_connections=3
      max_threads=1313
      thread_count=3
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3074192 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.

      Thread pointer: 0x4000390f9858
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x40003de08f28 thread_stack 0x40000
      mysys/stacktrace.c:215(my_print_stacktrace)[0xaaaae118c14c]
      sql/signal_handler.cc:235(handle_fatal_signal)[0xaaaae0a1b710]
      addr2line: 'linux-vdso.so.1': No such file
      linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x40002c17d850]
      include/rem0rec.h:605(rec_offs_n_fields(unsigned short const*))[0xaaaae101e6e0]
      btr/btr0btr.cc:3758(btr_compress(btr_cur_t*, bool, mtr_t*))[0xaaaae10245a8]
      btr/btr0cur.cc:4517(btr_cur_compress_if_useful(btr_cur_t*, bool, mtr_t*))[0xaaaae1035628]
      btr/btr0cur.cc:4977(btr_cur_pessimistic_delete(dberr_t*, unsigned long, btr_cur_t*, unsigned long, bool, mtr_t*))[0xaaaae103b67c]
      row/row0purge.cc:417(row_purge_remove_sec_if_poss_tree(purge_node_t*, dict_index_t*, dtuple_t const*))[0xaaaae0fc5bac]
      row/row0purge.cc:585(row_purge_remove_sec_if_poss)[0xaaaae0fc5ee4]
      row/row0purge.cc:1188(row_purge)[0xaaaae0fc68a8]
      que/que0que.cc:588(que_thr_step)[0xaaaae0f8de88]
      psi/mysql_thread.h:745(inline_mysql_mutex_lock)[0xaaaae0fe70cc]
      tpool/task_group.cc:56(tpool::task_group::execute(tpool::task*))[0xaaaae1100b54]
      tpool/tpool_generic.cc:581(tpool::thread_pool_generic::worker_main(tpool::worker_data*))[0xaaaae10feb64]
      bits/unique_ptr.h:78(std::default_delete<std::thread::_State>::operator()(std::thread::_State*) const)[0xaaaae126ae3c]
      /lib64/libpthread.so.0(+0x7230)[0x40002c358230]
      /lib64/libc.so.6(+0xdb7dc)[0x40002c4617dc]
      

          Activity

            elenst Elena Stepanova added a comment:

            With GIS in the picture, it can be related to MDEV-27675, and there are a couple more open issues (but MDEV-27675 has a test case which still fails).
            marko, do you think it's close enough?

            marko Marko Mäkelä added a comment:

            Yes, this crash looks similar to the one reported for 10.6 in MDEV-27675: rtr_page_get_father_block() returns nullptr, which will be dereferenced, triggering SIGSEGV.

            The title of MDEV-27675 mentions both index corruption and assertion failure, while the title of this report only mentions a crash. It would be rather straightforward to fix the debug assertion failure or crash, but fixing the root cause of the corruption could require more fundamental changes, such as redesigning the locking, which we know to be broken (MDEV-15275, MDEV-15284, MDEV-26123).

            I would not say that this report is a duplicate of MDEV-27675. This one is primarily about treating the consequences of the corruption, while MDEV-27675 is also about the corruption itself.
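
The distinction drawn above, between treating the crash and fixing the underlying corruption, can be illustrated with a small C++ sketch. This is not MariaDB code: `Status`, `Block`, `ParentMap`, `find_father_block` and `merge_page` are hypothetical stand-ins, and the real signature of rtr_page_get_father_block() is not reproduced here. The sketch only shows the general pattern of checking a lookup result for nullptr and reporting corruption instead of dereferencing it; it does nothing about how the index became inconsistent in the first place.

```cpp
#include <iostream>
#include <unordered_map>

// Hypothetical stand-ins for illustration; not MariaDB's types or error codes.
enum class Status { Success, Corruption };

struct Block {
    int page_no;
};

// Maps a child page number to its parent block.
using ParentMap = std::unordered_map<int, Block>;

// Stand-in for a parent-page lookup that can fail on an inconsistent index.
// Returns nullptr when no parent is recorded for the child page.
const Block* find_father_block(const ParentMap& parents, int child_page_no) {
    auto it = parents.find(child_page_no);
    return it == parents.end() ? nullptr : &it->second;
}

// Treats the consequence only: a missing parent becomes an error return
// rather than a null pointer dereference.
Status merge_page(const ParentMap& parents, int child_page_no) {
    const Block* father = find_father_block(parents, child_page_no);
    if (father == nullptr) {
        return Status::Corruption;
    }
    std::cout << "merging page " << child_page_no
              << " under parent page " << father->page_no << '\n';
    return Status::Success;
}
```

With a map that records a parent only for page 7, `merge_page(parents, 7)` returns `Status::Success`, while `merge_page(parents, 8)` returns `Status::Corruption` without touching a null pointer.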

            People

              marko Marko Mäkelä
              dotmanila Jervin R