MariaDB Server / MDEV-24279

Segfault after 1 day and 5 minutes uptime

Details

    Description

      Operating system: CentOS 7
      Yum repository: http://yum.mariadb.org/10.3/centos7-amd64
      CPU & mem: 2x Intel Gold 6128 CPU, 256G memory

      After upgrading from 10.3.21 to 10.3.27, MariaDB started crashing after around one day of uptime. After a few crashes a pattern emerged: it crashes after exactly 1 day and 5 minutes of uptime.

      2020-11-16 11:13:53 0 [Note] /usr/sbin/mysqld: ready for connections.
      2020-11-17 11:18:54 [ERROR] mysqld got signal 11 ;
       
      2020-11-17 11:19:28 0 [Note] /usr/sbin/mysqld: ready for connections.
      2020-11-18 11:24:28 [ERROR] mysqld got signal 11 ;
       
      2020-11-18 11:25:01 0 [Note] /usr/sbin/mysqld: ready for connections.
      2020-11-19 11:30:01 [ERROR] mysqld got signal 11 ;
       
      2020-11-19 11:30:36 0 [Note] /usr/sbin/mysqld: ready for connections.
      2020-11-20 11:35:36 [ERROR] mysqld got signal 11 ;
       
      2020-11-20 11:36:22 0 [Note] /usr/sbin/mysqld: ready for connections.
      2020-11-21 11:41:23 [ERROR] mysqld got signal 11 ;
       
      2020-11-21 11:42:12 0 [Note] /usr/sbin/mysqld: ready for connections.
      2020-11-22 11:47:12 [ERROR] mysqld got signal 11 ;
       
      2020-11-22 11:47:58 0 [Note] /usr/sbin/mysqld: ready for connections.
      2020-11-23 11:52:58 [ERROR] mysqld got signal 11 ;
       
      2020-11-24  8:59:17 0 [Note] /usr/sbin/mysqld: ready for connections.
      2020-11-25  9:04:17 [ERROR] mysqld got signal 11 ;
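
The regularity of the pattern can be checked by subtracting each "ready for connections" timestamp from the following "signal 11" timestamp. The sketch below is only an illustration: the timestamps are copied from the log excerpt (with whitespace normalized), while the names `RUNS`, `uptimes` and `max_deviation` are made up for this example.

```python
from datetime import datetime, timedelta

# (ready for connections, signal 11) pairs copied from the log excerpt.
RUNS = [
    ("2020-11-16 11:13:53", "2020-11-17 11:18:54"),
    ("2020-11-17 11:19:28", "2020-11-18 11:24:28"),
    ("2020-11-18 11:25:01", "2020-11-19 11:30:01"),
    ("2020-11-19 11:30:36", "2020-11-20 11:35:36"),
    ("2020-11-20 11:36:22", "2020-11-21 11:41:23"),
    ("2020-11-21 11:42:12", "2020-11-22 11:47:12"),
    ("2020-11-22 11:47:58", "2020-11-23 11:52:58"),
    ("2020-11-24 8:59:17", "2020-11-25 9:04:17"),
]

FMT = "%Y-%m-%d %H:%M:%S"
EXPECTED = timedelta(days=1, minutes=5)


def uptimes(runs):
    """Return the uptime of each run as a timedelta."""
    return [
        datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
        for start, end in runs
    ]


def max_deviation(runs, expected=EXPECTED):
    """Return the largest absolute difference between an uptime and `expected`."""
    return max(abs(uptime - expected) for uptime in uptimes(runs))


if __name__ == "__main__":
    for uptime in uptimes(RUNS):
        print(uptime)
    print("largest deviation from 1 day 5 minutes:", max_deviation(RUNS))
```

An interval this constant points to a timer inside the server rather than to client load.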
      

      The stacktrace is the same for every segfault:

      Thread pointer: 0x7f636c0eae58
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f6391b1a080 thread_stack 0x49000
      /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55e20502266e]
      /usr/sbin/mysqld(handle_fatal_signal+0x30f)[0x55e204ab9e7f]
      /lib64/libpthread.so.0(+0xf630)[0x7f857b611630]
      /usr/sbin/mysqld(_Z25schema_table_store_recordP3THDP5TABLE+0x39)[0x55e2049421a9]
      /usr/sbin/mysqld(+0x63c2fa)[0x55e2049462fa]
      /usr/sbin/mysqld(_Z14fill_variablesP3THDP10TABLE_LISTP4Item+0x126)[0x55e204949706]
      /usr/sbin/mysqld(+0xcd6fd8)[0x55e204fe0fd8]
      /usr/sbin/mysqld(+0x4f4bc9)[0x55e2047febc9]
      /usr/sbin/mysqld(+0xcd717a)[0x55e204fe117a]
      pthread_create.c:0(start_thread)[0x7f857b609ea5]
      /lib64/libc.so.6(clone+0x6d)[0x7f85799a996d]
       
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x0): (null)
      Connection ID (thread ID): 6
      Status: NOT_KILLED
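
The two named frames in this trace are mangled C++ symbols. As a reading aid, here is a minimal sketch that decodes only the small subset of the Itanium C++ name mangling that these two symbols use (a length-prefixed function name followed by pointer-to-class parameters). The function `demangle_simple` is hypothetical and not a general demangler; a tool such as `c++filt` covers the full scheme.

```python
def demangle_simple(symbol):
    """Decode _Z<len><name> followed by parameters of the form P...<len><name>.

    Only plain class names and pointers to them are supported; anything else
    raises ValueError.
    """
    if not symbol.startswith("_Z"):
        raise ValueError("not a mangled name")

    def read_name(pos):
        # Read a decimal length, then that many characters of identifier.
        start = pos
        while pos < len(symbol) and symbol[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError(f"unsupported encoding at offset {start}")
        end = pos + int(symbol[start:pos])
        if end > len(symbol):
            raise ValueError("name length runs past end of symbol")
        return symbol[pos:end], end

    func, pos = read_name(2)
    params = []
    while pos < len(symbol):
        stars = 0
        while pos < len(symbol) and symbol[pos] == "P":  # P = pointer to
            stars += 1
            pos += 1
        name, pos = read_name(pos)
        params.append(name + "*" * stars)
    return f"{func}({', '.join(params)})"


if __name__ == "__main__":
    for sym in (
        "_Z25schema_table_store_recordP3THDP5TABLE",
        "_Z14fill_variablesP3THDP10TABLE_LISTP4Item",
    ):
        print(demangle_simple(sym))
```

The decoded names match the `schema_table_store_record` and `fill_variables` frames in the symbolized 10.5 backtrace posted later in this issue.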
      

      Given the low connection ID, it looks like a system thread.

      The full log of the latest crash:

      201125  9:04:17 [ERROR] mysqld got signal 11 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, 
      something is definitely wrong and this may fail.
       
      Server version: 10.3.27-MariaDB
      key_buffer_size=1610612736
      read_buffer_size=4194304
      max_used_connections=95
      max_threads=452
      thread_count=73
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 7137201 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x7f636c0eae58
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f6391b1a080 thread_stack 0x49000
      /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55e20502266e]
      /usr/sbin/mysqld(handle_fatal_signal+0x30f)[0x55e204ab9e7f]
      /lib64/libpthread.so.0(+0xf630)[0x7f857b611630]
      /usr/sbin/mysqld(_Z25schema_table_store_recordP3THDP5TABLE+0x39)[0x55e2049421a9]
      /usr/sbin/mysqld(+0x63c2fa)[0x55e2049462fa]
      /usr/sbin/mysqld(_Z14fill_variablesP3THDP10TABLE_LISTP4Item+0x126)[0x55e204949706]
      /usr/sbin/mysqld(+0xcd6fd8)[0x55e204fe0fd8]
      /usr/sbin/mysqld(+0x4f4bc9)[0x55e2047febc9]
      /usr/sbin/mysqld(+0xcd717a)[0x55e204fe117a]
      pthread_create.c:0(start_thread)[0x7f857b609ea5]
      /lib64/libc.so.6(clone+0x6d)[0x7f85799a996d]
       
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x0): (null)
      Connection ID (thread ID): 6
      Status: NOT_KILLED
       
      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on
       
      The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
      information that should help you find out what is causing the crash.
       
      We think the query pointer is invalid, but we will try to print it anyway. 
      Query: 
       
      Writing a core file...
      Working directory at /var/lib/mysql
      Resource Limits:
      Limit                     Soft Limit           Hard Limit           Units     
      Max cpu time              unlimited            unlimited            seconds   
      Max file size             unlimited            unlimited            bytes     
      Max data size             unlimited            unlimited            bytes     
      Max stack size            8388608              unlimited            bytes     
      Max core file size        0                    unlimited            bytes     
      Max resident set          unlimited            unlimited            bytes     
      Max processes             1028777              1028777              processes 
      Max open files            16384                16384                files     
      Max locked memory         65536                65536                bytes     
      Max address space         unlimited            unlimited            bytes     
      Max file locks            unlimited            unlimited            locks     
      Max pending signals       1028777              1028777              signals   
      Max msgqueue size         819200               819200               bytes     
      Max nice priority         0                    0                    
      Max realtime priority     0                    0                    
      Max realtime timeout      unlimited            unlimited            us        
      Core pattern: core
      
      

          Activity

            elenst Elena Stepanova added a comment

            Do you have the feedback plugin enabled?
            If so, please try disabling it and see if it helps.

            kees Kees Hoekzema added a comment (edited)

            The feedback plugin is indeed loaded. I will disable it for now and wait at least a day (well, two, as I need to wait for off-hours to restart the server; `uninstall plugin` doesn't work).

            elenst Elena Stepanova added a comment (edited)

            The failure appeared in 10.3 after this commit:

            commit e64084d5a3a72462fa6263d1d0a86e72c0ba0d47
            Author: Sergei Golubchik
            Date:   Sat Aug 1 13:12:50 2020 +0200
             
                MDEV-21201 No records produced in information_schema query, depending on projection
            

            Stack trace from 10.5 657fcdf430:

            #3  <signal handler called>
            #4  0x000055863b460f88 in heap_write (info=0x0, record=0x7f976804ce80 "\377\023") at /data/src/10.5-bug/storage/heap/hp_write.c:37
            #5  0x000055863b459b3c in ha_heap::write_row (this=0x7f9768044020, buf=0x7f976804ce80 "\377\023") at /data/src/10.5-bug/storage/heap/ha_heap.cc:239
            #6  0x000055863ad2d0ee in handler::ha_write_tmp_row (this=0x7f9768044020, buf=0x7f976804ce80 "\377\023") at /data/src/10.5-bug/sql/sql_class.h:7029
            #7  0x000055863ad409bb in schema_table_store_record (thd=0x7f9768031b48, table=0x7f9768030250) at /data/src/10.5-bug/sql/sql_show.cc:3868
            #8  0x000055863ad40734 in show_status_array (thd=0x7f9768031b48, wild=0x0, variables=0x7f976805bdd0, scope=SHOW_OPT_GLOBAL, status_var=0x0, prefix=0x55863ba734a8 "", table=0x7f9768030250, ucase_names=true, cond=0x7f9768057a08) at /data/src/10.5-bug/sql/sql_show.cc:3787
            #9  0x000055863ad5246c in fill_variables (thd=0x7f9768031b48, tables=0x7f977dbf27a0, cond=0x7f97680385f8) at /data/src/10.5-bug/sql/sql_show.cc:7826
            #10 0x000055863b9d18cb in feedback::fill_feedback (thd=0x7f9768031b48, tables=0x7f977dbf27a0, unused=0x0) at /data/src/10.5-bug/plugin/feedback/feedback.cc:215
            #11 0x000055863b9d302f in feedback::send_report (when=0x0) at /data/src/10.5-bug/plugin/feedback/sender_thread.cc:211
            #12 0x000055863b9d3431 in feedback::background_thread (arg=0x0) at /data/src/10.5-bug/plugin/feedback/sender_thread.cc:282
            #13 0x00007f97a319e609 in start_thread (arg=<optimized out>) at pthread_create.c:477
            #14 0x00007f97a2d72293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
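
When reading a backtrace like this one, a quick first step is to look for null pointer arguments in the innermost frame. The following sketch parses gdb-style frame lines and lists the arguments printed as `0x0`. It is an illustration only: the regular expressions and the helper names `parse_frames` and `innermost_null_args` are assumptions made for this example, and only frames #3 to #5 are copied from the trace.

```python
import re

# Frames #3 to #5 copied from the backtrace in this comment.
TRACE = r'''
#3  <signal handler called>
#4  0x000055863b460f88 in heap_write (info=0x0, record=0x7f976804ce80 "\377\023") at /data/src/10.5-bug/storage/heap/hp_write.c:37
#5  0x000055863b459b3c in ha_heap::write_row (this=0x7f9768044020, buf=0x7f976804ce80 "\377\023") at /data/src/10.5-bug/storage/heap/ha_heap.cc:239
'''

# "#N  [0xADDR in ]function (args) at file:line"
FRAME = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\S+) \((.*)\) at (\S+)$")
# An argument whose printed value is exactly 0x0.
NULL_ARG = re.compile(r"(\w+)=0x0\b")


def parse_frames(trace):
    """Return (frame number, function, null argument names, location) tuples."""
    frames = []
    for line in trace.splitlines():
        match = FRAME.match(line.strip())
        if match:  # lines such as "<signal handler called>" are skipped
            number, func, args, location = match.groups()
            frames.append((int(number), func, NULL_ARG.findall(args), location))
    return frames


def innermost_null_args(trace):
    """Return the function and null arguments of the lowest-numbered frame."""
    number, func, null_args, location = min(parse_frames(trace))
    return func, null_args


if __name__ == "__main__":
    print(innermost_null_args(TRACE))
```

For this trace the innermost parsed frame is `heap_write` with `info=0x0`, which is consistent with a null pointer dereference raising signal 11, although the trace alone does not prove which access faulted.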
            

            To reproduce, start the server with --feedback --feedback_debug_first_interval=5 --feedback_debug_startup_interval=5 and all other settings at their defaults.


            jgcovas Juan Gabriel Covas added a comment

            Hello, I can confirm the crash is gone after disabling the feedback plugin via config (feedback=OFF) and restarting MariaDB; I waited one day to check. My duplicate ticket is MDEV-24315.

            People

              serg Sergei Golubchik
              kees Kees Hoekzema