Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-8108

MariaDB 5.5.43 segfaults on startup

Details

    Description

      mysqld segfaults on startup.
      Stack trace below.
      It then restarts through mysqld_safe, and after that appears to be ok-ish with clients able to connect. But it's not in good shape. There may be InnoDB tables with corruption in this installation, which could be a possible trigger for the crash - however, a segfault is always a bug. An specific assert from InnoDB would be (although still annoying) more useful as then more context is shown.

      150506 13:41:19 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
      150506 13:41:19 [Note] /usr/sbin/mysqld (mysqld 5.5.43-MariaDB-1~trusty) starting as process 19351 ...
      150506 13:41:19 InnoDB: The InnoDB memory heap is disabled
      150506 13:41:19 InnoDB: Mutexes and rw_locks use GCC atomic builtins
      150506 13:41:19 InnoDB: Compressed tables use zlib 1.2.8
      150506 13:41:19 InnoDB: Using Linux native AIO
      150506 13:41:19 InnoDB: Initializing buffer pool, size = 16.0G
      150506 13:41:19 InnoDB: Completed initialization of buffer pool
      150506 13:41:19 InnoDB: highest supported file format is Barracuda.
      150506 13:41:19  InnoDB: Waiting for the background threads to start
      150506 13:41:20 Percona XtraDB (http://www.percona.com) 5.5.42-MariaDB-37.1 started; log sequence number 74624717726974
      150506 13:41:20 [Note] Plugin 'FEEDBACK' is disabled.
      150506 13:41:20 [Note] Server socket created on IP: '0.0.0.0'.
      150506 13:41:20 [Note] Event Scheduler: Loaded 0 events
      150506 13:41:20 [Note] /usr/sbin/mysqld: ready for connections.
      Version: '5.5.43-MariaDB-1~trusty'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution
      150506 13:41:20 [ERROR] mysqld got signal 11 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see http://kb.askmonty.org/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, 
      something is definitely wrong and this may fail.
       
      Server version: 5.5.43-MariaDB-1~trusty
      key_buffer_size=16777216
      read_buffer_size=131072
      max_used_connections=1
      max_threads=153
      thread_count=1
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 352025 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0x7f0661bce000
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f0ab8ec9e40 thread_stack 0x30000
      /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7f0ab981dcee]
      /usr/sbin/mysqld(handle_fatal_signal+0x457)[0x7f0ab940ed77]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7f0ab84bc340]
      /usr/sbin/mysqld(+0x66888d)[0x7f0ab95ad88d]
      /usr/sbin/mysqld(+0x675c6d)[0x7f0ab95bac6d]
      /usr/sbin/mysqld(+0x4560df)[0x7f0ab939b0df]
      /usr/sbin/mysqld(_ZN22Repair_table_statement7executeEP3THD+0xcc)[0x7f0ab939b9fc]
      /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1970)[0x7f0ab92c85e0]
      /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x11a)[0x7f0ab92cd8ba]
      /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1896)[0x7f0ab92cf9e6]
      /usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x1c3)[0x7f0ab938dd13]
      /usr/sbin/mysqld(handle_one_connection+0x4a)[0x7f0ab938de0a]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7f0ab84b4182]
      /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f0ab759e47d]
       
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x7f064841d018): is an invalid pointer
      Connection ID (thread ID): 6
      Status: NOT_KILLED
       
      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=off
       
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      150506 13:41:20 mysqld_safe Number of processes running now: 0
      150506 13:41:20 mysqld_safe mysqld restarted

      Attachments

        Issue Links

          Activity

            Hi Arjen,

            Do you have a datadir backup on which the crash can be reproduced?
            If you do, can you share it with us or, alternatively, run a debug server on it to produce a decent stack trace and all that?

            elenst Elena Stepanova added a comment - Hi Arjen, Do you have a datadir backup on which the crash can be reproduced? If you do, can you share it with us or, alternatively, run a debug server on it to produce a decent stack trace and all that?
            arjen Arjen Lentz added a comment -

            Hi Elena

            No sorry. Unfortunately, time constraints and available infra at that time didn't allow us to create extra snapshots.
            I thought the stacktrace was pretty decent - one problem being of course that InnoDB is not being informative at all, we only deduce indirectly that that's where it happened.
            InnoDB has long a very bad habit of blowing up on corruption issues, the best cases being asserts but otherwise just segfaulting. It's not shiny.

            arjen Arjen Lentz added a comment - Hi Elena No sorry. Unfortunately, time constraints and available infra at that time didn't allow us to create extra snapshots. I thought the stacktrace was pretty decent - one problem being of course that InnoDB is not being informative at all, we only deduce indirectly that that's where it happened. InnoDB has long a very bad habit of blowing up on corruption issues, the best cases being asserts but otherwise just segfaulting. It's not shiny.
            danblack Daniel Black added a comment - - edited

            If debug packages where provided it would be quite easy to reverse this back to code (MDEV-658).

            Pulling the mysqld out of the deb package, doing an objdump -d mysqld, finding +0xcc from the start of _ZN22Repair_table_statement7executeEP3THD (
            Repair_table_statement::execute(THD*) from (echo _ZN22Repair_table_statement7executeEP3THD | demangle ) ). leaves this bit of code.

            The segfault therefore happened at 0x46083C

            0000000000460770 <_ZN22Repair_table_statement7executeEP3THD>:
            ...
              460825: 4c 89 e6              mov    %r12,%rsi
              460828: 48 89 4c 24 10        mov    %rcx,0x10(%rsp)
              46082d: 48 8d 0d 95 53 59 00  lea    0x595395(%rip),%rcx        # 9f5bc9 <_ZN18Interruptible_wait20m_interrupt_intervalE+0x3f829>
              460834: 41 0f 95 85 a7 2f 00  setne  0x2fa7(%r13)
              46083b: 00
              46083c: 48 8b 43 08           mov    0x8(%rbx),%rax
              460840: 48 8d 90 a0 0e 00 00  lea    0xea0(%rax),%rdx
              460847: 8b 80 a4 0e 00 00     mov    0xea4(%rax),%eax
              46084d: 83 e0 01              and    $0x1,%eax
              460850: 89 04 24              mov    %eax,(%rsp)
              460853: e8 d8 e3 ff ff        callq  45ec30 <_ZN15MDL_object_lockD1Ev+0x970>
              460858: 84 c0                 test   %al,%al
              46085a: 75 2a                 jne    460886 <_ZN22Repair_table_statement7executeEP3THD+0x116>
              46085c: 48 8b 53 08           mov    0x8(%rbx),%rdx
              460860: 80 ba 6c 11 00 00 00  cmpb   $0x0,0x116c(%rdx)
            ...

            So it happened just before the call to MDL_object_lock::~MDL_object_lock +0x970 which looks really odd to me because the destructor of MDL_object_lock is pretty empty. I suspect since mysql_admin_table doesn't show up in the disassemble this might be the call (its a static method) and it appears quite a few other times.

            What is clear is its after the call to check_table_access and before if (!res && !m_lex->no_write_to_binlog) in Repair_table_statement::execute which would make it before the any innodb code at all.

            Looking at {{ 46083c: 48 8b 43 08 mov 0x8(%rbx),%rax}} we see earlier in the code 4607d0: 48 8b 05 79 27 b5 00 mov 0xb52779(%rip),%rax # fb2f50 <my_charset_utf8mb4_handler+0x5670>

            Looking at the fb2f50

            $ readelf -a usr/sbin/mysqld | grep fb2f50
            000000fb2f50  000000000008 R_X86_64_RELATIVE                    ff6f80
             
            $ readelf -a usr/sbin/mysqld | grep ff6f80
            000000fb2f50  000000000008 R_X86_64_RELATIVE                    ff6f80
            000000fb6198  03f100000001 R_X86_64_64       0000000000ff6f80 opt_log_slow_admin_sta + 0
            000000fb61a0  03f100000001 R_X86_64_64       0000000000ff6f80 opt_log_slow_admin_sta + 0
              1009: 0000000000ff6f80     1 OBJECT  GLOBAL DEFAULT   29 opt_log_slow_admin_state

            Which kind of puts the blame on:

            thd->enable_slow_log= opt_log_slow_admin_statements;

            We also see rbx is loaded from the stack which would point to the argument THD which kind of confirms the like of code.

            So thd not right? seem odd if its survived a check_table_access call. Anyway that's my few minutes worth of thoughts.

            danblack Daniel Black added a comment - - edited If debug packages where provided it would be quite easy to reverse this back to code ( MDEV-658 ). Pulling the mysqld out of the deb package, doing an objdump -d mysqld, finding +0xcc from the start of _ZN22Repair_table_statement7executeEP3THD ( Repair_table_statement::execute(THD*) from (echo _ZN22Repair_table_statement7executeEP3THD | demangle ) ). leaves this bit of code. The segfault therefore happened at 0x46083C 0000000000460770 <_ZN22Repair_table_statement7executeEP3THD>: ... 460825: 4c 89 e6 mov %r12,%rsi 460828: 48 89 4c 24 10 mov %rcx,0x10(%rsp) 46082d: 48 8d 0d 95 53 59 00 lea 0x595395(%rip),%rcx # 9f5bc9 <_ZN18Interruptible_wait20m_interrupt_intervalE+0x3f829> 460834: 41 0f 95 85 a7 2f 00 setne 0x2fa7(%r13) 46083b: 00 46083c: 48 8b 43 08 mov 0x8(%rbx),%rax 460840: 48 8d 90 a0 0e 00 00 lea 0xea0(%rax),%rdx 460847: 8b 80 a4 0e 00 00 mov 0xea4(%rax),%eax 46084d: 83 e0 01 and $0x1,%eax 460850: 89 04 24 mov %eax,(%rsp) 460853: e8 d8 e3 ff ff callq 45ec30 <_ZN15MDL_object_lockD1Ev+0x970> 460858: 84 c0 test %al,%al 46085a: 75 2a jne 460886 <_ZN22Repair_table_statement7executeEP3THD+0x116> 46085c: 48 8b 53 08 mov 0x8(%rbx),%rdx 460860: 80 ba 6c 11 00 00 00 cmpb $0x0,0x116c(%rdx) ... So it happened just before the call to MDL_object_lock::~MDL_object_lock +0x970 which looks really odd to me because the destructor of MDL_object_lock is pretty empty. I suspect since mysql_admin_table doesn't show up in the disassemble this might be the call (its a static method) and it appears quite a few other times. What is clear is its after the call to check_table_access and before if (!res && !m_lex->no_write_to_binlog) in Repair_table_statement::execute which would make it before the any innodb code at all. Looking at {{ 46083c: 48 8b 43 08 mov 0x8(%rbx),%rax}} we see earlier in the code 4607d0: 48 8b 05 79 27 b5 00 mov 0xb52779(%rip),%rax # fb2f50 <my_charset_utf8mb4_handler+0x5670> Looking at the fb2f50 $ readelf -a usr/sbin/mysqld | grep fb2f50 000000fb2f50 000000000008 R_X86_64_RELATIVE ff6f80   $ readelf -a usr/sbin/mysqld | grep ff6f80 000000fb2f50 000000000008 R_X86_64_RELATIVE ff6f80 000000fb6198 03f100000001 R_X86_64_64 0000000000ff6f80 opt_log_slow_admin_sta + 0 000000fb61a0 03f100000001 R_X86_64_64 0000000000ff6f80 opt_log_slow_admin_sta + 0 1009: 0000000000ff6f80 1 OBJECT GLOBAL DEFAULT 29 opt_log_slow_admin_state Which kind of puts the blame on: thd->enable_slow_log= opt_log_slow_admin_statements; We also see rbx is loaded from the stack which would point to the argument THD which kind of confirms the like of code. So thd not right? seem odd if its survived a check_table_access call. Anyway that's my few minutes worth of thoughts.

            danblack, I've just downloaded mariadb-server-core-5.5_5.5.43+maria-1~trusty_amd64.deb, but disassembly there look different. where did you get yours from?

            serg Sergei Golubchik added a comment - danblack , I've just downloaded mariadb-server-core-5.5_5.5.43+maria-1~trusty_amd64.deb, but disassembly there look different. where did you get yours from?
            danblack Daniel Black added a comment -

            lucid my bad:

            This what you have?

            0x456930 + 0xcc = 0x4569FC is mov %eax,%edx. Odd. So taking the result of the mysql_admin_table before it puts it into res.

            456930 <_ZN22Repair_table_statement7executeEP3THD>:
            ..
              4569d4: 48 8b 43 08           mov    0x8(%rbx),%rax
              4569d8: 48 89 4c 24 18        mov    %rcx,0x18(%rsp)
              4569dd: 48 8d 0d 0f 2d 59 00  lea    0x592d0f(%rip),%rcx        # 9e96f3 <_ZN18Interruptible_wait20m_interrupt_intervalE+0x3e89b>
              4569e4: 48 8d 90 a0 0e 00 00  lea    0xea0(%rax),%rdx
              4569eb: 8b 80 a4 0e 00 00     mov    0xea4(%rax),%eax
              4569f1: 83 e0 01              and    $0x1,%eax
              4569f4: 89 04 24              mov    %eax,(%rsp)
              4569f7: e8 d4 e1 ff ff        callq  454bd0 <_ZN15MDL_object_lockD0Ev+0xa50>
              4569fc: 89 c2                 mov    %eax,%edx
              4569fe: 48 8b 43 08           mov    0x8(%rbx),%rax
              456a02: 84 d2                 test   %dl,%dl
              456a04: 75 2a                 jne    456a30 <_ZN22Repair_table_statement7executeEP3THD+0x100>
              456a06: 80 b8 6c 11 00 00 00  cmpb   $0x0,0x116c(%rax)
              456a0d: 75 21                 jne    456a30 <_ZN22Repair_table_statement7executeEP3THD+0x100>
              456a0f: 41 8b 4d 68           mov    0x68(%r13),%ecx
              456a13: 49 8b 55 60           mov    0x60(%r13),%rdx
              456a17: 45 31 c0              xor    %r8d,%r8d
              456a1a: be 01 00 00 00        mov    $0x1,%esi
              456a1f: 4c 89 ef              mov    %r13,%rdi
              456a22: e8 69 31 fa ff        callq  3f9b90 <_Z13write_bin_logP3THDbPKcmb>

            danblack Daniel Black added a comment - lucid my bad: This what you have? 0x456930 + 0xcc = 0x4569FC is mov %eax,%edx . Odd. So taking the result of the mysql_admin_table before it puts it into res. 456930 <_ZN22Repair_table_statement7executeEP3THD>: .. 4569d4: 48 8b 43 08 mov 0x8(%rbx),%rax 4569d8: 48 89 4c 24 18 mov %rcx,0x18(%rsp) 4569dd: 48 8d 0d 0f 2d 59 00 lea 0x592d0f(%rip),%rcx # 9e96f3 <_ZN18Interruptible_wait20m_interrupt_intervalE+0x3e89b> 4569e4: 48 8d 90 a0 0e 00 00 lea 0xea0(%rax),%rdx 4569eb: 8b 80 a4 0e 00 00 mov 0xea4(%rax),%eax 4569f1: 83 e0 01 and $0x1,%eax 4569f4: 89 04 24 mov %eax,(%rsp) 4569f7: e8 d4 e1 ff ff callq 454bd0 <_ZN15MDL_object_lockD0Ev+0xa50> 4569fc: 89 c2 mov %eax,%edx 4569fe: 48 8b 43 08 mov 0x8(%rbx),%rax 456a02: 84 d2 test %dl,%dl 456a04: 75 2a jne 456a30 <_ZN22Repair_table_statement7executeEP3THD+0x100> 456a06: 80 b8 6c 11 00 00 00 cmpb $0x0,0x116c(%rax) 456a0d: 75 21 jne 456a30 <_ZN22Repair_table_statement7executeEP3THD+0x100> 456a0f: 41 8b 4d 68 mov 0x68(%r13),%ecx 456a13: 49 8b 55 60 mov 0x60(%r13),%rdx 456a17: 45 31 c0 xor %r8d,%r8d 456a1a: be 01 00 00 00 mov $0x1,%esi 456a1f: 4c 89 ef mov %r13,%rdi 456a22: e8 69 31 fa ff callq 3f9b90 <_Z13write_bin_logP3THDbPKcmb>

            That's probably not "segfaults on startup" after all. There's a client connection that executes REPAIR. So that's probably mysql_upgrade (run automatically by Ubuntu?). And then it's MDEV-8115

            serg Sergei Golubchik added a comment - That's probably not "segfaults on startup" after all. There's a client connection that executes REPAIR. So that's probably mysql_upgrade (run automatically by Ubuntu?). And then it's MDEV-8115

            I'll assume it's MDEV-8115

            serg Sergei Golubchik added a comment - I'll assume it's MDEV-8115

            People

              serg Sergei Golubchik
              arjen Arjen Lentz
              Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.