MariaDB Server / MDEV-28940

SEGV after upgrade to 10.8.3-MariaDB

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Incomplete
    • 10.8.3
    • N/A
    • None
    • Centos-release-7-9.2009.1.el7.centos.x86_64
      MariaDB 10.8.3

    Description

      I ran an upgrade on my VPS and got a SEGV error; MariaDB won't start again.
      Here's the information from my server:

      mariadb.err:

      2022-06-24 17:46:04 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
      2022-06-24 17:46:04 0 [Note] InnoDB: Number of transaction pools: 1
      2022-06-24 17:46:04 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
      2022-06-24 17:46:04 0 [Note] InnoDB: Using Linux native AIO
      2022-06-24 17:46:04 0 [Note] InnoDB: Initializing buffer pool, total size = 128.000MiB, chunk size = 2.000MiB
      2022-06-24 17:46:04 0 [Note] InnoDB: Completed initialization of buffer pool
      2022-06-24 17:46:04 0 [Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
      2022-06-24 17:46:04 0 [Note] InnoDB: 128 rollback segments are active.
      2022-06-24 17:46:04 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
      2022-06-24 17:46:04 0 [Note] InnoDB: Setting file './ibtmp1' size to 12.000MiB. Physically writing the file full; Please wait ...
      2022-06-24 17:46:04 0 [Note] InnoDB: File './ibtmp1' size is now 12.000MiB.
      2022-06-24 17:46:04 0 [Note] InnoDB: log sequence number 1168229248; transaction id 1287509
      2022-06-24 17:46:04 0 [Note] Plugin 'FEEDBACK' is disabled.
      2022-06-24 17:46:04 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
      2022-06-24 17:46:04 0 [Note] Server socket created on IP: '127.0.0.1'.
      2022-06-24 17:46:04 0 [ERROR] mariadbd: Can't create/write to file '/var/run/mariadb/mariadb.pid' (Errcode: 2 "No such file or directory")
      2022-06-24 17:46:04 0 [ERROR] Can't start server: can't create PID file: No such file or directory
      220624 17:46:04 [ERROR] mysqld got signal 11 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, 
      something is definitely wrong and this may fail.
       
      Server version: 10.8.3-MariaDB
      key_buffer_size=134217728
      read_buffer_size=131072
      max_used_connections=0
      max_threads=153
      thread_count=0
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467997 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x49000
      ??:0(my_print_stacktrace)[0x561b63b3822e]
      ??:0(handle_fatal_signal)[0x561b63597a77]
      sigaction.c:0(__restore_rt)[0x7f2af6165630]
      ??:0(std::__detail::_List_node_base::_M_unhook())[0x7f2af5cb913a]
      ??:0(void std::__introsort_loop<unsigned char**, long>(unsigned char**, unsigned char**, long))[0x561b6396d4b4]
      ??:0(void std::__introsort_loop<unsigned char**, long>(unsigned char**, unsigned char**, long))[0x561b6396dd64]
      ??:0(tpool::task_group::execute(tpool::task*))[0x561b63ac29d6]
      ??:0(tpool::thread_pool_generic::worker_main(tpool::worker_data*))[0x561b63ac1381]
      ??:0(std::this_thread::__sleep_for(std::chrono::duration<long, std::ratio<1l, 1l> >, std::chrono::duration<long, std::ratio<1l, 1000000000l> >))[0x7f2af5cff330]
      pthread_create.c:0(start_thread)[0x7f2af615dea5]
      ??:0(__clone)[0x7f2af5678b0d]
      The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
      information that should help you find out what is causing the crash.
      Writing a core file...
      Working directory at /var/lib/mysql
      Resource Limits:
      Limit                     Soft Limit           Hard Limit           Units     
      Max cpu time              unlimited            unlimited            seconds   
      Max file size             unlimited            unlimited            bytes     
      Max data size             unlimited            unlimited            bytes     
      Max stack size            8388608              unlimited            bytes     
      Max core file size        0                    unlimited            bytes     
      Max resident set          unlimited            unlimited            bytes     
      Max processes             31202                31202                processes 
      Max open files            32768                32768                files     
      Max locked memory         65536                65536                bytes     
      Max address space         unlimited            unlimited            bytes     
      Max file locks            unlimited            unlimited            locks     
      Max pending signals       31202                31202                signals   
      Max msgqueue size         819200               819200               bytes     
      Max nice priority         0                    0                    
      Max realtime priority     0                    0                    
      Max realtime timeout      unlimited            unlimited            us        
      Core pattern: core
      

      journalctl -xe:

      -- Unit mariadb.service has begun starting up.
      Jun 24 17:54:47 vmi813286.contaboserver.net mariadbd[14453]: 2022-06-24 17:54:47 0 [Note] /usr/sbin/mariadbd (server 10.8.3-MariaDB) starting as proc
      Jun 24 17:54:47 vmi813286.contaboserver.net kernel: mariadbd[14468]: segfault at 8 ip 00007f4a7d78913a sp 00007f4a49ffac08 error 6 in libstdc++.so.6.
      Jun 24 17:54:47 vmi813286.contaboserver.net systemd[1]: mariadb.service: main process exited, code=killed, status=11/SEGV
      Jun 24 17:54:47 vmi813286.contaboserver.net systemd[1]: Failed to start MariaDB 10.8.3 database server.
      -- Subject: Unit mariadb.service has failed
      -- Defined-By: systemd
      -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
      --
      -- Unit mariadb.service has failed.
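
The kernel line above reports "segfault at 8 ... error 6". On x86, the number after "error" is the page-fault error code, a small bit field. The sketch below decodes its three lowest bits; the function name `decode_segv_error` is made up for illustration and is not part of any MariaDB or kernel tooling.

```shell
# Decode the low bits of an x86 page-fault error code, as printed
# after "error" in a kernel segfault line.
#   bit 0: 0 = page not present, 1 = protection violation
#   bit 1: 0 = read access,      1 = write access
#   bit 2: 0 = kernel mode,      1 = user mode
# decode_segv_error is a hypothetical helper name.
decode_segv_error() {
    code=$1
    if [ $((code & 1)) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    if [ $((code & 2)) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $((code & 4)) -ne 0 ]; then mode="user mode"; else mode="kernel mode"; fi
    echo "$access, $mode, $cause"
}
```

Calling `decode_segv_error 6` prints `write, user mode, page not present`. Together with the fault address of 8, that would be consistent with a write through a null pointer plus a small offset, although this alone does not identify the cause. The instruction pointer in the kernel line also ends in the same `913a` as the `_M_unhook` frame in mariadb.err, which suggests (but does not prove) that both start attempts crashed at the same place.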

      systemctl status mariadb.service:

      mariadb.service - MariaDB 10.8.3 database server
      Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
      Drop-In: /etc/systemd/system/mariadb.service.d
      └─migrated-from-my.cnf-settings.conf
      Active: failed (Result: exit-code) since Jum 2022-06-24 17:54:52 +08; 5min ago
      Docs: man:mariadbd(8)
      https://mariadb.com/kb/en/library/systemd/
      Process: 14619 ExecStart=/usr/sbin/mariadbd $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION (code=exited, status=1/FAILURE)
      Process: 14591 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`cd /usr/bin/..; /usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
      Process: 14589 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
      Main PID: 14619 (code=exited, status=1/FAILURE)


        Activity

          semprul57 Muhammad Baqir added a comment

          Solved by downgrading the CentOS 7 kernel. This is the affected kernel information from my VPS:

          yum list kernel

          Loaded plugins: fastestmirror
          Loading mirror speeds from cached hostfile
           * base: asi-fs-s.contabo.net
           * epel: mirrors.thzhost.com
           * extras: mirror.vodien.com
           * remi-php74: mirrors.thzhost.com
           * remi-safe: mirrors.thzhost.com
           * updates: asi-fs-s.contabo.net
          Installed Packages
          kernel.x86_64 3.10.0-1160.el7 @anaconda
          kernel.x86_64 3.10.0-1160.59.1.el7 @updates
          kernel.x86_64 3.10.0-1160.62.1.el7 @updates
          kernel.x86_64 3.10.0-1160.66.1.el7 @updates

          serg Sergei Golubchik added a comment

          Your main problem is this error message:

          2022-06-24 17:46:04 0 [ERROR] mariadbd: Can't create/write to file '/var/run/mariadb/mariadb.pid' (Errcode: 2 "No such file or directory")

          Maybe you didn't have /var/run/mariadb/, or perhaps SELinux or something else didn't allow writing to it.

          The crash is clearly a bug that we need to fix, but it happened after the server failed to create a file and was aborting the startup. When we fix the crash, your server will (or would, on the new kernel) still fail to start with the same error.
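
The two possibilities named in this comment (a missing directory, or a directory the server cannot write to) can be told apart with a small check. This is a minimal sketch, not an official MariaDB tool: `check_pid_dir` is a hypothetical helper name, and the `-w` test reflects the permissions of whichever user runs it, so it is only meaningful when run as the account the server uses.

```shell
# Report whether the directory that should hold a PID file exists
# and is writable by the current user.
# check_pid_dir is a hypothetical helper name.
check_pid_dir() {
    pid_file=$1
    dir=$(dirname "$pid_file")
    if [ ! -d "$dir" ]; then
        # Matches Errcode 2, "No such file or directory"
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 2
    fi
    echo "ok: $dir"
}
```

For the path in the error message this would be `check_pid_dir /var/run/mariadb/mariadb.pid`. Errcode 2 in the log points at the "missing" case rather than a permission problem. If SELinux is suspected, `getenforce` and `ls -Zd /var/run/mariadb` show the enforcement mode and the directory's security context.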

          serg Sergei Golubchik added a comment

          wlad, please take a look; maybe you'll be able to see why such an early abort would cause the thread pool to crash.

          wlad Vladislav Vaintroub added a comment (edited)

          serg, this is not early in startup. It is very late in the startup sequence, after the server is already accessible via the network. InnoDB recovery is probably already done at that stage.

          The stack trace is partially damaged: tpool::task_group::execute would usually execute a callback function that InnoDB provides. It would definitely not execute any introsort_loop. If I try to start the server with an invalid path in --pid-file, it ends with

          2022-07-04 21:30:43 0 [ERROR] mysqld: Can't create/write to file '/mnt/c/aaa/bla' (Errcode: 2 "No such file or directory")
          2022-07-04 21:30:43 0 [ERROR] Can't start server: can't create PID file: No such file or directory
          

          and no signs of a crash.
          Thus, it would be helpful to get stack traces of all threads, as described at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/#analyzing-a-core-file-with-gdb-on-linux . If I interpret the error log snippet correctly ("Writing a core file..."), a core file was written, probably into /var/lib/mysql.
          semprul57, would it be possible to produce the "all threads" stack trace from the core file, as described at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/#analyzing-a-core-file-with-gdb-on-linux ?
          Thank you!


          semprul57 Muhammad Baqir added a comment

          Sorry for the late reply. This is a production server, and we couldn't afford the downtime to reproduce this bug again.

          Is there any chance of producing a full stack trace without downtime?

          wlad Vladislav Vaintroub added a comment (edited)

          Yes, there is a chance. There is probably a core file in /var/lib/mysql. If there is, just follow the instructions at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/#analyzing-a-core-file-with-gdb-on-linux , which describe how to debug the core file rather than the active process.

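
The core-file route discussed above can be sketched as two small shell functions. This is a hedged outline rather than the exact procedure from the linked page: `newest_core` and `all_threads_bt` are hypothetical helper names, and the sketch assumes `gdb` is installed and that core files are named `core` or `core.<pid>` (the log shows "Core pattern: core"). Note that the resource limits in the report list a soft "Max core file size" of 0, so a core file may not actually exist despite the "Writing a core file..." message; the first function checks for that.

```shell
# Print the most recently modified core file in a directory,
# or return non-zero if none is found.
# newest_core is a hypothetical helper name.
newest_core() {
    dir=$1
    core=$(ls -t "$dir"/core* 2>/dev/null | head -n 1) || true
    [ -n "$core" ] || return 1
    echo "$core"
}

# Print backtraces of all threads from a core file. This reads only
# the binary and the core file; it does not attach to a running server.
# all_threads_bt is a hypothetical helper name.
all_threads_bt() {
    binary=$1
    core=$2
    gdb --batch -ex "thread apply all bt full" "$binary" "$core"
}
```

A possible invocation, using the paths from the report, is `core=$(newest_core /var/lib/mysql) && all_threads_bt /usr/sbin/mariadbd "$core" > /tmp/mariadbd_bt_all.txt`. The `??:0` entries in the original stack trace indicate missing debug symbols, so the output is likely to be more useful if the debug symbols matching the installed server build are available first.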

          People

            Assignee: wlad Vladislav Vaintroub
            Reporter: semprul57 Muhammad Baqir