  MariaDB Server / MDEV-24423

MariaDB 10.3 crashing and restarting intermittently - segfault at 0

Details

    Description

      Hello,

      We are running a regular Debian 10 server with the latest MariaDB 10.3.27. It worked well for months, but for the past week we have been facing regular crashes after a few hours of runtime. Applications (Zabbix, etc.) then lose their DB connections and some transactions are broken.

      System specs:

      • 4 vCPU
      • 10G of RAM
      • Disks are LUNs on an EMC VNX

      Here is an example of the syslog messages:

      Dec 16 19:44:48 mysqlbddvprd1 kernel: [503847.749484] show_signal_msg: 18 callbacks suppressed
      Dec 16 19:44:48 mysqlbddvprd1 kernel: [503847.749487] mysqld[60145]: segfault at 0 ip 0000557197badfb3 sp 00007f2dbbe2d310 error 6 in mysqld[5571973f0000+80a000]
      Dec 16 19:44:48 mysqlbddvprd1 kernel: [503847.749491] Code: c7 45 00 00 00 00 00 8b 7d cc 4c 89 e2 4c 89 f6 e8 52 2f 84 ff 49 89 c7 49 39 c4 0f 84 06 01 00 00 e8 21 18 00 00 41 8b 4d 00 <89> 08 85 c9 74 37 49 83 ff ff 0f 84 ad 00 00 00 f6 c3 06 75 28 4d
      Dec 16 19:44:48 mysqlbddvprd1 systemd[1]: mariadb.service: Main process exited, code=killed, status=11/SEGV
      Dec 16 19:44:48 mysqlbddvprd1 systemd[1]: mariadb.service: Failed with result 'signal'.
      Dec 16 19:44:53 mysqlbddvprd1 systemd[1]: mariadb.service: Service RestartSec=5s expired, scheduling restart.
      Dec 16 19:44:53 mysqlbddvprd1 systemd[1]: mariadb.service: Scheduled restart job, restart counter is at 1.
      Dec 16 19:44:53 mysqlbddvprd1 systemd[1]: Stopped MariaDB 10.3.27 database server.
       
      Dec 16 19:44:53 mysqlbddvprd1 systemd[1]: Starting MariaDB 10.3.27 database server...
      Dec 16 19:44:53 mysqlbddvprd1 mysqld[43693]: 2020-12-16 19:44:53 0 [Note] /usr/sbin/mysqld (mysqld 10.3.27-MariaDB-0+deb10u1) starting as process 43693 ...
      Dec 16 19:45:00 mysqlbddvprd1 systemd[1]: Started MariaDB 10.3.27 database server.
       
      Dec 16 19:45:00 mysqlbddvprd1 /etc/mysql/debian-start[43750]: Upgrading MySQL tables if necessary.
      Dec 16 19:45:00 mysqlbddvprd1 /etc/mysql/debian-start[43753]: /usr/bin/mysql_upgrade: the '--basedir' option is always ignored
      Dec 16 19:45:00 mysqlbddvprd1 /etc/mysql/debian-start[43753]: Looking for 'mysql' as: /usr/bin/mysql
      Dec 16 19:45:00 mysqlbddvprd1 /etc/mysql/debian-start[43753]: Looking for 'mysqlcheck' as: /usr/bin/mysqlcheck
      Dec 16 19:45:00 mysqlbddvprd1 /etc/mysql/debian-start[43753]: This installation of MySQL is already upgraded to 10.3.27-MariaDB, use --force if you still need to run mysql_upgrade
      Dec 16 19:45:00 mysqlbddvprd1 /etc/mysql/debian-start[43765]: Checking for insecure root accounts.
      Dec 16 19:45:00 mysqlbddvprd1 /etc/mysql/debian-start[43769]: Triggering myisam-recover for all MyISAM tables and aria-recover for all Aria tables
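The kernel segfault line in the excerpt above can be decoded a little further. The following is a minimal sketch, assuming the usual x86 page-fault error-code bits (bit 0: page present, bit 1: write access, bit 2: user mode) and that the kernel prints the error code in hexadecimal. The helper name `decode_segfault` is hypothetical. The computed offset is relative to the start of the mapping shown in brackets; resolving it to a function name would additionally need the exact mysqld binary and its debug symbols.

```python
import re

# Kernel message copied from the syslog excerpt above.
LINE = ("mysqld[60145]: segfault at 0 ip 0000557197badfb3 sp 00007f2dbbe2d310 "
        "error 6 in mysqld[5571973f0000+80a000]")

PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: split a kernel segfault line into its fields."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    err = int(m["err"], 16)
    return {
        "fault_address": int(m["addr"], 16),
        "object": m["obj"],
        # Instruction pointer relative to the start of the mapped object.
        "offset_in_object": int(m["ip"], 16) - int(m["base"], 16),
        "page_present": bool(err & 1),   # bit 0: 0 means page not present
        "write_access": bool(err & 2),   # bit 1: 1 means write, 0 means read
        "user_mode": bool(err & 4),      # bit 2: 1 means fault in user mode
    }


info = decode_segfault(LINE)
print(hex(info["offset_in_object"]))  # 0x7bdfb3
print(info["fault_address"], info["write_access"], info["user_mode"])
```

Under these assumptions, "segfault at 0 ... error 6" reads as a user-mode write to address 0 on a page that is not mapped, which is consistent with a NULL pointer write inside mysqld.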
      

      And here is a part of the conf file we use: /etc/mysql/mariadb.conf.d/50-server.cnf

      #
      # * Fine Tuning
      #
      myisam_recover_options  = BACKUP
      max_connections         = 150
       
      #
      # * Fine Tuning for InnoDB
      #
      innodb_buffer_pool_size = 7G            # Go up to 70% to 80% of your available RAM
      innodb_buffer_pool_instances = 4        # Bigger if huge InnoDB Buffer Pool or high concurrency
       
      innodb_file_per_table   = 1             # Is the recommended way nowadays
      innodb_flush_method     = O_DIRECT
      innodb_write_io_threads = 8             # If you have a strong I/O system or SSD
      innodb_read_io_threads  = 8             # If you have a strong I/O system or SSD
      innodb_io_capacity      = 1000          # If you have a strong I/O system or SSD
       
      innodb_flush_log_at_trx_commit = 1      # 1 for durability, 0 or 2 for performance
      innodb_log_buffer_size  = 8M            # Bigger if innodb_flush_log_at_trx_commit = 0
      innodb_log_file_size    = 128M          # Bigger means more write throughput but longer recovery time
       
      #
      # * Query Cache Configuration
      #
      query_cache_type        = 0
      query_cache_size        = 0
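As a quick sanity check of the sizing comment in the fragment above, the sketch below compares `innodb_buffer_pool_size` with the 10G of RAM given in the system specs. The helper names `parse_size` and `parse_fragment` are hypothetical, and the K/M/G suffixes are assumed to be binary multiples. Nothing in this report ties the crash to these settings; this only illustrates the arithmetic behind the "70% to 80%" comment.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Hypothetical helper: turn '7G' or '128M' into a byte count."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def parse_fragment(fragment):
    """Hypothetical helper: read 'key = value  # comment' lines into a dict."""
    settings = {}
    for raw in fragment.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop inline comments
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


# Lines copied from the configuration fragment above.
FRAGMENT = """
innodb_buffer_pool_size = 7G            # Go up to 70% to 80% of your available RAM
innodb_buffer_pool_instances = 4        # Bigger if huge InnoDB Buffer Pool or high concurrency
innodb_log_file_size    = 128M          # Bigger means more write throughput but longer recovery time
"""

TOTAL_RAM = parse_size("10G")  # from the "10G of RAM" system spec
settings = parse_fragment(FRAGMENT)
pool = parse_size(settings["innodb_buffer_pool_size"])
ratio = pool / TOTAL_RAM
per_instance = pool / int(settings["innodb_buffer_pool_instances"])

print(f"buffer pool is {ratio:.0%} of RAM")
print(f"{per_instance / UNITS['G']:.2f} GiB per buffer pool instance")
```

With the values in this report, the buffer pool sits at the lower end of the range named in the comment.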
      

      The error.log files are attached.
      Any comments are welcome.

      Best regards,

      Attachments

        1. error.log
          3.63 MB
        2. error.log.1
          3.77 MB

        Activity

          elenst Elena Stepanova added a comment -

          Long semaphore wait crash
          Greg1258 Greg1258 added a comment - - edited

          Hello,

          We have the same error on MariaDB 10.2.36 (crash in a long transaction).
          To be more precise, the transaction that causes the server to crash contains a single query, a multi-row INSERT (hundreds of rows).

          This seems linked to: https://jira.mariadb.org/browse/MDEV-24375

          Could it also be linked to this performance regression: https://jira.mariadb.org/browse/MDEV-24272 ?

          Nevermind D (Inactive) added a comment -

          Hello,

          We have the same problems on multiple systems: long semaphore waits and crashes at various repeating intervals.
          The "error 6" mysqld segfault also shows up in dmesg sometimes, but not on every system.

          It seems the last two minor versions (both November 2020 releases) are affected:

          • 10.5.7, 10.5.8 = affected
          • 10.3.26, 10.3.27 = affected

          Workaround: downgrade by two versions, to a pre-November 2020 release.
          The 10.5.6 and 10.3.25 releases seem to have none of these problems.
          The corresponding releases of 10.1, 10.2, and 10.4 could be affected too, but I don't run any of those.
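The version observations in this comment can be turned into a small lookup. This is only a sketch of what one commenter reported in this thread, not an official list of affected releases; the names `REPORTED_AFFECTED`, `REPORTED_OK`, and `classify` are hypothetical.

```python
import re

# Versions reported as affected in the comment above (not an official list).
REPORTED_AFFECTED = {(10, 5, 7), (10, 5, 8), (10, 3, 26), (10, 3, 27)}
# Versions the same comment reports as free of the problem.
REPORTED_OK = {(10, 5, 6), (10, 3, 25)}


def parse_version(text):
    """Extract (major, minor, patch) from a string such as '10.3.27-MariaDB'."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if m is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in m.groups())


def classify(text):
    """Hypothetical helper: map a version string to what this thread reports."""
    version = parse_version(text)
    if version in REPORTED_AFFECTED:
        return "reported affected"
    if version in REPORTED_OK:
        return "reported ok"
    return "unknown"


# Version string taken from the startup line in the syslog excerpt.
print(classify("10.3.27-MariaDB-0+deb10u1"))
print(classify("10.5.6"))
print(classify("10.4.17"))
```

Anything outside the two sets is deliberately reported as "unknown", since the comment itself says the 10.1, 10.2, and 10.4 series were not tested.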

          sara.artiglieri@tech2.it Sara Artiglieri added a comment -

          Hello, I have the same problem on two machines. Both run CentOS 8 (8.3.1-5) and MariaDB 10.3.27.

          MariaDB crashes every day at the same hour.

          [Warning] InnoDB: A long semaphore wait:

          In fact, the problems started when MariaDB was upgraded to version 10.3.27.

          Sara
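To check whether the warnings really cluster at one hour of the day, the error log can be bucketed by hour. The sketch below assumes error log lines start with a `YYYY-MM-DD HH:MM:SS` timestamp, as in the startup line of the original report, and matches the warning text quoted in this comment. The sample lines are invented purely to exercise the function and are not taken from any attached log; `waits_per_hour` is a hypothetical name.

```python
import re
from collections import Counter

MARKER = "[Warning] InnoDB: A long semaphore wait:"
# The hour may be space-padded, so allow one or more spaces and 1-2 digits.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}\s+(\d{1,2}):\d{2}:\d{2}\s")


def waits_per_hour(lines):
    """Count long-semaphore-wait warnings per hour of day (0-23)."""
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        m = TIMESTAMP.match(line)
        if m:
            counts[int(m.group(1))] += 1
    return counts


# Invented sample lines, for illustration only.
SAMPLE = [
    "2021-01-10  3:00:12 0 [Warning] InnoDB: A long semaphore wait:",
    "2021-01-11  3:00:47 0 [Warning] InnoDB: A long semaphore wait:",
    "2021-01-11 14:20:05 0 [Warning] InnoDB: A long semaphore wait:",
    "2021-01-11 14:20:06 0 [Note] some unrelated message",
]

counts = waits_per_hour(SAMPLE)
print(counts.most_common())

# Against a real log (the path is an assumption):
# with open("/var/log/mysql/error.log", errors="replace") as f:
#     print(waits_per_hour(f).most_common(3))
```

A strong peak at one hour would point toward a scheduled job (backup, cron task, batch import) as the trigger, which is worth mentioning when reporting.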
          Nevermind D (Inactive) added a comment -

          The most recent update (release date: 22 Feb 2021), 10.5.9 in my case, seems to have fixed the aforementioned problems.
          So far there have been no repeated crashes or similar problems.


          marko Marko Mäkelä added a comment -

          Can anyone enable core dumps or attach a debugger to a hung server, to produce fully resolved stack traces of all threads during the hang? Without such output, it is impossible to diagnose hangs.

          In MariaDB Server 10.6, the "long semaphore wait" diagnostics were replaced with a simple watchdog on dict_sys.latch.
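One way to collect the requested stack traces is to attach gdb in batch mode to the hung mysqld process. The sketch below only builds and optionally runs such a command; it assumes gdb is installed, that the caller has permission to attach to the process, and that debug symbols matching the running binary are available (without them the traces will not be fully resolved). The function names and the output path are hypothetical, and the PID shown is the one from the syslog excerpt, used only as a placeholder. For core dumps, the commonly documented route is the `core_file` option in the `[mysqld]` section together with a raised core size limit (for example `LimitCORE=infinity` in the systemd unit); treat that as a pointer to check against the documentation for your version rather than a verified recipe.

```python
import shlex
import subprocess


def gdb_all_threads_command(pid):
    """Build a gdb invocation that prints a backtrace of every thread."""
    return [
        "gdb", "--batch",
        "-ex", "set pagination off",
        "-ex", "thread apply all backtrace",
        "-p", str(pid),
    ]


def dump_stacks(pid, out_path):
    """Run gdb against a live process and save its output to out_path."""
    with open(out_path, "w") as out:
        subprocess.run(
            gdb_all_threads_command(pid),
            stdout=out,
            stderr=subprocess.STDOUT,
            check=True,
        )


# Placeholder PID taken from the syslog excerpt; use the PID of the hung server.
cmd = gdb_all_threads_command(60145)
print(shlex.join(cmd))

# To actually capture traces during a hang (hypothetical output path):
# dump_stacks(60145, "/tmp/mysqld-stacks.txt")
```

Attaching a debugger pauses the server while the traces are written, so this is best done while the hang is already in progress.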

          People

            marko Marko Mäkelä
            sbocquet Stéphane BOCQUET
            Votes: 3
            Watchers: 8
