  1. MariaDB Server
  2. MDEV-22404

Server Crash after Galera WSREP event received

Details

    Description

      We have a three-node Galera cluster. One morning, two of the nodes crashed with signal 6 after logging "InnoDB: Rec offset 99, cur1 offset 10327, cur2 offset 15073"; the backtrace includes "/usr/sbin/mysqld(start_wsrep_THD+0x29e)[0x7f41deb3858e]".

      We investigated the related logs and gathered the following information.
      1. Two DB nodes crashed on 4/23 around 8:30 am.
      2. The index of "table A" is corrupted.
      3. Many UPDATE and SELECT statements were still being executed against "table A" before the DB crashed.
      4. "table A" was altered to add "column C" VARCHAR(10) NOT NULL DEFAULT 'Y', without specifying an algorithm, on 4/22 around 2:00 am.
      (Our alter_algorithm value is DEFAULT. According to the MariaDB online documentation,
      "the most efficient available algorithm will usually be used", so it may have been INSTANT.)
      5. We do not have the corrupted .ibd file, but we do have the SQL dump file.
      6. The "column C" data in "table A" should be empty,
      but in the dump file the INSERT statements contain weird values for "column C", such as "010045]9`?", "10044]ˏs10", 010049\"?, etc.
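The check described in item 6 can be sketched in Python. This is only an illustration, not part of the original report: it assumes the "column C" values have already been extracted from the dump file's INSERT statements, and that the only legitimate values are the column default 'Y' and the empty string. The names `EXPECTED_VALUES`, `classify_value`, and `summarize` are hypothetical.

```python
import string

# Assumption: the only values legitimately expected in "column C" are the
# column default 'Y' and the empty string mentioned in the report.
EXPECTED_VALUES = {"", "Y"}

# Printable ASCII without control whitespace (tab, newline, and so on).
PRINTABLE_ASCII = set(string.printable) - set("\t\n\r\x0b\x0c")


def classify_value(value):
    """Return 'ok', 'unexpected', or 'garbage' for one dumped column value."""
    if value in EXPECTED_VALUES:
        return "ok"
    if all(ch in PRINTABLE_ASCII for ch in value):
        # Readable text, but not a value the application should have written.
        return "unexpected"
    # Contains control or non-ASCII characters.
    return "garbage"


def summarize(values):
    """Count how many values fall into each class."""
    counts = {"ok": 0, "unexpected": 0, "garbage": 0}
    for value in values:
        counts[classify_value(value)] += 1
    return counts


if __name__ == "__main__":
    # The last three samples are the values quoted in the report.
    samples = ["Y", "", "010045]9`?", "10044]ˏs10", '010049\\"?']
    print(summarize(samples))
```

A large share of "unexpected" or "garbage" values would suggest that the column bytes were read from the wrong place in the record, which fits the index corruption described above, but it does not by itself identify the cause.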

      We would like to know what caused our DB to crash.
      Is it a bug in the InnoDB engine or in Galera Cluster?

      Thanks.

      Attachments

        1. createTable.sql
          7 kB
        2. logWithMemDump.txt
          104 kB
        3. oneNodeCrashLog.txt
          2 kB
        4. Record overlaps another_Log.txt
          3 kB
        5. sqlLogDumpPage.txt
          51 kB

          Activity

            jplindst Jan Lindström (Inactive) added a comment - The current information is not enough to say whether this is an InnoDB or a Galera bug. First, is the problem repeatable? If so, you could resolve the stack dump addresses using debug symbols; see https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ . Second, you could try a more recent version of MariaDB Server and the Galera library. If the issue is still repeatable, we could use a resolved stack dump, full error logs from both nodes, and more detailed steps to reproduce.
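As a first step toward resolving stack dump addresses, the frames in the error log can be split into their parts. The sketch below is a rough illustration under assumptions: it assumes each frame has the form `binary(symbol+offset)[address]`, as in the frame quoted in the description, and the names `FRAME` and `parse_frame` are hypothetical. It only parses the text; mapping an address to a source line still requires debug symbols and a tool such as gdb or addr2line.

```python
import re

# Assumed frame layout: binary(symbol+0xOFFSET)[0xADDRESS]
# The "(symbol+offset)" part is treated as optional.
FRAME = re.compile(
    r"^(?P<binary>[^(\[]+)"
    r"(?:\((?P<symbol>[^+)]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line):
    """Split one backtrace frame into its parts; return None if it does not match."""
    match = FRAME.match(line.strip())
    if match is None:
        return None
    offset = match.group("offset")
    return {
        "binary": match.group("binary"),
        "symbol": match.group("symbol") or None,
        "offset": int(offset, 16) if offset else None,
        "address": int(match.group("address"), 16),
    }


if __name__ == "__main__":
    frame = "/usr/sbin/mysqld(start_wsrep_THD+0x29e)[0x7f41deb3858e]"
    print(parse_frame(frame))
```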

            falcon1631 Chun-Liang Chen added a comment - The problem did not recur after that day, and we upgraded our MariaDB Server to 10.3.24 last month. Thanks for your advice; we will keep an eye on it.

            People

              jplindst Jan Lindström (Inactive)
              falcon1631 Chun-Liang Chen
