[MDEV-22404] Server Crash after Galera WSREP event received Created: 2020-04-29  Updated: 2020-10-08  Resolved: 2020-10-08

Status: Closed
Project: MariaDB Server
Component/s: Galera, Storage Engine - InnoDB
Affects Version/s: 10.3.15
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Chun-Liang Chen Assignee: Jan Lindström (Inactive)
Resolution: Cannot Reproduce Votes: 0
Labels: crash, need_feedback
Environment:

MariaDB: 10.3.15
Galera: 25.3.26 (r3857)
Red Hat Enterprise Linux Server 7.2 (Maipo)


Attachments: Text File Record overlaps another_Log.txt     File createTable.sql     Text File logWithMemDump.txt     Text File oneNodeCrashLog.txt     Text File sqlLogDumpPage.txt    
Issue Links:
Relates
relates to MDEV-19783 Random crashes and corrupt data in IN... Closed

 Description   

We have a three-node Galera cluster. One morning, two of the nodes crashed with signal 6 after logging "InnoDB: Rec offset 99, cur1 offset 10327, cur2 offset 15073", and the backtrace includes "/usr/sbin/mysqld(start_wsrep_THD+0x29e)[0x7f41deb3858e]".

We investigated the related logs and gathered the following information.
1. Two DB nodes crashed on 4/23 around 8:30 am.
2. The index of "table A" is corrupted.
3. Many UPDATE and SELECT statements against "table A" were still being executed before the DB crashed.
4. An ALTER TABLE on "table A" added "column C" VARCHAR(10) NOT NULL DEFAULT 'Y', without specifying an algorithm, on 4/22 around 2:00 am.
(Our alter_algorithm value is DEFAULT. According to the MariaDB online documentation,
"the most efficient available algorithm will usually be used", so it might have been "instant".)
5. We don't have the corrupted .ibd file, but we do have the SQL dump file.
6. The data in "column C" of "table A" should be empty,
but in the dump file the INSERT statements contain strange values for "column C", such as "010045]9`?", "10044]ˏs10", 010049\"? etc.
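
For illustration, the ALTER described in item 4 could be written with an explicit algorithm, so that the choice is no longer left to the alter_algorithm default. This is only a minimal sketch, assuming MariaDB 10.3.7 or later; `table_a` and `column_c` are placeholder names standing in for "table A" and "column C", and the ALGORITHM clause was not part of the statement that was actually run.

```sql
-- Placeholder names: table_a / column_c stand in for "table A" / "column C".

-- Check which default applies when no ALGORITHM clause is given.
SHOW VARIABLES LIKE 'alter_algorithm';

-- Same column definition as in item 4, but with the algorithm stated
-- explicitly. COPY forces a full table rebuild instead of an instant change.
ALTER TABLE table_a
  ADD COLUMN column_c VARCHAR(10) NOT NULL DEFAULT 'Y',
  ALGORITHM=COPY;
```

Writing ALGORITHM=INSTANT instead would request the instant path, and the statement would fail with an error if that algorithm is not available for the operation.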

We would like to know what caused our DB to crash.
Is it a bug in the InnoDB engine or in Galera Cluster?

Thanks.



 Comments   
Comment by Jan Lindström (Inactive) [ 2020-10-07 ]

The current information is not enough to say whether this is an InnoDB or a Galera bug. First, is the problem repeatable? If yes, you could resolve the stack dump addresses using debug symbols; see https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ . Second, you could try a more recent version of MariaDB Server and the Galera library. If the issue is still repeatable, we could use the resolved stack dump, full error logs from both nodes, and more detailed steps to reproduce.

Comment by Chun-Liang Chen [ 2020-10-08 ]

The problem has not recurred since that day, and we upgraded our MariaDB Server to 10.3.24 last month.
Thanks for your advice; we will keep an eye on it.
