[MDEV-11012] Galera Cluster died after issuing Alter on a table - Jira

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Won't Fix
Affects Version/s: 10.0.24-galera
Fix Version/s: N/A
Component/s: Galera, Replication, wsrep
Labels:
- galera
- replication
Environment:
CentOS Linux release 7.2.1511 (Core)
Linux 100-103-10-310-db 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Description

We have a three-node Galera cluster running 10.0.24. The application team asked us to add a new column to one of the tables that is actively in use. The table had 100k rows, and we issued the ALTER statement directly on the primary node. The ALTER ran for 2-3 minutes and failed with a duplicate key error, and the other two nodes died after writing the following to their error logs:
161005 19:06:57 [ERROR] Slave SQL: Column 34 of table 'mps.pages' cannot be converted from type 'tinyblob' to type 'varchar(255)', Internal MariaDB error code: 1677
161005 19:06:57 [Warning] WSREP: RBR event 2 Update_rows_v1 apply warning: 3, 32400488
161005 19:06:58 [ERROR] Slave SQL: Column 34 of table 'mps.pages' cannot be converted from type 'tinyblob' to type 'varchar(255)', Internal MariaDB error code: 1677
161005 19:06:58 [Warning] WSREP: RBR event 2 Write_rows_v1 apply warning: 3, 32400489
161005 19:06:58 [Warning] WSREP: Failed to apply app buffer: seqno: 32400488, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 2th time

After 4 retries:

161005 19:06:58 [ERROR] WSREP: Failed to apply trx 32400492 4 times
161005 19:06:58 [ERROR] WSREP: Node consistency compromized, aborting...
161005 19:06:58 [Note] WSREP: Closing send monitor...
161005 19:06:58 [Note] WSREP: Closed send monitor.
161005 19:06:58 [Note] WSREP: /usr/sbin/mysqld: Terminated.
161005 19:06:58 [Note] WSREP: /usr/sbin/mysqld: Terminated.
161005 19:06:58 [Note] WSREP: /usr/sbin/mysqld: Terminated.
161005 19:07:02 mysqld_safe Number of processes running now: 0
161005 19:07:02 mysqld_safe WSREP: not restarting wsrep node automatically
161005 19:07:02 mysqld_safe mysqld from pid file /mysql/prod_ng_misc3/mysqld.pid ended

Attached are the my.cnf and the error log from the third node, which died first. The same error message was recorded on the second node, which also died. As a result, the first node, which the application uses, became non-primary and stopped accepting writes.

Attachments

- error-log.err.bkp (11 kB, 2016-10-10 13:44)
- my.cnf (8 kB, 2016-10-10 13:44)

People

Assignee: Jan Lindström (Inactive)

Reporter: Harikrishna Yadlapalli

Votes: 1

Watchers: 3

Dates

Created: 2016-10-10 13:45

Updated: 2019-12-12 12:08

Resolved: 2019-12-12 12:08
