[MDEV-8223] MariaDB Galera cluster crashed Created: 2015-05-25  Updated: 2018-07-16  Resolved: 2018-07-16

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.0.14-galera
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Elias Abacioglu Assignee: Jan Lindström (Inactive)
Resolution: Cannot Reproduce Votes: 0
Labels: galera, replication
Environment:

Ubuntu 12.04, 3 node cluster.
MariaDB Galera Cluster 10.0.14


Attachments: wssql-ix01.log, wssql-ix02.log, wssql-ix03.log
Issue Links:
Relates
relates to MDEV-8744 galera nodes terminate on duplicate k... Closed

 Description   

Hi,

My Galera cluster crashed. It's a 3-node cluster.
The sequence of events was:
First node01 died and tried to restart a couple of times; after that, node03 died.
node02 was then alone in the cluster and didn't really work, I guess because it didn't have quorum.
node01 and node03 couldn't start and rejoin the cluster.
So I killed node02 so that I could start the cluster fresh.
To complicate things further, I have a service supervisor that tries to start the service whenever it's dead (until I stopped the supervisor), so the logs got a bit cluttered from all the start attempts.
Before I killed node02, I made sure the service supervisor was no longer trying to start the other nodes.
After I killed node02, I tried a normal start, which failed.
Finally I started node02 with --wsrep-new-cluster and it came up; after that I started node03, and then node01.
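
For reference, a rough sketch of why node02 on its own could not serve the cluster. It assumes Galera's default of equal node weights and that node01 and node03 dropped out without a clean shutdown, so they still count toward the previous cluster size. The name `has_quorum` is illustrative only, not a Galera API.

```python
def has_quorum(reachable_nodes: int, previous_cluster_size: int) -> bool:
    """Simplified majority rule: strictly more than half of the previous
    membership must still be reachable (equal node weights assumed)."""
    return 2 * reachable_nodes > previous_cluster_size


# Three-node cluster from this report, with node01 and node03 down.
print(has_quorum(1, 3))  # node02 alone -> False
print(has_quorum(2, 3))  # any two nodes together -> True
```

With only one of three nodes left there is no majority, which is why starting node02 with --wsrep-new-cluster was needed to form a new primary component.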

I have attached the logs. They contain some errors related to the table somedb_staging_api.bookings. Please note that this table was a MyISAM table at the time,
and we have MyISAM replication turned off.

Is the MyISAM table the reason for the crash, or is it something else?



 Comments   
Comment by Jan Lindström (Inactive) [ 2017-04-10 ]

At least on your node one (wssql-ix01) there is severe database corruption:

May 25 11:40:10 wssql-ix01 mysqld: 2015-05-25 11:40:10 7f6e2182c780 InnoDB: Error: page 6 log sequence number 35648491110
May 25 11:40:10 wssql-ix01 mysqld: InnoDB: is in the future! Current system log sequence number 35283240302.
May 25 11:40:10 wssql-ix01 mysqld: InnoDB: Your database may be corrupt or you may have copied the InnoDB
May 25 11:40:10 wssql-ix01 mysqld: InnoDB: tablespace but not the InnoDB log files. See
May 25 11:40:10 wssql-ix01 mysqld: InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
May 25 11:40:10 wssql-ix01 mysqld: InnoDB: for more information.

Do you have a disk failure on that node? You could try deleting everything on that node and bootstrapping it again.
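
To put a number on the mismatch in the log excerpt above, here is a small sketch that pulls both log sequence numbers out of the two message lines and subtracts them. The regular expressions are written only against the wording quoted in this comment, and `lsn_gap` is a made-up helper name.

```python
import re

# The two relevant message lines from wssql-ix01, syslog prefix removed.
LOG = """\
InnoDB: Error: page 6 log sequence number 35648491110
InnoDB: is in the future! Current system log sequence number 35283240302.
"""


def lsn_gap(log_text: str) -> int:
    """Return the page LSN minus the current system LSN."""
    page = re.search(r"page \d+ log sequence number (\d+)", log_text)
    system = re.search(r"Current system log sequence number (\d+)", log_text)
    if page is None or system is None:
        raise ValueError("expected both LSN messages in the log text")
    return int(page.group(1)) - int(system.group(1))


gap = lsn_gap(LOG)
print(gap)  # 365250808
```

Since InnoDB LSNs count bytes of redo log, the page is stamped roughly 348 MiB ahead of what the log files on that node know about, which fits the message's own suggestion that the data files and log files are out of step.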

Comment by Elias Abacioglu [ 2017-05-16 ]

Well, I did bootstrap the broken nodes again two years ago.
The reason I reported the issue was to find out why this might have happened and how to avoid it.
There was no disk failure at that time. But we haven't had the same problem in the two years since, so I'm resolving the issue.

Comment by Elias Abacioglu [ 2017-05-16 ]

I don't know how to resolve it; can someone do it for me?

Comment by Jan Lindström (Inactive) [ 2018-07-16 ]

I would recommend upgrading the server to a more recent version and bootstrapping the affected node again from scratch.
