Our cluster keeps crashing at random intervals. The problem has existed for a while, but it has become more frequent since we added a third node. I have now come to the conclusion that something is wrong with the MariaDB service, as the crash report seems to confirm this.
The table that is most likely involved in the crash uses InnoDB. Structure:
id int(10) UN AI PK
parent_id int(10) UN
post_user_id int(10) UN
source_user_id int(10) UN
image_id int(10) UN
type_id int(10) UN
alert_id int(10) UN
can_comment tinyint(3) UN
views int(10) UN
movie_id int(10) UN
imdb_id int(10) UN
event_id int(10) UN
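Written out as DDL, the structure above would look roughly like the sketch below. The table name `news` is taken from the UPDATE statement in the crash log; NULLability, defaults and secondary indexes are not shown in the listing, so they are omitted here and should be treated as unknown rather than absent.

```sql
-- Sketch only: reconstructed from the column listing (UN = UNSIGNED,
-- AI = AUTO_INCREMENT, PK = PRIMARY KEY). The table name `news` is assumed
-- from the UPDATE statement in the log. Defaults, NULL/NOT NULL on the
-- non-key columns, and any secondary indexes are not known.
CREATE TABLE `news` (
  `id`             INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
  `parent_id`      INT(10) UNSIGNED,
  `post_user_id`   INT(10) UNSIGNED,
  `source_user_id` INT(10) UNSIGNED,
  `image_id`       INT(10) UNSIGNED,
  `type_id`        INT(10) UNSIGNED,
  `alert_id`       INT(10) UNSIGNED,
  `can_comment`    TINYINT(3) UNSIGNED,
  `views`          INT(10) UNSIGNED,
  `movie_id`       INT(10) UNSIGNED,
  `imdb_id`        INT(10) UNSIGNED,
  `event_id`       INT(10) UNSIGNED,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;
```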
According to the error message:
2021-08-21 14:27:19 1 [Note] WSREP: Victim thread:
THD: 132951, mode: local, state: committing, conflict: no conflict, seqno: -1
SQL: UPDATE `news` SET `views`=25499 WHERE `id`=83796
2021-08-21 14:27:19 0 [ERROR] WSREP: invalid state ROLLED_BACK (FATAL)
2021-08-21 14:27:19 0 [ERROR] WSREP: cancel commit bad exit: 7 514275587
210821 14:27:19 [ERROR] mysqld got signal 6 ;
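The "Victim thread" line suggests that a local transaction was aborted in favour of a replicated one. To see how often that happens on each node, the standard Galera status counters could be compared across the three nodes. This is a minimal sketch, assuming these wsrep status variables are exposed by the MariaDB build in use; it only reads counters and does not change anything.

```sql
-- Run on each node and compare the values.
-- Local transactions aborted by replicated writesets (brute-force aborts):
SHOW GLOBAL STATUS LIKE 'wsrep_local_bf_aborts';
-- Local transactions that failed certification at commit time:
SHOW GLOBAL STATUS LIKE 'wsrep_local_cert_failures';
-- Sanity checks: number of nodes in the cluster and this node's state:
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';
SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';
```

If the first two counters keep rising on more than one node, that would point to the same rows being written through different nodes at around the same time.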
The 'views' column is updated by queued jobs, so it should be impossible for this update to be executed by multiple instances for one user within x seconds.
All nodes are clones of one master image, meaning that software-wise they are exactly the same VMs. I have attached the logs of the other two nodes from the time of the crash.
All connections to the DB are handled by Galera Load Balancer. At first this helped a lot, but now that the new node has been added, the problem has returned.