[MDEV-7610] MariaDB Galera Cluster Table locking issue after mysqldump execution - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Won't Fix
Affects Version/s: 5.5.41-galera
Fix Version/s: N/A
Component/s: Galera
Labels:
- galera
- mariadb
- mysqldump
Environment:
CentOS release 6.5 2.6.32-431.3.1.el6.x86_64
VM, 4GB RAM
MariaDB Galera Cluster consists of 3 nodes

Description

We updated MariaDB Galera Version to 5.5.41 on 19th of January.
We start a daily mysqldump on 21:00 CET. This worked for ~ 1 year without any issues.
Last night (18th of FEB), the following issue happend at 21:00 (9 p.m.):

mysqldump starts on Node 1 with the following parameters:
mysqldump -uroot -p --create-options {DATABASE_NAME}
-f

simultaneously, the following query did not complete on Node 1 and remains with the state "sql end" on the server:
INSERT INTO ` {DBNAME}`.`{TABLE_NAME}` (isin, date, count )\n VALUES ('XXXX12345', CURDATE() , 1 )\n ON DUPLICATE KEY UPDATE count=count+1

* full mysqldump on Node 1 completed with a duration of only a minute without any issues like every day, dumps are valid and complete

* new incoming connections on Node 1 try to access the same table, but are waiting in state "Waiting for table level lock"
INSERT INTO `{DBNAME}
`.`
{TABLE_NAME}
` (isin, date, count )\n VALUES ('YYYY67890', CURDATE() , 1 )\n ON DUPLICATE KEY UPDATE count=count+1

INSERT/UPDATE statements on the same table on Node 2/3 do also not complete. Sessions are haning with similar statements in states "update" and "query end", first sessions appeared on 21:00

Number of connections increases on all 3 nodes, since all nodes are waiting for first INSERT statement on Node 1 to complete, therefore replication stopped.

Nodes 2 and 3 ran out of free connections and resources at ~ 21:10

INSERT statement on Node 1 could not be killed, whole Cluster needed to be shut down and restarted

Errorlogs do not show any issues until the high amount of connections leads to service unavailability

Nodes did not write any Core Dumps

This Incident is (hopefully) not reproducable, as already mentioned the cluster ran wihtout any issues about a year, the mysqldump is made every day at the same time.

Attachments

Activity

Ascending order - Click to sort in descending order

Elena Stepanova added a comment - 2015-02-19 14:34

Assigning to nirbhay_c to see if it rings a bell.

Elena Stepanova added a comment - 2015-02-19 14:34 Assigning to nirbhay_c to see if it rings a bell.

Nirbhay Choubey (Inactive) added a comment - 2015-05-07 05:06

I tried to reproduce this issue by performing a dump with records being inserted simultaneously. I found no issues whatsoever.
Would you mind sharing the table structures? Also, are there myisam tables too?

Nirbhay Choubey (Inactive) added a comment - 2015-05-07 05:06 I tried to reproduce this issue by performing a dump with records being inserted simultaneously. I found no issues whatsoever. Would you mind sharing the table structures? Also, are there myisam tables too?

Tobias Kleinke added a comment - 2015-05-07 13:46

Hi Nirbhay,
thanks for your response.

I am afraid a cannot provide detailed meta-data information here, but this is what I can tell you:

database has 83 tables, 78 are InnoDB, 5 are myISAM (we set "wsrep_replicate_myisam = 1" in my.cnf - I know this is still experimental...)
locked table was an InnoDB table and has about 2 million rows (data length: ~ 140 MB) and is the biggest table of the database
locked table has 3 columns with the following types: varchar(15), int(11), date / primary key is: varchar(15), date

Regards,
Tobias

Tobias Kleinke added a comment - 2015-05-07 13:46 Hi Nirbhay, thanks for your response. I am afraid a cannot provide detailed meta-data information here, but this is what I can tell you: database has 83 tables, 78 are InnoDB, 5 are myISAM (we set "wsrep_replicate_myisam = 1" in my.cnf - I know this is still experimental...) locked table was an InnoDB table and has about 2 million rows (data length: ~ 140 MB) and is the biggest table of the database locked table has 3 columns with the following types: varchar(15), int(11), date / primary key is: varchar(15), date Regards, Tobias

Ralf Gebhardt added a comment - 2020-06-02 06:28

Unsupported version. Please reopen the issue if you face the same issue with MariaDB Server 10.1+

Ralf Gebhardt added a comment - 2020-06-02 06:28 Unsupported version. Please reopen the issue if you face the same issue with MariaDB Server 10.1+

People

Assignee:: Jan Lindström (Inactive)

Reporter:: Tobias Kleinke

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2015-02-19 14:29

Updated:: 2020-06-02 06:28

Resolved:: 2020-06-02 06:28

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.