[MDEV-7610] MariaDB Galera Cluster Table locking issue after mysqldump execution Created: 2015-02-19  Updated: 2020-06-02  Resolved: 2020-06-02

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 5.5.41-galera
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Tobias Kleinke Assignee: Jan Lindström (Inactive)
Resolution: Won't Fix Votes: 0
Labels: galera, mariadb, mysqldump
Environment:

CentOS release 6.5 2.6.32-431.3.1.el6.x86_64
VM, 4GB RAM
MariaDB Galera Cluster consists of 3 nodes



 Description   

We updated MariaDB Galera Version to 5.5.41 on 19th of January.
We start a daily mysqldump on 21:00 CET. This worked for ~ 1 year without any issues.
Last night (18th of FEB), the following issue happend at 21:00 (9 p.m.):

  • mysqldump starts on Node 1 with the following parameters:
    mysqldump -uroot -p --create-options {DATABASE_NAME}

    -f

  • simultaneously, the following query did not complete on Node 1 and remains with the state "sql end" on the server:
    INSERT INTO ` {DBNAME}`.`{TABLE_NAME}` (isin, date, count )\n VALUES ('XXXX12345', CURDATE() , 1 )\n ON DUPLICATE KEY UPDATE count=count+1

    * full mysqldump on Node 1 completed with a duration of only a minute without any issues like every day, dumps are valid and complete

    * new incoming connections on Node 1 try to access the same table, but are waiting in state "Waiting for table level lock"
    INSERT INTO `{DBNAME}

    `.`

    {TABLE_NAME}

    ` (isin, date, count )\n VALUES ('YYYY67890', CURDATE() , 1 )\n ON DUPLICATE KEY UPDATE count=count+1

  • INSERT/UPDATE statements on the same table on Node 2/3 do also not complete. Sessions are haning with similar statements in states "update" and "query end", first sessions appeared on 21:00
  • Number of connections increases on all 3 nodes, since all nodes are waiting for first INSERT statement on Node 1 to complete, therefore replication stopped.
  • Nodes 2 and 3 ran out of free connections and resources at ~ 21:10
  • INSERT statement on Node 1 could not be killed, whole Cluster needed to be shut down and restarted
  • Errorlogs do not show any issues until the high amount of connections leads to service unavailability
  • Nodes did not write any Core Dumps
  • This Incident is (hopefully) not reproducable, as already mentioned the cluster ran wihtout any issues about a year, the mysqldump is made every day at the same time.


 Comments   
Comment by Elena Stepanova [ 2015-02-19 ]

Assigning to nirbhay_c to see if it rings a bell.

Comment by Nirbhay Choubey (Inactive) [ 2015-05-07 ]

I tried to reproduce this issue by performing a dump with records being inserted simultaneously. I found no issues whatsoever.
Would you mind sharing the table structures? Also, are there myisam tables too?

Comment by Tobias Kleinke [ 2015-05-07 ]

Hi Nirbhay,
thanks for your response.

I am afraid a cannot provide detailed meta-data information here, but this is what I can tell you:

  • database has 83 tables, 78 are InnoDB, 5 are myISAM (we set "wsrep_replicate_myisam = 1" in my.cnf - I know this is still experimental...)
  • locked table was an InnoDB table and has about 2 million rows (data length: ~ 140 MB) and is the biggest table of the database
  • locked table has 3 columns with the following types: varchar(15), int(11), date / primary key is: varchar(15), date

Regards,
Tobias

Comment by Ralf Gebhardt [ 2020-06-02 ]

Unsupported version. Please reopen the issue if you face the same issue with MariaDB Server 10.1+

Generated at Thu Feb 08 07:20:55 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.