[MDEV-12647] Galera + LOCK TABLES deadlock Created: 2017-04-30  Updated: 2017-04-30  Resolved: 2017-04-30

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: None
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Andrew Garner Assignee: Unassigned
Resolution: Not a Bug Votes: 0
Labels: None
Environment: MariaDB 10.1.18 / Galera 25.3.17

Attachments: galera_bug_gdb_backtrace.txt

 Description   

With the following sequence of commands, node2 consistently gets stuck and all subsequent writes to that node hang.

Node1:

mariadb[node1]> CREATE TABLE t1 (id INT PRIMARY KEY);
Query OK, 0 rows affected (0.02 sec)
 
mariadb[node1]> LOCK TABLES t1 WRITE;
Query OK, 0 rows affected (0.00 sec)

Node2:

mariadb[node2]> LOCK TABLE t1 WRITE;
Query OK, 0 rows affected (0.00 sec)
 
mariadb[node2]> SELECT * FROM t1;
Empty set (0.00 sec)

Node1:

mariadb[node1]> INSERT INTO t1 VALUES (1);
Query OK, 1 row affected (0.01 sec)

Node2:

mariadb[node2]> INSERT INTO t1 VALUES (2);
-- ^ never returns 

Node1:

mariadb[node1]> UNLOCK TABLES;
Query OK, 0 rows affected (0.00 sec)
 
mariadb[node1]> SELECT * FROM t1;
+----+
| id |
+----+
|  1 |
|  2 |
+----+
2 rows in set (0.00 sec)

At this point, every write on node2 hangs. Even after UNLOCK TABLES on node1, the INSERT on node2 remains hung, and the node2 connection holding the table lock cannot be terminated with KILL. I've attached gdb "thread apply all bt" output in case it is useful.

I do see in my error log on node2 that an abort was attempted:

[Note] WSREP: MDL conflict db=foo table=t1 ticket=7 solved by abort 

I expect that behavior, but it did not unstick this particular case. Also worth mentioning: if I continue to write to the other cluster nodes, wsrep_local_recv_queue keeps rising on node2. I expected flow control to kick in at some point, given that this node's state is reported as "Synced" and it is using defaults (i.e. gcs.fc_limit=16).
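
The queue growth and flow-control state described above can be inspected on node2 with the standard wsrep status variables. This is a minimal sketch, not output from the affected cluster; it assumes a connection to node2 that can still run read-only statements:

```sql
-- Node state as reported by the provider (e.g. "Synced").
SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';

-- Current and average length of the receive queue on this node.
SHOW GLOBAL STATUS LIKE 'wsrep_local_recv_queue%';

-- Flow-control counters: fraction of time replication was paused,
-- and number of flow-control pause messages sent by this node.
SHOW GLOBAL STATUS LIKE 'wsrep_flow_control_%';

-- Provider options string; gcs.fc_limit appears inside this value.
SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options';
```

Running the status queries repeatedly while writing to the other nodes would show whether the receive queue passes gcs.fc_limit without the flow-control counters moving.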

Also reproduced under MariaDB 10.1.22 (tested with the latest Galera, 25.3.20), but the attached logs are from an older MariaDB 10.1.18 (Galera 25.3.17) environment.



 Comments   
Comment by Daniel Black [ 2017-04-30 ]

LOCK TABLES is unsupported with Galera, as per https://mariadb.com/kb/en/mariadb/mariadb-galera-cluster-known-limitations/
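
The known-limitations page suggests using transactions in place of explicit table locks. A minimal sketch of the node-side writes from the report in that style, assuming the same table t1 and that the application retries on a deadlock error (this is an illustration, not a tested fix for the hang above):

```sql
-- Node1: replace LOCK TABLES t1 WRITE ... UNLOCK TABLES with a transaction.
START TRANSACTION;
INSERT INTO t1 VALUES (1);
COMMIT;

-- Node2: same pattern. If this write set conflicts with one from another
-- node, the statement or the COMMIT can fail with a deadlock error
-- (ER_LOCK_DEADLOCK, 1213) and the transaction should be retried.
START TRANSACTION;
INSERT INTO t1 VALUES (2);
COMMIT;
```

Unlike LOCK TABLES, this does not give a session exclusive access to t1 across nodes. Conflicts are instead detected when the write set is certified at commit time.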
