[MDEV-13996] DELETE of just hundreds of thousands of rows on a table on Node 1 blocks *any* writes on Node 2,3,... for 20s+ (Depending on power) Created: 2017-10-04  Updated: 2020-08-25  Resolved: 2017-10-26

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.2.9
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: Claudio Nanni Assignee: Andrii Nikitin (Inactive)
Resolution: Not a Bug Votes: 0
Labels: None
Environment:

Linux


Attachments: File simplified.sql.gz     File testcase_scr.sql.gz    

 Description   

Deleting records from a table blocks all other nodes from writing to any database or table after the DELETE finishes on the originating node, for a time proportional to the number of deleted records.
On an i3 4-core, SSD, 12GB box, deleting just 500,000 records causes the other nodes to be blocked for 30s+.
The reported query state depends on the type of query: an INSERT will be stuck in "query end", DDL will be stuck in "checking permissions":

3526	root	localhost:39454	mysqlslap	Query	40	query end	INSERT INTO t1 VALUES (1946710049,599529154,'hmu3lkoxjtvgLu5xKHSTTtJuGE5F5QqmCcppCTmvFZScRZQgim93gSx	0.0003

OR

3365	root	localhost:38408	NULL	Query	31	checking permissions	DROP SCHEMA IF EXISTS `mysqlslap`	0

To reproduce:

NODE 1:

Import attached data in db1

use db1;
# Dropping the FK constraints beforehand or not makes no difference
DELETE FROM IT9a932d5519;
# Wait for it to finish

Now on NODE 2 or 3: (this does not happen on NODE 1)

use test;
CREATE TABLE p (id int);
INSERT INTO p VALUES (123);

Either statement will be blocked for a long time; if the table was already created, it is the INSERT that blocks.

Bonus problem
In some cases I got:

MariaDB [test]> SELECT NOW(); insert into p values (122); SELECT NOW();
+---------------------+
| NOW()               |
+---------------------+
| 2017-10-03 14:09:55 |
+---------------------+
1 row in set (0.00 sec)
 
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
+---------------------+
| NOW()               |
+---------------------+
| 2017-10-03 14:10:42 |
+---------------------+
1 row in set (0.00 sec)

In that case, another simple INSERT into the same table was running on another node.

MariaDB [test]> show create table p\G
*************************** 1. row ***************************
       Table: p
Create Table: CREATE TABLE `p` (
  `id` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)



 Comments   
Comment by Andrii Nikitin (Inactive) [ 2017-10-26 ]

Closing with 'Not a bug' because the user manual https://mariadb.com/kb/en/library/mariadb-galera-cluster-known-limitations/ explicitly states "Do not use transactions of any essential size".

While the reason behind that particular entry is memory usage, the following quote reveals that all transactions on every node are blocked until every node has certified the incoming (huge) transaction:
http://galeracluster.com/documentation-webpages/certificationbasedreplication.html
"
When the client issues a COMMIT command, but before the actual commit occurs, all changes made to the database by the transaction and primary keys of the changed rows are collected into a write-set. The database then sends this write-set to all the other nodes.
The write-set then undergoes a deterministic certification test, using the primary keys. This is done on each node in the cluster, including the node that originates the write-set. It determines whether or not the node can apply the write-set.
"
And later:
"
Galera Cluster assigns each transaction a global ordinal sequence number, or seqno, during replication. When a transaction reaches the commit point, the node checks the sequence number against that of the last successful transaction. The interval between the two is the area of concern, given that transactions that occur within this interval have not seen the effects of each other. All transactions in this interval are checked for primary key conflicts with the transaction in question.
"

This means that, in the scenario from the ticket description, all nodes are by design blocked while Galera transfers 500,000 primary key values to each node and performs certification of that huge transaction.
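To observe this certification stall on a blocked node, one can inspect the standard Galera status variables while the DELETE is being replicated (a diagnostic sketch; the exact values seen will depend on the workload):

```sql
-- Run on a blocked node (NODE 2 or 3) while the huge write-set is being certified/applied:
SHOW STATUS LIKE 'wsrep_local_recv_queue';    -- replicated write-sets waiting to be applied
SHOW STATUS LIKE 'wsrep_flow_control_paused'; -- fraction of time replication was paused by flow control
SHOW PROCESSLIST;                             -- local INSERTs stuck in 'query end'
```

A growing recv queue together with a rising flow-control pause fraction is consistent with the cluster-wide stall described above.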

Comment by Andrii Nikitin (Inactive) [ 2017-10-26 ]

The workaround is to use TRUNCATE TABLE, or to delete in smaller batches, e.g. loop with:
DELETE FROM IT9a932d5519 LIMIT 10000;
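The batched-delete loop above can be sketched as a stored procedure (the table name and batch size come from this comment; the procedure name and the ROW_COUNT()-driven loop are an illustrative assumption, not a tuned recipe):

```sql
-- Sketch: delete in batches so each replicated write-set stays small.
-- 10000 is the batch size suggested above, not a benchmarked value.
DELIMITER //
CREATE PROCEDURE batched_delete()
BEGIN
  DECLARE affected INT DEFAULT 1;
  WHILE affected > 0 DO
    DELETE FROM IT9a932d5519 LIMIT 10000;
    SET affected = ROW_COUNT();  -- rows deleted by the last batch
  END WHILE;
END //
DELIMITER ;

CALL batched_delete();
```

Each batch commits as its own transaction, so certification on the other nodes only ever has to process 10,000 primary keys at a time instead of 500,000.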
