[MDEV-17883] CREATE TABLE IF NOT EXISTS locking changes in 10.3.10 Created: 2018-11-30  Updated: 2019-05-08  Resolved: 2019-05-03

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.2.18, 10.3.10, 10.1.37
Fix Version/s: 10.2.24, 10.1.41, 10.3.15, 10.4.5

Type: Bug Priority: Critical
Reporter: Chris Wilson Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

Linux



 Description   

I have a workload that does partitioning manually (largely because at the end of each day I want to 'optimize table' for the previous day's partition to address fragmentation). As a result there's a stored procedure, that prepares a statement of the form "create table if not exists table_20181130 like table_template" and runs it. This procedure is called rather a lot and >99% of the time the table already exists so this doesn't do anything. This has been working great up-to and including MariaDB 10.3.9.

Upon upgrading to 10.3.10 (or 11) I find that performance of my system is abysmal. And this seems to be down to this procedure - the create table if not exists doesn't return immediately like it used to if the table exists and any other query has the table open. I can work around this by looking in information_schema.TABLES and avoiding the 'create table if not exists' if the table already exists - however this feels like a regression to me rather than an intentional change, especially as I don't see anything obviously related in the 10.3.10 ChangeLog.

To recreate; in one terminal execute:

create table if not exists test1 (a int);
select sum(a), sleep(60) from test1;

... and in another:

create table if not exists test1 (a int);

In anything before 10.3.10 the second terminal will complete immediately, but in 10.3.10 (or 11) the second terminal will only complete once the select in the first terminal has done so.



 Comments   
Comment by Chris Wilson [ 2018-11-30 ]

Additionally: this doesn't seem to be specific to a particular storage engine.

Comment by Elena Stepanova [ 2018-12-07 ]

Thanks for the report.
The change was introduced by this merge:
https://github.com/MariaDB/server/commit/4d93fea4e0b8fa602c90b9afe6d4c67f09678d74

Particularly by this change:

diff --git a/sql/sql_base.cc b/sql/sql_base.cc
index 15cfba80340f..34aef26ebc21 100644
--- a/sql/sql_base.cc
+++ b/sql/sql_base.cc
@@ -3813,6 +3813,10 @@ lock_table_names(THD *thd, const DDL_options_st &options,
     mdl_requests.push_front(&global_request);
 
     if (create_table)
+    #ifdef WITH_WSREP
+      if (thd->lex->sql_command != SQLCOM_CREATE_TABLE && 
+          thd->wsrep_exec_mode != REPL_RECV)
+    #endif
       lock_wait_timeout= 0;                     // Don't wait for timeout
   }

Here is a test for reproducing purposes only. It needs to be improved before adding it to the regression suite, and we also need to find out why existing tests didn't catch it.

create table if not exists test1 (a int);
send select sum(a), sleep(5) from test1;
 
--connect (con1,localhost,root,,)
send create table if not exists test1 (a int);
 
--connect (con2,localhost,root,,)
sleep 1;
let $wait_timeout= 2;
let $wait_for_all= 1;
let $show_statement= SHOW PROCESSLIST;
let $field= State;
let $condition= != 'Waiting for table metadata lock';
--source include/wait_show_condition.inc
 
# Cleanup
--disconnect con2
--connection con1
--reap
--disconnect con1
--connection default
--reap
drop table test1;

The test passes either way, but on a good version it passes with the expected output

Expected result

create table if not exists test1 (a int);
select sum(a), sleep(5) from test1;
connect  con1,localhost,root,,;
create table if not exists test1 (a int);
connect  con2,localhost,root,,;
disconnect con2;
connection con1;
Warnings:
Note	1050	Table 'test1' already exists
disconnect con1;
connection default;
sum(a)	sleep(5)
NULL	0
drop table test1;

Please keep in mind that the test is not 100% reliable, it can create false negatives. Don't trust the first pass.

while on a broken version it also produces a wait timeout and hence mismatch:

 connect  con1,localhost,root,,;
 create table if not exists test1 (a int);
 connect  con2,localhost,root,,;
+# Timeout in include/wait_show_condition.inc for != 'Waiting for table metadata lock'
+#         show_statement : SHOW PROCESSLIST
+#         field          : State
+#         condition      : != 'Waiting for table metadata lock'
+#         max_run_time   : 3
+SHOW PROCESSLIST;
+Id	User	Host	db	Command	Time	State	Info	Progress
+4	root	localhost	test	Query	4	User sleep	select sum(a), sleep(5) from test1	0.000
+5	root	localhost	test	Query	4	Waiting for table metadata lock	create table if not exists test1 (a int)	0.000
+6	root	localhost	test	Query	0	Init	SHOW PROCESSLIST	0.000
 disconnect con2;
 connection con1;
 Warnings:

For 10.1 / 10.2 it was done by a separate commit:
https://github.com/MariaDB/server/commit/16384fae6

Comment by Mario Karuza (Inactive) [ 2019-04-26 ]

Fix is done in PR associated with this ticket. Please retest.

Comment by Jan Lindström (Inactive) [ 2019-05-03 ]

Failed to find reliable test case.

Generated at Thu Feb 08 08:39:51 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.