[MDEV-24382] Assertion `thd->mdl_context.is_lock_owner(MDL_key::TABLE, db, table_name, MDL_EXCLUSIVE)' failed in tdc_remove_table Created: 2020-12-09  Updated: 2021-05-03

Status: Stalled
Project: MariaDB Server
Component/s: Locking, Sequences
Affects Version/s: 10.5, 10.6
Fix Version/s: 10.5

Type: Bug Priority: Major
Reporter: Roel Van de Paar Assignee: Michael Widenius
Resolution: Unresolved Votes: 0
Labels: not-10.1, not-10.2, regression

Issue Links:
Relates
relates to MDEV-19273 Server crash in MDL_ticket::has_stron... Closed

 Description   

Similar stack to MDEV-19273, and same setup (DROP SEQUENCE <non-sequence>), with this bug present seemingly in 10.5 and 10.6 only.

CREATE TABLE t (c INT);
LOCK TABLES t READ LOCAL;
CREATE TEMPORARY TABLE t (a INT) SELECT 1 AS a;
DROP SEQUENCE t;

Leads to:

10.6.0 5d4599f9750140f92cfdbbe4d292ae1b8dd456f8 (Debug)

mysqld: /test/10.6_dbg/sql/table_cache.cc:1030: void tdc_remove_table(THD*, const char*, const char*): Assertion `thd->mdl_context.is_lock_owner(MDL_key::TABLE, db, table_name, MDL_EXCLUSIVE)' failed.

10.6.0 5d4599f9750140f92cfdbbe4d292ae1b8dd456f8 (Debug)

Core was generated by `/test/MD211020-mariadb-10.6.0-linux-x86_64-dbg/bin/mysqld --no-defaults --core-'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
    at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
[Current thread is 1 (Thread 0x14d6a815d700 (LWP 2018263))]
(gdb) bt
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
#1  0x000055b1cfcb6021 in my_write_core (sig=sig@entry=6) at /test/10.6_dbg/mysys/stacktrace.c:424
#2  0x000055b1cf3fd321 in handle_fatal_signal (sig=6) at /test/10.6_dbg/sql/signal_handler.cc:330
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x000014d6b27c7859 in __GI_abort () at abort.c:79
#6  0x000014d6b27c7729 in __assert_fail_base (fmt=0x14d6b295d588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55b1cfe64080 "thd->mdl_context.is_lock_owner(MDL_key::TABLE, db, table_name, MDL_EXCLUSIVE)", file=0x55b1cfe76538 "/test/10.6_dbg/sql/table_cache.cc", line=1030, function=<optimized out>) at assert.c:92
#7  0x000014d6b27d8f36 in __GI___assert_fail (assertion=assertion@entry=0x55b1cfe64080 "thd->mdl_context.is_lock_owner(MDL_key::TABLE, db, table_name, MDL_EXCLUSIVE)", file=file@entry=0x55b1cfe76538 "/test/10.6_dbg/sql/table_cache.cc", line=line@entry=1030, function=function@entry=0x55b1cfe76a90 "void tdc_remove_table(THD*, const char*, const char*)") at assert.c:101
#8  0x000055b1cf35e55b in tdc_remove_table (thd=thd@entry=0x14d658000db8, db=0x14d658012df8 "test", table_name=0x14d6580126f0 "t") at /test/10.6_dbg/sql/table_cache.cc:1030
#9  0x000055b1cf1fdf41 in mysql_rm_table_no_locks (thd=thd@entry=0x14d658000db8, tables=tables@entry=0x14d658012728, if_exists=if_exists@entry=false, drop_temporary=drop_temporary@entry=false, drop_view=drop_view@entry=false, drop_sequence=drop_sequence@entry=true, dont_log_query=false, dont_free_locks=false) at /test/10.6_dbg/sql/sql_table.cc:2446
#10 0x000055b1cf1ffe2e in mysql_rm_table (thd=thd@entry=0x14d658000db8, tables=tables@entry=0x14d658012728, if_exists=<optimized out>, drop_temporary=<optimized out>, drop_sequence=<optimized out>, dont_log_query=dont_log_query@entry=false) at /test/10.6_dbg/sql/sql_table.cc:2143
#11 0x000055b1cf13d646 in mysql_execute_command (thd=thd@entry=0x14d658000db8) at /test/10.6_dbg/sql/structs.h:559
#12 0x000055b1cf126fd2 in mysql_parse (thd=thd@entry=0x14d658000db8, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x14d6a815c3d0) at /test/10.6_dbg/sql/sql_parse.cc:7833
#13 0x000055b1cf1350c7 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x14d658000db8, packet=packet@entry=0x14d658008ce9 "DROP SEQUENCE t", packet_length=packet_length@entry=15) at /test/10.6_dbg/sql/sql_class.h:1253
#14 0x000055b1cf1383d2 in do_command (thd=0x14d658000db8) at /test/10.6_dbg/sql/sql_parse.cc:1343
#15 0x000055b1cf292994 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x55b1d37d74d8, put_in_cache=put_in_cache@entry=true) at /test/10.6_dbg/sql/sql_connect.cc:1410
#16 0x000055b1cf29309b in handle_one_connection (arg=arg@entry=0x55b1d37d74d8) at /test/10.6_dbg/sql/sql_connect.cc:1312
#17 0x000055b1cf746abb in pfs_spawn_thread (arg=0x55b1d3723688) at /test/10.6_dbg/storage/perfschema/pfs.cc:2201
#18 0x000014d6b2cd5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#19 0x000014d6b28c4293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Bug confirmed present in:
MariaDB: 10.5.7 (dbg), 10.6.0 (dbg)

Bug (or feature/syntax) confirmed not present in:
MariaDB: 10.1.48 (dbg), 10.1.48 (opt), 10.2.35 (dbg), 10.2.35 (opt), 10.3.26 (dbg), 10.3.26 (opt), 10.4.16 (dbg), 10.4.16 (opt), 10.5.7 (opt), 10.6.0 (opt)
MySQL: 5.5.62 (dbg), 5.5.62 (opt), 5.6.50 (dbg), 5.6.50 (opt), 5.7.32 (dbg), 5.7.32 (opt), 8.0.22 (dbg), 8.0.22 (opt)



 Comments   
Comment by Roel Van de Paar [ 2020-12-09 ]

Repeating the last testcase in this bug in the CLI will make both 10.5 and 10.6 optimized builds hang . Re-entry with secondary CLI connection is possible and processlist output below. mysqladmin shutdown will also hang when attempted at this point.

CREATE TABLE t (c INT);                                                                            
LOCK TABLES t READ LOCAL;                                                                          
CREATE TEMPORARY TABLE t (a INT) SELECT 1 AS a;                                                    
DROP SEQUENCE t;   

Will lead to:

10.6.0 5d4599f9750140f92cfdbbe4d292ae1b8dd456f8 (Optimized)

10.6.0>show processlist;
+----+------+-----------+------+---------+------+----------------------+------------------+----------+
| Id | User | Host      | db   | Command | Time | State                | Info             | Progress |
+----+------+-----------+------+---------+------+----------------------+------------------+----------+
|  4 | root | localhost | test | Query   |  239 | checking permissions | DROP SEQUENCE t  |    0.000 |
|  5 | root | localhost | test | Query   |    0 | starting             | show processlist |    0.000 |
+----+------+-----------+------+---------+------+----------------------+------------------+----------+
2 rows in set (0.000 sec)

Also seen (note killed):

10.5.7 e3fc9c1db04bdc5566bcdafcc3d5159199089c00 (Optimized)

10.5.7>show processlist;  
+----+------+-----------+------+---------+------+----------------------+------------------+----------+
| Id | User | Host      | db   | Command | Time | State                | Info             | Progress |
+----+------+-----------+------+---------+------+----------------------+------------------+----------+
|  4 | root | localhost | test | Killed  |  224 | checking permissions | DROP SEQUENCE t  |    0.000 |
|  8 | root | localhost | test | Query   |    0 | starting             | show processlist |    0.000 |
+----+------+-----------+------+---------+------+----------------------+------------------+----------+
2 rows in set (0.000 sec)

Also confirmed 10.3 + 10.4 optimized are not affected by this hang.

Comment by Vladislav Vaintroub [ 2021-04-28 ]

This does not happen anymore, somewhere between 10.5.9 and now some happy patch fixed that. Added test case, to catch in case it happens again

Comment by Roel Van de Paar [ 2021-04-29 ]

Confirmed. Outcome is now;

10.6.1 9db14e93acc4ec9023d50686c66dbef7d4d8c15c (Debug)

10.6.1-dbg>CREATE TABLE t (c INT);
Query OK, 0 rows affected (0.012 sec)
 
10.6.1-dbg>LOCK TABLES t READ LOCAL;
Query OK, 0 rows affected (0.001 sec)
 
10.6.1-dbg>CREATE TEMPORARY TABLE t (a INT) SELECT 1 AS a;
Query OK, 1 row affected (0.002 sec)
Records: 1  Duplicates: 0  Warnings: 0
 
10.6.1-dbg>DROP SEQUENCE t;
ERROR 4091 (42S02): Unknown SEQUENCE: 'test.t'
10.6.1-dbg>

Same result on 10.6.1 optimized build.
Same result on 10.5.10, for both dbg+opt.

Comment by Roel Van de Paar [ 2021-04-29 ]

wlad Interestingly, 10.4.19 gives a different error code and different message:

10.4.19 9e6310e323a3abd411f852ff57fe41c9f149ff3a (Debug)

10.4.19-dbg>DROP SEQUENCE t; 
ERROR 4090 (42S02): 'test.t' is not a SEQUENCE

Should I log a ticket for consolidation of the two?

Comment by Vladislav Vaintroub [ 2021-04-29 ]

Roel, that's your choice, but why you are asking me, I'm not a sequence guy..
there are 3 error messages basically telling the same thing

ER_NOT_SEQUENCE 42S02
        eng "'%-.64s.%-.64s' is not a SEQUENCE"
ER_NOT_SEQUENCE2 42S02
        eng "'%-.192s' is not a SEQUENCE"
ER_UNKNOWN_SEQUENCES 42S02
        eng "Unknown SEQUENCE: '%-.300s'"

All introduced by the same person , on the same day

C:\work\10.5>git blame sql\share\errmsg-utf8.txt |findstr SEQUENCE
Blaming lines: 100% (7979/7979), done.
17a87d606302 (Monty                              2017-03-25 23:36:56 +0200 7760) ER_SEQUENCE_RUN_OUT
17a87d606302 (Monty                              2017-03-25 23:36:56 +0200 7762) ER_SEQUENCE_INVALID_DATA
17a87d606302 (Monty                              2017-03-25 23:36:56 +0200 7764) ER_SEQUENCE_INVALID_TABLE_STRUCTURE
17a87d606302 (Monty                              2017-03-25 23:36:56 +0200 7766) ER_SEQUENCE_ACCESS_ERROR
17a87d606302 (Monty                              2017-03-25 23:36:56 +0200 7768) ER_SEQUENCE_BINLOG_FORMAT
17a87d606302 (Monty                              2017-03-25 23:36:56 +0200 7770) ER_NOT_SEQUENCE 42S02
17a87d606302 (Monty                              2017-03-25 23:36:56 +0200 7771)         eng "'%-.64s.%-.64s' is not a SEQUENCE"
17a87d606302 (Monty                              2017-03-25 23:36:56 +0200 7772) ER_NOT_SEQUENCE2 42S02
17a87d606302 (Monty                              2017-03-25 23:36:56 +0200 7773)         eng "'%-.192s' is not a SEQUENCE"
17a87d606302 (Monty                              2017-03-25 23:36:56 +0200 7774) ER_UNKNOWN_SEQUENCES 42S02
17a87d606302 (Monty                              2017-03-25 23:36:56 +0200 7775)         eng "Unknown SEQUENCE: '%-.300s'"
6a779a6d28fa (Monty                              2017-05-23 17:18:31 +0300 7778) ER_WRONG_INSERT_INTO_SEQUENCE

May there is a logic in it, and it makes sense to ask someone who created it, first.

Comment by Roel Van de Paar [ 2021-04-29 ]

I am re-opening this bug, as something is not clear with the 'fix', as follows:

Simply executing DROP SEQUENCE t; against recent (all build this week) 10.3, 10.4, 10.5 and 10.6 instances always returns the same error:

10.3.29-dbg>DROP SEQUENCE t;
ERROR 4091 (42S02): Unknown SEQUENCE: 'test.t'
10.4.19-dbg>DROP SEQUENCE t;
ERROR 4091 (42S02): Unknown SEQUENCE: 'test.t'
10.5.10-dbg>DROP SEQUENCE t;
ERROR 4091 (42S02): Unknown SEQUENCE: 'test.t'
10.6.1-dbg>DROP SEQUENCE t;
ERROR 4091 (42S02): Unknown SEQUENCE: 'test.t'

However, when the testcase above is used, the outcomes are different between 10.3/10.4 versus 10.5/10.6:

(The following outcomes assume that the full original testcase above is used instead of only DROP SEQUENCE):

10.3.29-dbg>DROP SEQUENCE t;
ERROR 4090 (42S02): 'test.t' is not a SEQUENCE
10.4.19-dbg>DROP SEQUENCE t;
ERROR 4090 (42S02): 'test.t' is not a SEQUENCE
10.5.10-dbg>DROP SEQUENCE t;
ERROR 4091 (42S02): Unknown SEQUENCE: 'test.t'
10.6.1-dbg>DROP SEQUENCE t;
ERROR 4091 (42S02): Unknown SEQUENCE: 'test.t'

This means that some different codepath or similar is used between the versions, and also that, for 10.3 and 10.4, the server is still influenced by the existence of a t table. From experience, something like this is quite likely "crashable" using different SQL, and that most likely on 10.3-10.4.

Can a SEQUENCE developer please check/confirm what the reason is for the different message between 10.4 and 10.5/10.6?

Comment by Roel Van de Paar [ 2021-04-29 ]

One thought is that whatever patch fixed this did not get applied to 10.3/10.4

Comment by Vladislav Vaintroub [ 2021-04-29 ]

Another thought is , whatever patch it was, it did not crash in 10.3-10.4

Generated at Thu Feb 08 09:29:34 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.