[MDEV-30364] Assertion `thd->mdl_context.is_lock_owner(MDL_key::TABLE, share->db.str, share->table_name.str, MDL_EXCLUSIVE)' failed in TDC_element::flush
Created: 2023-01-09  Updated: 2023-11-28

Status: Confirmed
Project: MariaDB Server
Component/s: Locking, Storage Engine - InnoDB
Affects Version/s: 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0
Fix Version/s: 10.4, 10.5, 10.6, 10.11, 11.0

Type: Bug
Priority: Critical
Reporter: Roel Van de Paar
Assignee: Oleksandr Byelkin
Resolution: Unresolved
Votes: 0
Labels: regression

Issue Links:
Problem/Incident
is caused by MDEV-29144 ER_TABLE_SCHEMA_MISMATCH or InnoDB: F... Closed

 Description   

CREATE TABLE t (c INT) ENGINE=InnoDB;
LOCK TABLE t WRITE;
ALTER TABLE t DISCARD TABLESPACE;

Leads to:

11.0.1 b075191ba8598af6aff5549e6e19f6255aef258a (Debug)

mysqld: /test/11.0_dbg/sql/table_cache.cc:1259: void TDC_element::flush(THD*, bool): Assertion `thd->mdl_context.is_lock_owner(MDL_key::TABLE, share->db.str, share->table_name.str, MDL_EXCLUSIVE)' failed.

11.0.1 b075191ba8598af6aff5549e6e19f6255aef258a (Debug)

Core was generated by `/test/MD090123-mariadb-11.0.1-linux-x86_64-dbg/bin/mysqld --no-defaults --core-'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=22720988378688)
    at ./nptl/pthread_kill.c:44
[Current thread is 1 (Thread 0x14aa2470f640 (LWP 520108))]
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=22720988378688) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=22720988378688) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=22720988378688, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x000014aa3d395476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x000014aa3d37b7f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x000014aa3d37b71b in __assert_fail_base (fmt=0x14aa3d530150 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x559025309428 "thd->mdl_context.is_lock_owner(MDL_key::TABLE, share->db.str, share->table_name.str, MDL_EXCLUSIVE)", file=0x559025308f78 "/test/11.0_dbg/sql/table_cache.cc", line=1259, function=<optimized out>) at ./assert/assert.c:92
#6  0x000014aa3d38ce96 in __GI___assert_fail (assertion=0x559025309428 "thd->mdl_context.is_lock_owner(MDL_key::TABLE, share->db.str, share->table_name.str, MDL_EXCLUSIVE)", file=0x559025308f78 "/test/11.0_dbg/sql/table_cache.cc", line=1259, function=0x559025309528 "void TDC_element::flush(THD*, bool)") at ./assert/assert.c:101
#7  0x0000559024a88b48 in TDC_element::flush (this=0x14a9e401beb8, thd=thd@entry=0x14a9e4000d58, mark_flushed=mark_flushed@entry=true) at /test/11.0_dbg/sql/table_cache.cc:1259
#8  0x000055902492ae1c in mysql_discard_or_import_tablespace (thd=thd@entry=0x14a9e4000d58, table_list=table_list@entry=0x14a9e4013218, discard=<optimized out>) at /test/11.0_dbg/sql/sql_table.cc:5650
#9  0x00005590249bb03c in Sql_cmd_discard_import_tablespace::execute (this=0x14a9e4013938, thd=0x14a9e4000d58) at /test/11.0_dbg/sql/sql_alter.cc:593
#10 0x0000559024864f1b in mysql_execute_command (thd=thd@entry=0x14a9e4000d58, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /test/11.0_dbg/sql/sql_parse.cc:6001
#11 0x0000559024866934 in mysql_parse (thd=thd@entry=0x14a9e4000d58, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x14aa2470e2c0) at /test/11.0_dbg/sql/sql_parse.cc:8000
#12 0x0000559024868ac8 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x14a9e4000d58, packet=packet@entry=0x14a9e400ae09 "ALTER TABLE t DISCARD TABLESPACE", packet_length=packet_length@entry=32, blocking=blocking@entry=true) at /test/11.0_dbg/sql/sql_class.h:243
#13 0x000055902486a921 in do_command (thd=0x14a9e4000d58, blocking=blocking@entry=true) at /test/11.0_dbg/sql/sql_parse.cc:1407
#14 0x00005590249b49ea in do_handle_one_connection (connect=<optimized out>, connect@entry=0x559027b82998, put_in_cache=put_in_cache@entry=true) at /test/11.0_dbg/sql/sql_connect.cc:1416
#15 0x00005590249b4c4e in handle_one_connection (arg=0x559027b82998) at /test/11.0_dbg/sql/sql_connect.cc:1318
#16 0x000014aa3d3e7b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#17 0x000014aa3d479a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Bug confirmed present in:
MariaDB: 10.3.38 (dbg), 10.4.28 (dbg), 10.5.19 (dbg), 10.6.12 (dbg), 10.7.8 (dbg), 10.8.7 (dbg), 10.9.5 (dbg), 10.10.3 (dbg), 10.11.2 (dbg), 11.0.1 (dbg)

Bug (or feature/syntax) confirmed not present in:
MariaDB: 10.3.38 (opt), 10.4.28 (opt), 10.5.19 (opt), 10.6.12 (opt), 10.7.8 (opt), 10.8.7 (opt), 10.9.5 (opt), 10.10.3 (opt), 10.11.2 (opt), 11.0.1 (opt)
MySQL: 5.5.62 (dbg), 5.5.62 (opt), 5.6.51 (dbg), 5.6.51 (opt), 5.7.40 (dbg), 5.7.40 (opt), 8.0.31 (dbg), 8.0.31 (opt)
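
The debug-only reproduction would be consistent with the failing check being a debug-only assertion that is compiled out of optimized builds. The following is a minimal sketch of that mechanism using the standard C++ assert macro. It assumes, rather than confirms, that the server's check behaves the same way; flush_sketch and its parameter are hypothetical names, not server code.

```cpp
#include <cassert>

// Toy stand-in for the check in TDC_element::flush(); the name and the
// parameter are hypothetical and exist only for illustration.
bool flush_sketch(bool holds_exclusive_lock)
{
  // With NDEBUG defined (typical for optimized builds) this expands to
  // nothing. Without NDEBUG, a false condition prints the expression and
  // calls abort(), which raises SIGABRT as in the backtrace above.
  assert(holds_exclusive_lock);
  (void) holds_exclusive_lock;  // avoid an unused-parameter warning under NDEBUG
  return true;                  // reached whenever the check is absent or holds
}
```

If this assumption is right, an optimized build still executes the same code path without MDL_EXCLUSIVE being held; it just does not stop there.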



 Comments   
Comment by Roel Van de Paar [ 2023-01-09 ]

Caused by:

commit 782b2a750067a12be07b9c305ede4d2c28f173e0
Author: Marko Mäkelä <marko.makela@mariadb.com>
Date:   Fri Dec 9 10:42:19 2022 +0200
 
    MDEV-29144 ER_TABLE_SCHEMA_MISMATCH or crash on DISCARD/IMPORT

Comment by Marko Mäkelä [ 2023-01-25 ]

wlad, do I interpret the assertion correctly that it complains that ALTER TABLE…DISCARD TABLESPACE is not holding MDL_EXCLUSIVE on the table name? That would seem to be a very reasonable assumption. After all, the operation is similar to DROP TABLE or TRUNCATE TABLE, or any table-rebuilding ALTER TABLE.

Comment by Vladislav Vaintroub [ 2023-01-25 ]

marko, right. It seems to be caused by the change you made; maybe you should have a look.

Comment by Marko Mäkelä [ 2023-03-29 ]

Based on my reading of the high-level code in mysql_discard_or_import_tablespace(), an exclusive metadata lock is being requested, yet the assertion fails. I suppose that the LOCK TABLE t WRITE has some impact on the lock acquisition:

  table_list->mdl_request.set_type(MDL_EXCLUSIVE);
  table_list->lock_type= TL_WRITE;
  /* Do not open views. */
  table_list->required_type= TABLE_TYPE_NORMAL;
 
  if (open_and_lock_tables(thd, table_list, FALSE, 0,
                           &alter_prelocking_strategy))
  {
    thd->tablespace_op=FALSE;
    DBUG_RETURN(-1);
  }
  if (discard)
    tdc_remove_table(thd, TDC_RT_REMOVE_NOT_OWN, table_list->table->s->db.str,
                     table_list->table->s->table_name.str, true);

I think that this needs to be analyzed and fixed by whoever is familiar with the table definition cache and the locking. This has little to do with InnoDB, which I am familiar with.
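
One possible reading of the above can be sketched as a toy model. It assumes, without confirmation from the server code, that LOCK TABLE t WRITE leaves the session holding a lock weaker than MDL_EXCLUSIVE, and that opening an already locked table under LOCK TABLES reuses that lock instead of acquiring the requested type. Every type and function name below is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical lock strengths, ordered weakest to strongest.
// The server's real list of MDL types is longer.
enum class LockType { SharedNoReadWrite = 1, Exclusive = 2 };

// Toy stand-in for a per-session metadata lock context.
class MdlContextSketch {
public:
  // Record a lock, keeping the strongest type seen for the table.
  void acquire(const std::string &table, LockType type) {
    auto it = held_.find(table);
    if (it == held_.end() || it->second < type)
      held_[table] = type;
  }
  // True if the session holds a lock at least as strong as `wanted`.
  bool is_lock_owner(const std::string &table, LockType wanted) const {
    auto it = held_.find(table);
    return it != held_.end() && it->second >= wanted;
  }
private:
  std::map<std::string, LockType> held_;
};

// ASSUMPTION: under LOCK TABLES, a table that is already locked is reused
// and the requested lock type is not acquired.
void open_table_sketch(MdlContextSketch &ctx, const std::string &table,
                       LockType requested, bool locked_tables_mode) {
  if (locked_tables_mode &&
      ctx.is_lock_owner(table, LockType::SharedNoReadWrite))
    return;
  ctx.acquire(table, requested);
}

// Returns whether the precondition asserted in TDC_element::flush() would
// hold after the open step of ALTER TABLE ... DISCARD TABLESPACE.
bool discard_precondition_holds(bool locked_tables_mode) {
  MdlContextSketch ctx;
  if (locked_tables_mode)  // models LOCK TABLE t WRITE
    ctx.acquire("test.t", LockType::SharedNoReadWrite);
  open_table_sketch(ctx, "test.t", LockType::Exclusive, locked_tables_mode);
  return ctx.is_lock_owner("test.t", LockType::Exclusive);
}
```

In this model, discard_precondition_holds(false) is true and discard_precondition_holds(true) is false, which matches the observation that the assertion fails only after LOCK TABLE t WRITE. Whether the server actually behaves this way still has to be checked against the table definition cache and MDL code.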
