[MDEV-31941] Assertion failures upon using OQGraph table under lock Created: 2023-08-17  Updated: 2023-11-28

Status: Confirmed
Project: MariaDB Server
Component/s: Locking, Storage Engine - InnoDB, Storage Engine - OQGRAPH
Affects Version/s: 10.4, 10.5, 10.6, 10.10, 10.11, 11.0, 11.1
Fix Version/s: 10.4, 10.5, 10.6, 10.11, 11.0, 11.1

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Oleksandr Byelkin
Resolution: Unresolved Votes: 1
Labels: None

Issue Links:
Relates
relates to MDEV-24035 Failing assertion: UT_LIST_GET_LEN(lo... Open
relates to MDEV-25163 Rowid filter does not process storage... Closed
relates to MDEV-31393 partial server freeze after LOCK TABL... Stalled
relates to MDEV-32448 MariaDB 10.6.11 - InnoDB: Assertion f... Closed

 Description   

--source include/have_innodb.inc
 
INSTALL SONAME 'ha_oqgraph';
 
CREATE TABLE oq_backing (origid INT UNSIGNED NOT NULL, destid INT UNSIGNED NOT NULL, PRIMARY KEY (origid, destid), KEY (destid)) ENGINE=InnoDB;
CREATE TABLE oq (latch VARCHAR(32) NULL, origid BIGINT UNSIGNED NULL, destid BIGINT UNSIGNED NULL, weight DOUBLE NULL, seq BIGINT UNSIGNED NULL, linkid BIGINT UNSIGNED NULL, KEY (latch, origid, destid) USING HASH, KEY (latch, destid, origid) USING HASH) ENGINE=OQGRAPH data_table='oq_backing' origid='origid' destid='destid';
LOCK TABLE oq WRITE;
SELECT * FROM oq WHERE latch = 0 AND origid = 18265;
UNLOCK TABLES;
 
# Cleanup
DROP TABLE oq, oq_backing;
UNINSTALL SONAME 'ha_oqgraph';

10.4 900c4d69 non-debug

2023-08-17 14:59:10 0x7f6c6b6ff6c0  InnoDB: Assertion failure in file /data/src/10.4/storage/innobase/trx/trx0trx.cc line 1327
InnoDB: Failing assertion: UT_LIST_GET_LEN(lock.trx_locks) == 0
 
#7  0x000055e8888faa9c in ut_dbg_assertion_failed (expr=expr@entry=0x55e8894e7900 "UT_LIST_GET_LEN(lock.trx_locks) == 0", file=file@entry=0x55e8894e7278 "/data/src/10.4/storage/innobase/trx/trx0trx.cc", line=line@entry=1327) at /data/src/10.4/storage/innobase/ut/ut0dbg.cc:60
#8  0x000055e88902517e in trx_t::commit_in_memory (this=<optimized out>, mtr=0x0) at /data/src/10.4/storage/innobase/trx/trx0trx.cc:1327
#9  0x000055e889021e2b in trx_t::commit (this=this@entry=0x7f6c701cd090) at /data/src/10.4/storage/innobase/trx/trx0trx.cc:1549
#10 0x000055e889022033 in trx_commit_for_mysql (trx=trx@entry=0x7f6c701cd090) at /data/src/10.4/storage/innobase/trx/trx0trx.cc:1681
#11 0x000055e888ef0fe8 in innobase_commit_low (trx=0x7f6c701cd090) at /data/src/10.4/storage/innobase/handler/ha_innodb.cc:4488
#12 innobase_commit_ordered_2 (trx=0x7f6c701cd090, thd=<optimized out>) at /data/src/10.4/storage/innobase/handler/ha_innodb.cc:4617
#13 0x000055e888ef2bd3 in innobase_commit (hton=<optimized out>, thd=thd@entry=0x7f6c1c000c58, commit_trx=commit_trx@entry=true) at /data/src/10.4/storage/innobase/handler/ha_innodb.cc:4732
#14 0x000055e888f00b99 in ha_innobase::external_lock (this=0x7f6c1c134fd0, thd=0x7f6c1c000c58, lock_type=2) at /data/src/10.4/storage/innobase/handler/ha_innodb.cc:15834
#15 0x000055e888c26be9 in handler::ha_external_lock (this=0x7f6c1c134fd0, thd=0x7f6c1c000c58, lock_type=2) at /data/src/10.4/sql/handler.cc:6563
#16 0x000055e888c26d61 in handler::ha_external_lock (this=0x7f6c1c133840, thd=thd@entry=0x7f6c1c000c58, lock_type=lock_type@entry=2) at /data/src/10.4/sql/handler.cc:6563
#17 0x000055e888d1c089 in unlock_external (count=<optimized out>, table=0x7f6c1c1241e8, thd=0x7f6c1c000c58) at /data/src/10.4/sql/lock.cc:713
#18 mysql_unlock_tables (thd=0x7f6c1c000c58, sql_lock=0x7f6c1c1241b8, free_lock=<optimized out>) at /data/src/10.4/sql/lock.cc:434
#19 0x000055e888d1c71e in mysql_unlock_tables (thd=<optimized out>, sql_lock=<optimized out>) at /data/src/10.4/sql/lock.cc:418
#20 0x000055e8889a7f86 in close_thread_tables (thd=thd@entry=0x7f6c1c000c58) at /data/src/10.4/sql/sql_base.cc:1027
#21 0x000055e8889a8cda in Locked_tables_list::unlock_locked_tables (thd=0x7f6c1c000c58, this=0x7f6c1c004820) at /data/src/10.4/sql/sql_base.cc:2491
#22 Locked_tables_list::unlock_locked_tables (this=this@entry=0x7f6c1c004820, thd=thd@entry=0x7f6c1c000c58) at /data/src/10.4/sql/sql_base.cc:2463
#23 0x000055e888a137cf in mysql_execute_command (thd=thd@entry=0x7f6c1c000c58) at /data/src/10.4/sql/sql_parse.cc:5075
#24 0x000055e888a15b31 in mysql_parse (thd=0x7f6c1c000c58, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>, is_com_multi=<optimized out>, is_next_command=<optimized out>) at /data/src/10.4/sql/sql_parse.cc:8010
#25 0x000055e888a181ad in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7f6c1c000c58, packet=packet@entry=0x7f6c1c007d69 "UNLOCK TABLES", packet_length=packet_length@entry=13, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /data/src/10.4/sql/sql_parse.cc:1958
#26 0x000055e888a1a52f in do_command (thd=0x7f6c1c000c58) at /data/src/10.4/sql/sql_parse.cc:1378
#27 0x000055e888b13dbe in do_handle_one_connection (connect=connect@entry=0x55e88b746fa8) at /data/src/10.4/sql/sql_connect.cc:1420
#28 0x000055e888b13edd in handle_one_connection (arg=arg@entry=0x55e88b746fa8) at /data/src/10.4/sql/sql_connect.cc:1324
#29 0x000055e888ea6ffb in pfs_spawn_thread (arg=0x55e88b46e318) at /data/src/10.4/storage/perfschema/pfs.cc:1869
#30 0x00007f6c765c8fd4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#31 0x00007f6c766495bc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

10.4 900c4d69 debug

mysqld: /data/src/10.4/storage/innobase/lock/lock0lock.cc:3573: ib_lock_t* lock_table_create(dict_table_t*, ulint, trx_t*, ib_lock_t*): Assertion `!trx->auto_commit || trx->will_lock' failed.
 
#9  0x00007fd81c053df2 in __GI___assert_fail (assertion=0x55c758abcaa0 "!trx->auto_commit || trx->will_lock", file=0x55c758abb280 "/data/src/10.4/storage/innobase/lock/lock0lock.cc", line=3573, function=0x55c758abf980 "ib_lock_t* lock_table_create(dict_table_t*, ulint, trx_t*, ib_lock_t*)") at ./assert/assert.c:101
#10 0x000055c7576f4dc5 in lock_table_create (table=0x618000033508, type_mode=1, trx=0x7fd813fc2908, c_lock=0x0) at /data/src/10.4/storage/innobase/lock/lock0lock.cc:3573
#11 0x000055c7576f7c41 in lock_table (flags=0, table=0x618000033508, mode=LOCK_IX, thr=0x62000003d840) at /data/src/10.4/storage/innobase/lock/lock0lock.cc:4003
#12 0x000055c75797934a in row_search_mvcc (buf=0x6190000de3b8 '\276' <repeats 40 times>, mode=PAGE_CUR_G, prebuilt=0x62000003d108, match_mode=0, direction=0) at /data/src/10.4/storage/innobase/row/row0sel.cc:4647
#13 0x000055c75757afff in ha_innobase::index_read (this=0x61d00024cca8, buf=0x6190000de3b8 '\276' <repeats 40 times>, key_ptr=0x0, key_len=0, find_flag=HA_READ_AFTER_KEY) at /data/src/10.4/storage/innobase/handler/ha_innodb.cc:9403
#14 0x000055c75757e2e8 in ha_innobase::index_first (this=0x61d00024cca8, buf=0x6190000de3b8 '\276' <repeats 40 times>) at /data/src/10.4/storage/innobase/handler/ha_innodb.cc:9785
#15 0x000055c75757e820 in ha_innobase::rnd_next (this=0x61d00024cca8, buf=0x6190000de3b8 '\276' <repeats 40 times>) at /data/src/10.4/storage/innobase/handler/ha_innodb.cc:9878
#16 0x000055c756bc0b4b in handler::ha_rnd_next (this=0x61d00024cca8, buf=0x6190000de3b8 '\276' <repeats 40 times>) at /data/src/10.4/sql/handler.cc:2905
#17 0x00007fd8071bc0b3 in oqgraph3::cursor::seek_to (this=0x60c000030ac0, origid=..., destid=...) at /data/src/10.4/storage/oqgraph/oqgraph_thunk.cc:509
#18 0x00007fd8071b6395 in oqgraph3::edge_iterator::seek (this=0x7fd8073dcc10) at /data/src/10.4/storage/oqgraph/oqgraph_shim.cc:35
#19 0x00007fd8071b685e in oqgraph3::edge_iterator::operator!= (this=0x7fd8073dcc10, x=...) at /data/src/10.4/storage/oqgraph/oqgraph_shim.cc:71
#20 0x00007fd8071737dc in open_query::edges_cursor::fetch_row (this=0x6060000bc9c0, row_info=..., result=...) at /data/src/10.4/storage/oqgraph/graphcore.cc:1183
#21 0x00007fd807171df7 in open_query::oqgraph::fetch_row (this=0x60800000c920, result=...) at /data/src/10.4/storage/oqgraph/graphcore.cc:1039
#22 0x00007fd807163401 in ha_oqgraph::rnd_next (this=0x62300001c128, buf=0x6190000dcab8 "\377") at /data/src/10.4/storage/oqgraph/ha_oqgraph.cc:1095
#23 0x000055c756bc0a4b in handler::ha_rnd_next (this=0x62300001c128, buf=0x6190000dcab8 "\377") at /data/src/10.4/sql/handler.cc:2905
#24 0x000055c756fe5c74 in rr_sequential (info=0x62b0000a5e88) at /data/src/10.4/sql/records.cc:485
#25 0x000055c75625dd2e in READ_RECORD::read_record (this=0x62b0000a5e88) at /data/src/10.4/sql/records.h:70
#26 0x000055c756558edd in join_init_read_record (tab=0x62b0000a5dc0) at /data/src/10.4/sql/sql_select.cc:21837
#27 0x000055c75655218d in sub_select (join=0x62b0000a3338, join_tab=0x62b0000a5dc0, end_of_records=false) at /data/src/10.4/sql/sql_select.cc:20886
#28 0x000055c756550198 in do_select (join=0x62b0000a3338, procedure=0x0) at /data/src/10.4/sql/sql_select.cc:20412
#29 0x000055c7564df027 in JOIN::exec_inner (this=0x62b0000a3338) at /data/src/10.4/sql/sql_select.cc:4605
#30 0x000055c7564dc62e in JOIN::exec (this=0x62b0000a3338) at /data/src/10.4/sql/sql_select.cc:4387
#31 0x000055c7564e06c0 in mysql_select (thd=0x62b00009a208, tables=0x62b0000a1948, wild_num=1, fields=..., conds=0x62b0000a2898, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2149845760, result=0x62b0000a3308, unit=0x62b00009e140, select_lex=0x62b0000a1300) at /data/src/10.4/sql/sql_select.cc:4826
#32 0x000055c7564b12ff in handle_select (thd=0x62b00009a208, lex=0x62b00009e080, result=0x62b0000a3308, setup_tables_done_option=0) at /data/src/10.4/sql/sql_select.cc:442
#33 0x000055c756420827 in execute_sqlcom_select (thd=0x62b00009a208, all_tables=0x62b0000a1948) at /data/src/10.4/sql/sql_parse.cc:6473
#34 0x000055c75640dd3c in mysql_execute_command (thd=0x62b00009a208) at /data/src/10.4/sql/sql_parse.cc:3976
#35 0x000055c756429a77 in mysql_parse (thd=0x62b00009a208, rawbuf=0x62b0000a1228 "SELECT * FROM oq WHERE latch = 0 AND origid = 18265", length=51, parser_state=0x7fd8073df860, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:8010
#36 0x000055c7563ffd42 in dispatch_command (command=COM_QUERY, thd=0x62b00009a208, packet=0x62900029e209 "", packet_length=51, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:1857
#37 0x000055c7563fc8b1 in do_command (thd=0x62b00009a208) at /data/src/10.4/sql/sql_parse.cc:1378
#38 0x000055c7567fbe10 in do_handle_one_connection (connect=0x608000000ba8) at /data/src/10.4/sql/sql_connect.cc:1420
#39 0x000055c7567fb727 in handle_one_connection (arg=0x608000000ba8) at /data/src/10.4/sql/sql_connect.cc:1324
#40 0x000055c75746be20 in pfs_spawn_thread (arg=0x615000006208) at /data/src/10.4/storage/perfschema/pfs.cc:1869
#41 0x00007fd81c0a7fd4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#42 0x00007fd81c1285bc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

10.6 ca5c122a non-debug

2023-08-17 15:01:20 0x7f56515546c0  InnoDB: Assertion failure in file /data/src/10.6/storage/innobase/trx/trx0trx.cc line 1252
InnoDB: Failing assertion: UT_LIST_GET_LEN(lock.trx_locks) == 0

10.6 ca5c122a debug

mariadbd: /data/src/10.6/storage/innobase/lock/lock0lock.cc:3319: ib_lock_t* lock_table_create(dict_table_t*, unsigned int, trx_t*, ib_lock_t*): Assertion `!trx->is_autocommit_non_locking()' failed.
230817 15:01:58 [ERROR] mysqld got signal 6 ;



 Comments   
Comment by Marko Mäkelä [ 2023-08-17 ]

Transaction commit in InnoDB as of MySQL 5.6 was optimized so that auto-commit non-locking transactions are treated as a special case.

The assertions here fail because a transaction that supposedly is non-locking is actually acquiring some locks, presumably due to the preceding LOCK TABLES. The non-debug build crashes in UNLOCK TABLES, which will invoke InnoDB transaction commit. The transaction commit would fail to release locks, because the transaction is wrongly flagged as auto-commit non-locking.

This reminds me of MDEV-31393 where LOCK TABLES is causing some weird behaviour. I think that this needs to be analyzed by monty.

Comment by Michael Widenius [ 2023-11-05 ]

When USING LOCK TABLES, all tables that are going to be used, directly or indirectly, should be locked.
OQGRAPH is here using an InnoDB table, oq_backing, that is not explicitly locked.

I checked the ha_oqgraph and it does a LOCK TABLE on InnoDB.

This means that from an API point of view, everything looks correct and this is a problem in InnoDB.
Under LOCK TABLE WRITE there will of course be locks. I do not understand what Marko means with that this should be non-locking.

In 10.6 the assert happens in:
lock_table_create()

I do not know what this function is called. Looks strange (based on the name)
It is also strange that InnoDB tries to do row locks on an empty table.
It looks like the error happens as InnoDB tries to change the row lock to a table lock and then assumes this is a create table.
As everything happens internally in InnoDB, I have to move this ticket back to Marko.
I will talk with him shortly about how to progress

I will talk with Marko t

Comment by Marko Mäkelä [ 2023-11-06 ]

First I paid attention to the following transaction commit:

10.4 01623ac9ea7a721855c1683ca216758f53b227b4

#2  innobase_commit (hton=0x5613c43f4498, thd=0x7f48bc00ae58, commit_trx=false) at /mariadb/10.4m/storage/innobase/handler/ha_innodb.cc:4707
#3  0x00005613c148baa6 in commit_one_phase_2 (thd=0x7f48bc00ae58, all=false, trans=0x7f48bc00e430, is_real_trans=true) at /mariadb/10.4m/sql/handler.cc:1835
#4  0x00005613c148ac3a in ha_commit_trans (thd=thd@entry=0x7f48bc00ae58, all=false) at /mariadb/10.4m/sql/handler.cc:1620
#5  0x00005613c183021a in trans_commit_stmt (thd=thd@entry=0x7f48bc00ae58) at /mariadb/10.4m/sql/transaction.cc:437
#6  0x00005613c16f4094 in mysql_execute_command (thd=thd@entry=0x7f48bc00ae58) at /mariadb/10.4m/sql/sql_parse.cc:6276
#7  0x00005613c16ecc42 in mysql_parse (thd=thd@entry=0x7f48bc00ae58, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x7f489eb7f520, is_com_multi=<optimized out>, 
    is_next_command=<optimized out>) at /mariadb/10.4m/sql/sql_parse.cc:8013
#8  0x00005613c16eac77 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7f48bc00ae58, packet=packet@entry=0x7f48bc0143a9 "LOCK TABLE oq WRITE", packet_length=packet_length@entry=19, 
    is_com_multi=false, is_next_command=false) at /mariadb/10.4m/sql/sql_parse.cc:1857
#9  0x00005613c16ed0b2 in do_command (thd=0x7f48bc00ae58) at /mariadb/10.4m/sql/sql_parse.cc:1378

This commit will reset the trx_t::will_lock data member. This happens also when I modify the test case to use InnoDB only. In the plain InnoDB case, the trx_t::will_lock would be set later:

10.4 01623ac9ea7a721855c1683ca216758f53b227b4

#0  ha_innobase::start_stmt (this=<optimized out>, thd=0x7fb2dc00ae58, 
    lock_type=TL_READ)
    at /mariadb/10.4m/storage/innobase/handler/ha_innodb.cc:15585
#1  0x000055c76c833c0f in check_lock_and_start_stmt (
    thd=thd@entry=0x7fb2dc00ae58, prelocking_ctx=<optimized out>, 
    table_list=table_list@entry=0x7fb2dc01de80)
    at /mariadb/10.4m/sql/sql_base.cc:5119
#2  0x000055c76c83305a in lock_tables (thd=thd@entry=0x7fb2dc00ae58, 
    tables=tables@entry=0x7fb2dc01de80, count=1, flags=flags@entry=0)
    at /mariadb/10.4m/sql/sql_base.cc:5649
#3  0x000055c76c833e1c in open_and_lock_tables (thd=thd@entry=0x7fb2dc00ae58, 
    options=<optimized out>, tables=tables@entry=0x7fb2dc01de80, derived=true, 
    flags=flags@entry=0, 
    prelocking_strategy=prelocking_strategy@entry=0x7fb2e4080b80)
    at /mariadb/10.4m/sql/sql_base.cc:5353
#4  0x000055c76c8b5940 in open_and_lock_tables (thd=0x7fb2dc00ae58, 
    tables=0x7fb2dc01de80, derived=true, flags=0)
    at /mariadb/10.4m/sql/sql_base.h:503
#5  execute_sqlcom_select (thd=thd@entry=0x7fb2dc00ae58, 
    all_tables=0x7fb2dc01de80) at /mariadb/10.4m/sql/sql_parse.cc:6396
#6  0x000055c76c8af3b7 in mysql_execute_command (thd=thd@entry=0x7fb2dc00ae58)
    at /mariadb/10.4m/sql/sql_parse.cc:3978
#7  0x000055c76c8aac42 in mysql_parse (thd=thd@entry=0x7fb2dc00ae58, 
    rawbuf=<optimized out>, length=<optimized out>, 
    parser_state=parser_state@entry=0x7fb2e4081520, 
    is_com_multi=<optimized out>, is_next_command=<optimized out>)
    at /mariadb/10.4m/sql/sql_parse.cc:8013
#8  0x000055c76c8a8c77 in dispatch_command (command=command@entry=COM_QUERY, 
    thd=thd@entry=0x7fb2dc00ae58, 
    packet=packet@entry=0x7fb2dc0143a9 "SELECT * FROM oq WHERE latch = 0 AND origid = 18265", packet_length=packet_length@entry=51, is_com_multi=false, 
    is_next_command=false) at /mariadb/10.4m/sql/sql_parse.cc:1857

Because there is no ha_oqgraph::start_stmt(), the dummy function in the base class will be executed, and ha_innobase::start_stmt() will not be invoked at all:

10.4 01623ac9ea7a721855c1683ca216758f53b227b4

Thread 3 hit Breakpoint 3.2, handler::start_stmt (this=0x7f6aa40a0f70, thd=0x7f6aa400ae58, lock_type=TL_READ) at /mariadb/10.4m/sql/handler.h:3808
3808	  virtual int start_stmt(THD *thd, thr_lock_type lock_type) {return 0;}

This looks like an OQGraph bug to me. It needs to pass calls to locking-related virtual member functions such as handler::start_stmt() to the underlying storage engine.

Generated at Thu Feb 08 10:27:37 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.