[MDEV-25267] Reported latching order violation in ibuf_remove_free_page() Created: 2021-03-26  Updated: 2023-04-27

Status: Open
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 5.5, 10.0, 10.1, 10.2, 10.3, 10.4, 10.5
Fix Version/s: 10.4, 10.5

Type: Bug Priority: Major
Reporter: Matthias Leich Assignee: Marko Mäkelä
Resolution: Unresolved Votes: 0
Labels: debug, not-10.6, not-11.0, rr-profile-analyzed

Issue Links:
Relates
relates to MDEV-13485 MTR tests fail massively with --innod... Closed
relates to MDEV-13637 InnoDB change buffer housekeeping can... Closed
relates to MDEV-21452 Use condition variables and normal mu... Closed
relates to MDEV-24142 rw_lock_t has unnecessarily complex w... Closed
relates to MDEV-24167 InnoDB unnecessarily uses complex rw-... Closed
relates to MDEV-29694 Remove the InnoDB change buffer Closed

 Description   

RQG test with server option --mysqld=--loose-innodb-sync-debug
[rr 3607879 378110]InnoDB: Failing assertion: latches->empty() || level == SYNC_LEVEL_VARYING || level == SYNC_NO_ORDER_CHECK || latches->back().get_level() == SYNC_NO_ORDER_CHECK || latches->back().m_latch->get_level() == SYNC_LEVEL_VARYING || latches->back().get_level() >= level
(rr) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f7cfe43f859 in __GI_abort () at abort.c:79
#2  0x000055e5188ab349 in ut_dbg_assertion_failed (
    expr=0x55e519bbd280 "latches->empty() || level == SYNC_LEVEL_VARYING || level == SYNC_NO_ORDER_CHECK || latches->back().get_level() == SYNC_NO_ORDER_CHECK || latches->back().m_latch->get_level() == SYNC_LEVEL_VARYING || l"..., 
    file=0x55e519bbd1c0 "/Server/bb-10.5-MDEV-24589/storage/innobase/sync/sync0debug.cc", line=219) at /Server/bb-10.5-MDEV-24589/storage/innobase/ut/ut0dbg.cc:60
#3  0x000055e518807653 in LatchDebug::lock_validate (this=0x60d000000040, latch=0x7f7cddd09320, level=SYNC_IBUF_PESS_INSERT_MUTEX) at /Server/bb-10.5-MDEV-24589/storage/innobase/sync/sync0debug.cc:219
#4  0x000055e518804371 in sync_check_lock_validate (latch=0x7f7cddd09320) at /Server/bb-10.5-MDEV-24589/storage/innobase/sync/sync0debug.cc:1057
#5  0x000055e5183e3cbe in MutexDebug<TTASEventMutex<GenericPolicy> >::enter (this=0x55e51b7ca030 <ibuf_pessimistic_insert_mutex+16>, mutex=..., filename=0x55e519a24680 "/Server/bb-10.5-MDEV-24589/storage/innobase/ibuf/ibuf0ibuf.cc", line=1918)
    at /Server/bb-10.5-MDEV-24589/storage/innobase/include/sync0policy.h:125
#6  0x000055e5183db874 in PolicyMutex<TTASEventMutex<GenericPolicy> >::enter (this=0x55e51b7ca020 <ibuf_pessimistic_insert_mutex>, n_spins=30, n_delay=4, name=0x55e519a24680 "/Server/bb-10.5-MDEV-24589/storage/innobase/ibuf/ibuf0ibuf.cc", 
    line=1918) at /Server/bb-10.5-MDEV-24589/storage/innobase/include/ib0mutex.h:588
#7  0x000055e5184dc067 in ibuf_remove_free_page () at /Server/bb-10.5-MDEV-24589/storage/innobase/ibuf/ibuf0ibuf.cc:1918
#8  0x000055e5184dc819 in ibuf_free_excess_pages () at /Server/bb-10.5-MDEV-24589/storage/innobase/ibuf/ibuf0ibuf.cc:2019
#9  0x000055e5186b7e15 in row_ins_sec_index_entry (index=0x617000037a08, entry=0x61b00033b988, thr=0x6210027f50d0, check_foreign=true) at /Server/bb-10.5-MDEV-24589/storage/innobase/row/row0ins.cc:3296
#10 0x000055e5186b804d in row_ins_index_entry (index=0x617000037a08, entry=0x61b00033b988, thr=0x6210027f50d0) at /Server/bb-10.5-MDEV-24589/storage/innobase/row/row0ins.cc:3335
#11 0x000055e5186b907b in row_ins_index_entry_step (node=0x6230001934f0, thr=0x6210027f50d0) at /Server/bb-10.5-MDEV-24589/storage/innobase/row/row0ins.cc:3502
#12 0x000055e5186b9a38 in row_ins (node=0x6230001934f0, thr=0x6210027f50d0) at /Server/bb-10.5-MDEV-24589/storage/innobase/row/row0ins.cc:3661
#13 0x000055e5186bab66 in row_ins_step (thr=0x6210027f50d0) at /Server/bb-10.5-MDEV-24589/storage/innobase/row/row0ins.cc:3807
#14 0x000055e5186fdbf3 in row_insert_for_mysql (mysql_rec=0x626000129138 "\347\377\376", prebuilt=0x623000192988, ins_mode=ROW_INS_NORMAL) at /Server/bb-10.5-MDEV-24589/storage/innobase/row/row0mysql.cc:1418
#15 0x000055e51838794c in ha_innobase::write_row (this=0x61d0007d28b8, record=0x626000129138 "\347\377\376") at /Server/bb-10.5-MDEV-24589/storage/innobase/handler/ha_innodb.cc:7673
#16 0x000055e517a8eb90 in handler::ha_write_row (this=0x61d0007d28b8, buf=0x626000129138 "\347\377\376") at /Server/bb-10.5-MDEV-24589/sql/handler.cc:7153
#17 0x000055e51724f169 in write_record (thd=0x62b000149218, table=0x61900033f998, info=0x7f7cddd0aff0, sink=0x0) at /Server/bb-10.5-MDEV-24589/sql/sql_insert.cc:2106
#18 0x000055e517247b45 in mysql_insert (thd=0x62b000149218, table_list=0x62b000118588, fields=..., values_list=..., update_fields=..., update_values=..., duplic=DUP_ERROR, ignore=false, result=0x0)
    at /Server/bb-10.5-MDEV-24589/sql/sql_insert.cc:1099
#19 0x000055e517301ed0 in mysql_execute_command (thd=0x62b000149218) at /Server/bb-10.5-MDEV-24589/sql/sql_parse.cc:4622
#20 0x000055e517318b86 in mysql_parse (thd=0x62b000149218, 
    rawbuf=0x62b000118238 "INSERT INTO `table100_innodb_int_autoinc` ( `col_set_ucs2`, `col_char_255_utf8_key`, `col_varchar_255_ucs2_key` ) VALUES ( REPEAT( _ASCII 0x1DAF, 15 ), REPEAT( _LATIN1 0x30E, 172 ), REPEAT( _UCS2 0xC,"..., length=245, 
    parser_state=0x7f7cddd0bcc0, is_com_multi=false, is_next_command=false) at /Server/bb-10.5-MDEV-24589/sql/sql_parse.cc:8093
#21 0x000055e5172f1035 in dispatch_command (command=COM_QUERY, thd=0x62b000149218, 
    packet=0x629000c30219 "INSERT INTO `table100_innodb_int_autoinc` ( `col_set_ucs2`, `col_char_255_utf8_key`, `col_varchar_255_ucs2_key` ) VALUES ( REPEAT( _ASCII 0x1DAF, 15 ), REPEAT( _LATIN1 0x30E, 172 ), REPEAT( _UCS2 0xC,"..., 
    packet_length=246, is_com_multi=false, is_next_command=false) at /Server/bb-10.5-MDEV-24589/sql/sql_parse.cc:1889
#22 0x000055e5172ee0c4 in do_command (thd=0x62b000149218) at /Server/bb-10.5-MDEV-24589/sql/sql_parse.cc:1370
#23 0x000055e5176cd6df in do_handle_one_connection (connect=0x6080000030b8, put_in_cache=true) at /Server/bb-10.5-MDEV-24589/sql/sql_connect.cc:1410
#24 0x000055e5176cd048 in handle_one_connection (arg=0x6080000030b8) at /Server/bb-10.5-MDEV-24589/sql/sql_connect.cc:1312
#25 0x00007f7cfe968609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#26 0x00007f7cfe53c293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(rr)
Query (0x62b000118238): INSERT INTO `table100_innodb_int_autoinc` ( `col_set_ucs2`, `col_char_255_utf8_key`, `col_varchar_255_ucs2_key` ) VALUES ( REPEAT( _ASCII 0x1DAF, 15 ), REPEAT( _LATIN1 0x30E, 172 ), REPEAT( _UCS2 0xC, 137 ) )
Status: KILL_TIMEOUT
 
sdp:/home/mleich/RQG_O/storage/1616623399/TBR-966/dev/shm/vardir/1616623399/6/1/rr
 
RQG
------
git clone https://github.com/mleich1/rqg --branch experimental RQG
 
perl rqg.pl \
--grammar=conf/engines/many_indexes.yy \
--gendata=conf/engines/many_indexes.zz \
--reporters=Mariabackup_linux \
--mysqld=--innodb_use_native_aio=1 \
--mysqld=--innodb_lock_schedule_algorithm=fcfs \
--mysqld=--loose-idle_write_transaction_timeout=0 \
--mysqld=--loose-idle_transaction_timeout=0 \
--mysqld=--loose-idle_readonly_transaction_timeout=0 \
--mysqld=--connect_timeout=60 \
--mysqld=--interactive_timeout=28800 \
--mysqld=--slave_net_timeout=60 \
--mysqld=--net_read_timeout=30 \
--mysqld=--net_write_timeout=60 \
--mysqld=--loose-table_lock_wait_timeout=50 \
--mysqld=--wait_timeout=28800 \
--mysqld=--lock-wait-timeout=86400 \
--mysqld=--innodb-lock-wait-timeout=50 \
--no-mask \
--queries=10000000 \
--seed=random \
--reporters=Backtrace \
--reporters=ErrorLog \
--reporters=Deadlock1 \
--validators=None \
--mysqld=--log_output=none \
--mysqld=--log-bin \
--mysqld=--log_bin_trust_function_creators=1 \
--mysqld=--loose-debug_assert_on_not_freed_memory=0 \
--engine=InnoDB \
--restart_timeout=240 \
--mysqld=--plugin-load-add=file_key_management.so \
--mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
--duration=300 \
--mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
--mysqld=--loose-innodb-sync-debug \
--mysqld=--innodb_stats_persistent=off \
--mysqld=--innodb_adaptive_hash_index=off \
--mysqld=--loose-max-statement-time=30 \
--threads=2 \
--mysqld=--innodb_file_per_table=0 \
--rr=Extended \
--mysqld=--innodb_page_size=4K \
--mysqld=--innodb-buffer-pool-size=5M \
--duration=300 \
--no_mask \
--workdir=<local settings> \
--vardir=<local settings> \
--mtr-build-thread=<local settings> \
--basedir1=<local settings> \
--script_debug=_nix_



 Comments   
Comment by Matthias Leich [ 2021-03-26 ]

Search pattern
[ 'TBR-966--MDEV-25267' , '-innodb-sync-debug.+#.{1,30} in ibuf_free_excess_pages \(\).+InnoDB: Assertion failure in .{1,200}sync0debug.cc line.{1,200}InnoDB: Failing assertion: latches->empty\(\) \|\| level == SYNC_LEVEL_VARYING \|\| level == SYNC_NO_ORDER_CHECK \|\| latches->back\(\).get_level\(\) == SYNC_NO_ORDER_CHECK \|\| latches->back\(\).m_latch->get_level\(\) == SYNC_LEVEL_VARYING \|\| latches->back\(\).get_level\(\) >= level' ],

Comment by Marko Mäkelä [ 2021-03-26 ]

There is a clear latching order violation in the function ibuf_remove_free_page() since the very first public release of InnoDB:

	/* Acquire the fsp latch before the ibuf header, obeying the latching
	order */
	mtr_x_lock(fil_space_get_latch(space), &mtr);
	
	header_page = ibuf_header_page_get(space, &mtr);
 
	/* Prevent pessimistic inserts to insert buffer trees for a while */
	mutex_enter(&ibuf_pessimistic_insert_mutex);
 
	ibuf_enter();
 
	mutex_enter(&ibuf_mutex);

According to the debug information, the ibuf_pessimistic_insert_mutex should have been acquired before the fil_space_t::latch.

Apparently, nobody had run large enough stress tests with innodb_sync_debug (or the earlier UNIV_SYNC_DEBUG compile time option) to catch this failure earlier. In MDEV-13485, some other potential hangs in the change buffer were found and fixed.

So, is there any potential deadlock? Let us check for any operation that can acquire fil_space_t::latch on the system tablespace while holding ibuf_pessimistic_insert_mutex.

The function ibuf_add_free_page() is acquiring a tablespace latch, but it is being called right after ibuf_pessimistic_insert_mutex was released.

It does not look like a genuine hang is possible. But the latching order of ibuf_mutex, ibuf_pessimistic_insert_mutex, ibuf_bitmap_mutex could be reviewed.

We should also note that this code path is all but dead code. We already moved it in MDEV-13637 so that this function would only be called if change buffering is attempted on a secondary index page of a table that resides in the system tablespace. Practically everyone should be using the default setting innodb_file_per_table=1.

In MariaDB 10.6, the instrumentation was removed in MDEV-21452, MDEV-24167, MDEV-24142. For mutexes we gained SAFE_MUTEX instrumentation, but for rw-locks we lost any latching order instrumentation.

Generated at Thu Feb 08 09:36:24 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.