[MDEV-29883] Deadlock between InnoDB statistics update and BLOB insert Created: 2022-10-26  Updated: 2023-03-02  Resolved: 2022-10-26

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11
Fix Version/s: 10.6.11, 10.7.7, 10.8.6, 10.9.4, 10.10.2, 10.11.1, 10.3.38, 10.4.28, 10.5.19

Type: Bug Priority: Blocker
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: hang, upstream

Issue Links:
Blocks
blocks MDEV-29835 Partial server freeze Closed
is blocked by MDEV-15020 Server hangs due to InnoDB persistent... Closed
Relates
relates to MDEV-30638 Deadlock due to updating InnoDB stati... Closed
relates to MDEV-15020 Server hangs due to InnoDB persistent... Closed
relates to MDEV-29660 [ERROR] [FATAL] InnoDB: innodb_fatal_... Closed

 Description   

Today, I was lucky and got the test innodb.innodb-wl5522-debug hanging once. The deadlock would seem to be between these two threads:

Thread 10 (Thread 0x7ff9beffd640 (LWP 1487571) "mariadbd"):
...
#5  0x000055770b6b3c90 in fil_space_t::x_lock (this=0x7ff9940c6fa0) at /mariadb/10.6/storage/innobase/include/fil0fil.h:1058
#6  0x000055770b7d1996 in mtr_t::x_lock_space (this=this@entry=0x7ff9beffc3a0, space=0x7ff9940c6fa0) at /mariadb/10.6/storage/innobase/mtr/mtr0mtr.cc:816
#7  0x000055770b73d3bf in dict_stats_analyze_index (index=index@entry=0x7ff9940c4740) at /mariadb/10.6/storage/innobase/dict/dict0stats.cc:2573
#8  0x000055770b73dde7 in dict_stats_update_persistent (table=table@entry=0x7ff99405ecf0) at /mariadb/10.6/storage/innobase/dict/dict0stats.cc:2879
#9  0x000055770b73ea92 in dict_stats_update (table=table@entry=0x7ff99405ecf0, stats_upd_option=stats_upd_option@entry=DICT_STATS_RECALC_PERSISTENT) at /mariadb/10.6/storage/innobase/dict/dict0stats.cc:3929
#10 0x000055770b740c84 in dict_stats_process_entry_from_recalc_pool (thd=thd@entry=0x7ff98c001168) at /mariadb/10.6/storage/innobase/dict/dict0stats_bg.cc:343
#11 0x000055770b740d83 in dict_stats_func () at /mariadb/10.6/storage/innobase/dict/dict0stats_bg.cc:382
...
Thread 8 (Thread 0x7ff9de2b8640 (LWP 1485618) "mariadbd"):
...
#5  0x000055770b6a11c8 in sux_lock<ssux_lock_impl<true> >::u_lock (this=0x7ff9de640d58) at /mariadb/10.6/storage/innobase/include/sux_lock.h:378
#6  0x000055770b724a14 in btr_page_alloc_low (index=index@entry=0x7ff9940c4740, hint_page_no=hint_page_no@entry=29, file_direction=file_direction@entry=113 'q', level=level@entry=0, mtr=mtr@entry=0x7ff9de2b4fd0, init_mtr=init_mtr@entry=0x7ff9de2b4fd0, err=0x7ff9de2b4f18) at /mariadb/10.6/storage/innobase/btr/btr0btr.cc:531
#7  0x000055770b724a80 in btr_page_alloc (index=index@entry=0x7ff9940c4740, hint_page_no=hint_page_no@entry=29, file_direction=file_direction@entry=113 'q', level=level@entry=0, mtr=mtr@entry=0x7ff9de2b4fd0, init_mtr=init_mtr@entry=0x7ff9de2b4fd0, err=0x7ff9de2b4f18) at /mariadb/10.6/storage/innobase/btr/btr0btr.cc:566
#8  0x000055770b6fe462 in btr_store_big_rec_extern_fields (pcur=pcur@entry=0x7ff9de2b54e0, offsets=offsets@entry=0x7ff9de2b5c10, big_rec_vec=big_rec_vec@entry=0x7ff9942132b0, btr_mtr=btr_mtr@entry=0x7ff9de2b55d0, op=op@entry=BTR_STORE_INSERT) at /mariadb/10.6/storage/innobase/btr/btr0cur.cc:6936
#9  0x000055770b83063e in row_ins_index_entry_big_rec (entry=entry@entry=0x7ff994064540, big_rec=0x7ff9942132b0, offsets=0x7ff9de2b5c10, heap=heap@entry=0x7ff9de2b5b08, index=index@entry=0x7ff9940c4740, thd=0x7ff994000d58) at /mariadb/10.6/storage/innobase/row/row0ins.cc:2486
...
#32 0x000055770b3b18f8 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7ff994000d58, packet=packet@entry=0x7ff99400ad29 "INSERT INTO t1\nSELECT 100, REPEAT('Karanbir', 128), REPEAT('Ajeeth', 1200)\nFROM seq_1_to_256", packet_length=packet_length@entry=92, blocking=blocking@entry=true) at /mariadb/10.6/sql/sql_parse.cc:1896

This was with some patches, removing the flags BTR_LATCH_FOR_INSERT and BTR_MODIFY_EXTERNAL. I attempted to run the test 300 more times, but the hang was not reproduced.

Thread 8 (writing a BLOB) is holding the clustered index U-latch and the tablespace U-latch or X-latch. Thread 10 (updating persistent statistics) in dict_stats_analyze_index() had executed the following code:

	mtr.start();
	mtr_s_lock_index(index, &mtr);
	dberr_t err;
        buf_block_t* root = btr_root_block_get(index, RW_SX_LATCH, &mtr, &err);
	if (!root) {
empty_index:
		mtr.commit();
		dict_stats_assert_initialized_index(index);
		DBUG_RETURN(result);
	}
 
	uint16_t root_level = btr_page_get_level(root->page.frame);
	mtr.x_lock_space(index->table->space);

That is, it is holding an index S-latch (which does not conflict with the U latch that the BLOB insert is holding), the clustered index root page latch, and waiting for an exclusive latch on the tablespace.

It was served an additional portion of luck, and the following hung on the first try:

./mtr --rr innodb.innodb-wl5522-debug

After killall -ABRT mariadbd I got a nice trace of the hang between an INSERT of a BLOB and an update of statistics. The BLOB write had acquired the tablespace latch before trying to acquire any further page latches:

#3  0x000055692e64c996 in mtr_t::x_lock_space (this=this@entry=0x7fd01cefbfd0, space=space@entry=0x7fd00c107c70) at /mariadb/10.6/storage/innobase/mtr/mtr0mtr.cc:816
#4  0x000055692e537fe1 in fsp_reserve_free_extents (n_reserved=n_reserved@entry=0x7fd01cefbf1c, space=0x7fd00c107c70, n_ext=n_ext@entry=1, alloc_type=alloc_type@entry=FSP_BLOB, mtr=mtr@entry=0x7fd01cefbfd0, 
    n_pages=n_pages@entry=1) at /mariadb/10.6/storage/innobase/fsp/fsp0fsp.cc:2391
#5  0x000055692e5793d2 in btr_store_big_rec_extern_fields (pcur=pcur@entry=0x7fd01cefc4e0, offsets=offsets@entry=0x7fd00c0f22c8, big_rec_vec=big_rec_vec@entry=0x7fd00c0e1ef0, 
    btr_mtr=btr_mtr@entry=0x7fd01cefc5d0, op=op@entry=BTR_STORE_INSERT) at /mariadb/10.6/storage/innobase/btr/btr0cur.cc:6927

Hence, the culprit for invalid latching order must be dict_stats_analyze_index().

I think that I saw some waits for fil_space_t::latch in some core dumps for MDEV-29835. Therefore, this bug might explain at least some of the hangs for which MDEV-29835 was filed.



 Comments   
Comment by Marko Mäkelä [ 2022-10-26 ]

Before MDEV-26623, dict_stats_analyze_index() did the following:

	mtr.start();
	mtr_s_lock_index(index, &mtr);
	size = btr_get_size(index, BTR_TOTAL_SIZE, &mtr);
 
	if (size != ULINT_UNDEFINED) {
		result.index_size = size;
		size = btr_get_size(index, BTR_N_LEAF_PAGES, &mtr);
	}
 
	/* Release the X locks on the root page taken by btr_get_size() */
	mtr.commit();
 
	switch (size) {
	case ULINT_UNDEFINED:
		dict_stats_assert_initialized_index(index);
		DBUG_RETURN(result);
	case 0:
		/* The root node of the tree is a leaf */
		size = 1;
	}
 
	result.n_leaf_pages = size;
 
	mtr.start();
	mtr_sx_lock_index(index, &mtr);
	size = btr_get_size(index, BTR_TOTAL_SIZE, &mtr);
 
	if (size != ULINT_UNDEFINED) {
		result.index_size = size;
		size = btr_get_size(index, BTR_N_LEAF_PAGES, &mtr);
	}
 
	/* Release the X locks on the root page taken by btr_get_size() */
	mtr.commit();

In MDEV-26623 those 2 mini-transactions were combined into 1 and the function btr_get_size() was replaced with direct calls to fseg_n_reserved_pages(). The latching order violation was present in the old btr_get_size() as well:

	buf_block_t* root = btr_root_block_get(index, RW_SX_LATCH, mtr);
	if (!root) {
		return ULINT_UNDEFINED;
	}
	mtr_x_lock_space(index->table->space, mtr);

That call was moved from fseg_n_reserved_pages() to btr_get_size() in MDEV-21174. I confirmed that this latching order violation is present already in the very first public release of InnoDB. It might have been unreachable until several concurrency improvements were made, such as splitting the kernel_mutex; I did not check that.

Before MDEV-15020 was fixed in 10.6, innodb_stats_auto_recalc could cause other hangs more easily.

Comment by Marko Mäkelä [ 2022-10-26 ]

This hang may have been introduced by WL#6326 in MySQL 5.7 and MariaDB Server 10.2.2. Before that change, the shared index latch that had to be held upon calling btr_get_size() would have conflicted with an exclusive latch that had to be held by BLOB operations. When the new dict_index_t::lock mode (SX a.k.a. Update in MDEV-24142) was introduced, the latching requirements of btr_get_size() should have been adjusted as well.

Comment by Marko Mäkelä [ 2022-10-26 ]

It looks like the hangs should be possible also with the InnoDB non-persistent statistics. The fix for older versions than 10.6 would be a little different.

Comment by Marko Mäkelä [ 2022-11-09 ]

In the fix for older release series than 10.6, I retained the exclusive fil_space_t::latch acquisition. Only the dict_index_t::lock mode was escalated to exclusive.

Generated at Thu Feb 08 10:12:01 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.