[MDEV-30638] Deadlock due to updating InnoDB statistics - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.6.11, 10.7.7, 10.8.6, 10.9.4, 10.10.2, 10.11.1, 11.0.0, 10.6.12, 10.7.8, 10.8.7, 10.9.5, 10.10.3
Fix Version/s: 10.7.8, 11.0.1, 10.6.13, 10.8.8, 10.9.6, 10.10.4, 10.11.3
Component/s: Storage Engine - InnoDB
Labels:

Description

This InnoDB deadlock came up while mleich was testing ~~MDEV-26055~~ and ~~MDEV-26827~~. Many threads are blocked waiting on a fil_space_t::latch. The actual deadlock seems to involve buffer page latches and the tablespace latch in at least 3 threads. Two of the threads are attempting to update InnoDB statistics, waiting for a buffer page latch while holding shared fil_space_t::latch:

#7  0x0000558ac484ce42 in buf_page_get_gen (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=4, guess=guess@entry=0x0, mode=mode@entry=16, mtr=mtr@entry=0x7f98ea884950, err=0x0, allow_ibuf_merge=false) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/buf/buf0buf.cc:2952

#8  0x0000558ac49e2e18 in fseg_inode_try_get (header=header@entry=0x7f9913bb4054 "", space=<optimized out>, zip_size=<optimized out>, mtr=mtr@entry=0x7f98ea884950, block=block@entry=0x7f98ea884740, err=err@entry=0x0) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/include/buf0types.h:95

#9  0x0000558ac49e6cf0 in fseg_n_reserved_pages (block=..., header=header@entry=0x7f9913bb4054 "", used=used@entry=0x7f98ea884850, mtr=mtr@entry=0x7f98ea884950) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/include/buf0types.h:131

#10 0x0000558ac4976846 in dict_stats_update_transient_for_index (index=index@entry=0x616004011a08) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/include/fil0fil.h:1072

One thread is waiting for an exclusive tablespace latch while holding several buffer page latches:

#5  mtr_t::x_lock_space (this=this@entry=0x7f98eb9e1d60, space=space@entry=0x6120000523c0) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/mtr/mtr0mtr.cc:521

#6  0x0000558ac49eb144 in fsp_reserve_free_extents (n_reserved=n_reserved@entry=0x7f98eb9e1750, space=0x6120000523c0, n_ext=<optimized out>, alloc_type=alloc_type@entry=FSP_NORMAL, mtr=mtr@entry=0x7f98eb9e1d60, n_pages=n_pages@entry=2) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/fsp/fsp0fsp.cc:2398

#7  0x0000558ac47be2ca in btr_cur_pessimistic_insert (flags=flags@entry=0, cursor=cursor@entry=0x7f98eb9e1960, offsets=offsets@entry=0x7f98eb9e18e0, heap=heap@entry=0x7f98eb9e18c0, entry=entry@entry=0x616003f5da08, rec=rec@entry=0x7f98eb9e1900, big_rec=<optimized out>, n_ext=0, thr=<optimized out>, mtr=<optimized out>) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/include/dict0mem.h:1216

I see that before ~~MDEV-29883~~, dict_stats_update_transient_for_index() used to hold an exclusive tablespace latch, instead of a shared one. It seems to me that this deadlock would have bene possible even without that change. Possibly the probability of the deadlock could have been increased.

I also see that before ~~MDEV-29883~~, dict_stats_update_transient_for_index() only used to hold a shared index latch. It would seem to me that acquiring an even stronger index latch would prevent the observed deadlock:

diff --git a/storage/innobase/dict/dict0stats.cc b/storage/innobase/dict/dict0stats.cc

index 0d630583daa..101424d0f44 100644

--- a/storage/innobase/dict/dict0stats.cc

+++ b/storage/innobase/dict/dict0stats.cc

@@ -1473,7 +1473,7 @@ dict_stats_update_transient_for_index(

 		mtr_t	mtr;

 		mtr.start();

-		mtr_sx_lock_index(index, &mtr);

+		mtr_x_lock_index(index, &mtr);

 		dberr_t err;

 		buf_block_t* root = btr_root_block_get(index, RW_SX_LATCH,

Because a pessimistic insert would start by acquiring an index latch, it would not be able to deadlock with this mini-transaction anymore.

Attachments

Issue Links

relates to

MDEV-29835 Partial server freeze

Closed

MDEV-29883 Deadlock between InnoDB statistics update and BLOB insert

Closed

MDEV-29930 Server hang with innodb_file_per_table=0, innodb_undo_tablespaces=0

Open

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2023-02-10 16:22

Sorry, it is not that simple. Acquiring a stronger index latch does not help this deadlock. The INSERT operation is actually holding an index latch in a mode that would conflict with the statistics operation, if the statistics operation were holding a latch on the same index.

The statistics can be updated for different indexes in the same tablespace than the one that the INSERT is operating on. To prevent this hang, one possible fix could be that dict_stats_update_transient_for_index() and possibly other functions (dict_index_t::clear(), btr_get_size_and_reserved(), dict_stats_save_defrag_stats(), dict_stats_analyze_index()) would acquire a latch on every index of the table, like row_quiesce_set_state() does it, and only then proceeds to acquire a fil_space_t::latch. This approach would still fail to prevent a hang if a table is stored in the InnoDB system tablespace.

Marko Mäkelä added a comment - 2023-02-10 16:22 Sorry, it is not that simple. Acquiring a stronger index latch does not help this deadlock. The INSERT operation is actually holding an index latch in a mode that would conflict with the statistics operation, if the statistics operation were holding a latch on the same index . The statistics can be updated for different indexes in the same tablespace than the one that the INSERT is operating on. To prevent this hang, one possible fix could be that dict_stats_update_transient_for_index() and possibly other functions ( dict_index_t::clear() , btr_get_size_and_reserved() , dict_stats_save_defrag_stats() , dict_stats_analyze_index() ) would acquire a latch on every index of the table, like row_quiesce_set_state() does it, and only then proceeds to acquire a fil_space_t::latch . This approach would still fail to prevent a hang if a table is stored in the InnoDB system tablespace.

Marko Mäkelä added a comment - 2023-02-11 09:02

The relevant latching order here should be as follows:

Acquire dict_index_t::lock in non-shared mode.
Acquire the index root page latch in non-shared mode.
Possibly acquire further index page latches in accordance with the WL#6326 rules explained in ~~MDEV-29835~~.
Acquire the tablespace latch.
Acquire latches on the allocation data structures. These are protected by the tablespace latch.
Possibly allocate and write some pages.

Currently, dict_stats_update_transient_for_index() and possibly other functions fail to acquire the index root page latch before the tablespace latch. Swapping the order of acquisition ought to fix this. In the core dump that I analyzed, dict_stats_update_transient_for_index() was holding a tablespace latch and waiting for the index root page latch.

Marko Mäkelä added a comment - 2023-02-11 09:02 The relevant latching order here should be as follows: Acquire dict_index_t::lock in non-shared mode. Acquire the index root page latch in non-shared mode. Possibly acquire further index page latches in accordance with the WL#6326 rules explained in MDEV-29835 . Acquire the tablespace latch. Acquire latches on the allocation data structures. These are protected by the tablespace latch. Possibly allocate and write some pages. Currently, dict_stats_update_transient_for_index() and possibly other functions fail to acquire the index root page latch before the tablespace latch. Swapping the order of acquisition ought to fix this. In the core dump that I analyzed, dict_stats_update_transient_for_index() was holding a tablespace latch and waiting for the index root page latch.

Marko Mäkelä added a comment - 2023-02-13 07:43

It turns out that dict_stats_update_transient_for_index() and other code that I examined is already following the correct latching order as described above. What looks incorrect is that we are only acquiring a shared tablespace latch. This will wrongly allow similar code to be executed for other indexes in the same tablespace. In the core dump of a hang that I analyzed, we have a wait for some metadata allocation page inside fseg_n_reserved_pages(). If all threads that access a tablespace allocation metadata pages are holding a non-shared latch on the tablespace, they would conflict with each other at a higher level and these deadlocks would be impossible.

Marko Mäkelä added a comment - 2023-02-13 07:43 It turns out that dict_stats_update_transient_for_index() and other code that I examined is already following the correct latching order as described above. What looks incorrect is that we are only acquiring a shared tablespace latch. This will wrongly allow similar code to be executed for other indexes in the same tablespace. In the core dump of a hang that I analyzed, we have a wait for some metadata allocation page inside fseg_n_reserved_pages() . If all threads that access a tablespace allocation metadata pages are holding a non-shared latch on the tablespace, they would conflict with each other at a higher level and these deadlocks would be impossible.

Marko Mäkelä added a comment - 2023-02-13 07:58

Ironically, this hang was introduced with the fix of ~~MDEV-29883~~. To fix that bug, we should only have introduced some non-shared dict_index_t::lock acquisition. We should not have downgraded some exclusive fil_space_t::latch acquisition to a shared one in the following functions:

btr_get_size_and_reserved()
dict_stats_update_transient_for_index()
dict_stats_analyze_index()

Marko Mäkelä added a comment - 2023-02-13 07:58 Ironically, this hang was introduced with the fix of MDEV-29883 . To fix that bug, we should only have introduced some non-shared dict_index_t::lock acquisition. We should not have downgraded some exclusive fil_space_t::latch acquisition to a shared one in the following functions: btr_get_size_and_reserved() dict_stats_update_transient_for_index() dict_stats_analyze_index()

Marko Mäkelä added a comment - 2023-02-13 08:19

After this fix, the only invoker of mtr_t::s_lock_space() will be the function fseg_page_is_allocated() (renamed from fseg_page_is_free() in ~~MDEV-13542~~). On the surface, that mini-transaction should be fine, since it is only holding a shared fil_space_t::latch and a shared latch on an allocation bitmap page (n·innodb_page_size). But, if any caller of that function is holding more latches (for example, CHECK TABLE without QUICK is, in btr_validate_level()), we are prone to a deadlock. This possibility of a deadlock was introduced on the merge of ~~MDEV-24569~~ to 10.6. We must replace mtr_t::s_lock_space() with mtr_t::x_lock_space().

Marko Mäkelä added a comment - 2023-02-13 08:19 After this fix, the only invoker of mtr_t::s_lock_space() will be the function fseg_page_is_allocated() (renamed from fseg_page_is_free() in MDEV-13542 ). On the surface, that mini-transaction should be fine, since it is only holding a shared fil_space_t::latch and a shared latch on an allocation bitmap page (n· innodb_page_size ). But, if any caller of that function is holding more latches (for example, CHECK TABLE without QUICK is, in btr_validate_level() ), we are prone to a deadlock. This possibility of a deadlock was introduced on the merge of MDEV-24569 to 10.6. We must replace mtr_t::s_lock_space() with mtr_t::x_lock_space() .

MariaDB Server

Deadlock due to updating InnoDB statistics

Details

Description

Attachments

Issue Links

Activity

People

Dates

Git Integration