Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-30638

Deadlock due to updating InnoDB statistics

Details

    Description

      This InnoDB deadlock came up while mleich was testing MDEV-26055 and MDEV-26827. Many threads are blocked waiting on a fil_space_t::latch. The actual deadlock seems to involve buffer page latches and the tablespace latch in at least 3 threads. Two of the threads are attempting to update InnoDB statistics, waiting for a buffer page latch while holding shared fil_space_t::latch:

      #7  0x0000558ac484ce42 in buf_page_get_gen (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=4, guess=guess@entry=0x0, mode=mode@entry=16, mtr=mtr@entry=0x7f98ea884950, err=0x0, allow_ibuf_merge=false) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/buf/buf0buf.cc:2952
      #8  0x0000558ac49e2e18 in fseg_inode_try_get (header=header@entry=0x7f9913bb4054 "", space=<optimized out>, zip_size=<optimized out>, mtr=mtr@entry=0x7f98ea884950, block=block@entry=0x7f98ea884740, err=err@entry=0x0) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/include/buf0types.h:95
      #9  0x0000558ac49e6cf0 in fseg_n_reserved_pages (block=..., header=header@entry=0x7f9913bb4054 "", used=used@entry=0x7f98ea884850, mtr=mtr@entry=0x7f98ea884950) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/include/buf0types.h:131
      #10 0x0000558ac4976846 in dict_stats_update_transient_for_index (index=index@entry=0x616004011a08) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/include/fil0fil.h:1072
      

      One thread is waiting for an exclusive tablespace latch while holding several buffer page latches:

      #5  mtr_t::x_lock_space (this=this@entry=0x7f98eb9e1d60, space=space@entry=0x6120000523c0) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/mtr/mtr0mtr.cc:521
      #6  0x0000558ac49eb144 in fsp_reserve_free_extents (n_reserved=n_reserved@entry=0x7f98eb9e1750, space=0x6120000523c0, n_ext=<optimized out>, alloc_type=alloc_type@entry=FSP_NORMAL, mtr=mtr@entry=0x7f98eb9e1d60, n_pages=n_pages@entry=2) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/fsp/fsp0fsp.cc:2398
      #7  0x0000558ac47be2ca in btr_cur_pessimistic_insert (flags=flags@entry=0, cursor=cursor@entry=0x7f98eb9e1960, offsets=offsets@entry=0x7f98eb9e18e0, heap=heap@entry=0x7f98eb9e18c0, entry=entry@entry=0x616003f5da08, rec=rec@entry=0x7f98eb9e1900, big_rec=<optimized out>, n_ext=0, thr=<optimized out>, mtr=<optimized out>) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/include/dict0mem.h:1216
      

      I see that before MDEV-29883, dict_stats_update_transient_for_index() used to hold an exclusive tablespace latch, instead of a shared one. It seems to me that this deadlock would have bene possible even without that change. Possibly the probability of the deadlock could have been increased.

      I also see that before MDEV-29883, dict_stats_update_transient_for_index() only used to hold a shared index latch. It would seem to me that acquiring an even stronger index latch would prevent the observed deadlock:

      diff --git a/storage/innobase/dict/dict0stats.cc b/storage/innobase/dict/dict0stats.cc
      index 0d630583daa..101424d0f44 100644
      --- a/storage/innobase/dict/dict0stats.cc
      +++ b/storage/innobase/dict/dict0stats.cc
      @@ -1473,7 +1473,7 @@ dict_stats_update_transient_for_index(
       		mtr_t	mtr;
       
       		mtr.start();
      -		mtr_sx_lock_index(index, &mtr);
      +		mtr_x_lock_index(index, &mtr);
       
       		dberr_t err;
       		buf_block_t* root = btr_root_block_get(index, RW_SX_LATCH,
      

      Because a pessimistic insert would start by acquiring an index latch, it would not be able to deadlock with this mini-transaction anymore.

      Attachments

        Issue Links

          Activity

            Sorry, it is not that simple. Acquiring a stronger index latch does not help this deadlock. The INSERT operation is actually holding an index latch in a mode that would conflict with the statistics operation, if the statistics operation were holding a latch on the same index.

            The statistics can be updated for different indexes in the same tablespace than the one that the INSERT is operating on. To prevent this hang, one possible fix could be that dict_stats_update_transient_for_index() and possibly other functions (dict_index_t::clear(), btr_get_size_and_reserved(), dict_stats_save_defrag_stats(), dict_stats_analyze_index()) would acquire a latch on every index of the table, like row_quiesce_set_state() does it, and only then proceeds to acquire a fil_space_t::latch. This approach would still fail to prevent a hang if a table is stored in the InnoDB system tablespace.

            marko Marko Mäkelä added a comment - Sorry, it is not that simple. Acquiring a stronger index latch does not help this deadlock. The INSERT operation is actually holding an index latch in a mode that would conflict with the statistics operation, if the statistics operation were holding a latch on the same index . The statistics can be updated for different indexes in the same tablespace than the one that the INSERT is operating on. To prevent this hang, one possible fix could be that dict_stats_update_transient_for_index() and possibly other functions ( dict_index_t::clear() , btr_get_size_and_reserved() , dict_stats_save_defrag_stats() , dict_stats_analyze_index() ) would acquire a latch on every index of the table, like row_quiesce_set_state() does it, and only then proceeds to acquire a fil_space_t::latch . This approach would still fail to prevent a hang if a table is stored in the InnoDB system tablespace.

            The relevant latching order here should be as follows:

            1. Acquire dict_index_t::lock in non-shared mode.
            2. Acquire the index root page latch in non-shared mode.
            3. Possibly acquire further index page latches in accordance with the WL#6326 rules explained in MDEV-29835.
            4. Acquire the tablespace latch.
            5. Acquire latches on the allocation data structures. These are protected by the tablespace latch.
            6. Possibly allocate and write some pages.

            Currently, dict_stats_update_transient_for_index() and possibly other functions fail to acquire the index root page latch before the tablespace latch. Swapping the order of acquisition ought to fix this. In the core dump that I analyzed, dict_stats_update_transient_for_index() was holding a tablespace latch and waiting for the index root page latch.

            marko Marko Mäkelä added a comment - The relevant latching order here should be as follows: Acquire dict_index_t::lock in non-shared mode. Acquire the index root page latch in non-shared mode. Possibly acquire further index page latches in accordance with the WL#6326 rules explained in MDEV-29835 . Acquire the tablespace latch. Acquire latches on the allocation data structures. These are protected by the tablespace latch. Possibly allocate and write some pages. Currently, dict_stats_update_transient_for_index() and possibly other functions fail to acquire the index root page latch before the tablespace latch. Swapping the order of acquisition ought to fix this. In the core dump that I analyzed, dict_stats_update_transient_for_index() was holding a tablespace latch and waiting for the index root page latch.

            It turns out that dict_stats_update_transient_for_index() and other code that I examined is already following the correct latching order as described above. What looks incorrect is that we are only acquiring a shared tablespace latch. This will wrongly allow similar code to be executed for other indexes in the same tablespace. In the core dump of a hang that I analyzed, we have a wait for some metadata allocation page inside fseg_n_reserved_pages(). If all threads that access a tablespace allocation metadata pages are holding a non-shared latch on the tablespace, they would conflict with each other at a higher level and these deadlocks would be impossible.

            marko Marko Mäkelä added a comment - It turns out that dict_stats_update_transient_for_index() and other code that I examined is already following the correct latching order as described above. What looks incorrect is that we are only acquiring a shared tablespace latch. This will wrongly allow similar code to be executed for other indexes in the same tablespace. In the core dump of a hang that I analyzed, we have a wait for some metadata allocation page inside fseg_n_reserved_pages() . If all threads that access a tablespace allocation metadata pages are holding a non-shared latch on the tablespace, they would conflict with each other at a higher level and these deadlocks would be impossible.

            Ironically, this hang was introduced with the fix of MDEV-29883. To fix that bug, we should only have introduced some non-shared dict_index_t::lock acquisition. We should not have downgraded some exclusive fil_space_t::latch acquisition to a shared one in the following functions:

            • btr_get_size_and_reserved()
            • dict_stats_update_transient_for_index()
            • dict_stats_analyze_index()
            marko Marko Mäkelä added a comment - Ironically, this hang was introduced with the fix of MDEV-29883 . To fix that bug, we should only have introduced some non-shared dict_index_t::lock acquisition. We should not have downgraded some exclusive fil_space_t::latch acquisition to a shared one in the following functions: btr_get_size_and_reserved() dict_stats_update_transient_for_index() dict_stats_analyze_index()

            After this fix, the only invoker of mtr_t::s_lock_space() will be the function fseg_page_is_allocated() (renamed from fseg_page_is_free() in MDEV-13542). On the surface, that mini-transaction should be fine, since it is only holding a shared fil_space_t::latch and a shared latch on an allocation bitmap page (n·innodb_page_size). But, if any caller of that function is holding more latches (for example, CHECK TABLE without QUICK is, in btr_validate_level()), we are prone to a deadlock. This possibility of a deadlock was introduced on the merge of MDEV-24569 to 10.6. We must replace mtr_t::s_lock_space() with mtr_t::x_lock_space().

            marko Marko Mäkelä added a comment - After this fix, the only invoker of mtr_t::s_lock_space() will be the function fseg_page_is_allocated() (renamed from fseg_page_is_free() in MDEV-13542 ). On the surface, that mini-transaction should be fine, since it is only holding a shared fil_space_t::latch and a shared latch on an allocation bitmap page (n· innodb_page_size ). But, if any caller of that function is holding more latches (for example, CHECK TABLE without QUICK is, in btr_validate_level() ), we are prone to a deadlock. This possibility of a deadlock was introduced on the merge of MDEV-24569 to 10.6. We must replace mtr_t::s_lock_space() with mtr_t::x_lock_space() .

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              1 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.