Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-30638

Deadlock due to updating InnoDB statistics

    XMLWordPrintable

Details

    Description

      This InnoDB deadlock came up while mleich was testing MDEV-26055 and MDEV-26827. Many threads are blocked waiting on a fil_space_t::latch. The actual deadlock seems to involve buffer page latches and the tablespace latch in at least 3 threads. Two of the threads are attempting to update InnoDB statistics, waiting for a buffer page latch while holding shared fil_space_t::latch:

      #7  0x0000558ac484ce42 in buf_page_get_gen (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=4, guess=guess@entry=0x0, mode=mode@entry=16, mtr=mtr@entry=0x7f98ea884950, err=0x0, allow_ibuf_merge=false) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/buf/buf0buf.cc:2952
      #8  0x0000558ac49e2e18 in fseg_inode_try_get (header=header@entry=0x7f9913bb4054 "", space=<optimized out>, zip_size=<optimized out>, mtr=mtr@entry=0x7f98ea884950, block=block@entry=0x7f98ea884740, err=err@entry=0x0) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/include/buf0types.h:95
      #9  0x0000558ac49e6cf0 in fseg_n_reserved_pages (block=..., header=header@entry=0x7f9913bb4054 "", used=used@entry=0x7f98ea884850, mtr=mtr@entry=0x7f98ea884950) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/include/buf0types.h:131
      #10 0x0000558ac4976846 in dict_stats_update_transient_for_index (index=index@entry=0x616004011a08) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/include/fil0fil.h:1072
      

      One thread is waiting for an exclusive tablespace latch while holding several buffer page latches:

      #5  mtr_t::x_lock_space (this=this@entry=0x7f98eb9e1d60, space=space@entry=0x6120000523c0) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/mtr/mtr0mtr.cc:521
      #6  0x0000558ac49eb144 in fsp_reserve_free_extents (n_reserved=n_reserved@entry=0x7f98eb9e1750, space=0x6120000523c0, n_ext=<optimized out>, alloc_type=alloc_type@entry=FSP_NORMAL, mtr=mtr@entry=0x7f98eb9e1d60, n_pages=n_pages@entry=2) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/fsp/fsp0fsp.cc:2398
      #7  0x0000558ac47be2ca in btr_cur_pessimistic_insert (flags=flags@entry=0, cursor=cursor@entry=0x7f98eb9e1960, offsets=offsets@entry=0x7f98eb9e18e0, heap=heap@entry=0x7f98eb9e18c0, entry=entry@entry=0x616003f5da08, rec=rec@entry=0x7f98eb9e1900, big_rec=<optimized out>, n_ext=0, thr=<optimized out>, mtr=<optimized out>) at /data/Server/bb-10.6-MDEV-26055-2-MDEV-26827A/storage/innobase/include/dict0mem.h:1216
      

      I see that before MDEV-29883, dict_stats_update_transient_for_index() used to hold an exclusive tablespace latch, instead of a shared one. It seems to me that this deadlock would have bene possible even without that change. Possibly the probability of the deadlock could have been increased.

      I also see that before MDEV-29883, dict_stats_update_transient_for_index() only used to hold a shared index latch. It would seem to me that acquiring an even stronger index latch would prevent the observed deadlock:

      diff --git a/storage/innobase/dict/dict0stats.cc b/storage/innobase/dict/dict0stats.cc
      index 0d630583daa..101424d0f44 100644
      --- a/storage/innobase/dict/dict0stats.cc
      +++ b/storage/innobase/dict/dict0stats.cc
      @@ -1473,7 +1473,7 @@ dict_stats_update_transient_for_index(
       		mtr_t	mtr;
       
       		mtr.start();
      -		mtr_sx_lock_index(index, &mtr);
      +		mtr_x_lock_index(index, &mtr);
       
       		dberr_t err;
       		buf_block_t* root = btr_root_block_get(index, RW_SX_LATCH,
      

      Because a pessimistic insert would start by acquiring an index latch, it would not be able to deadlock with this mini-transaction anymore.

      Attachments

        Issue Links

          Activity

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              1 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.