Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33431

Latching order violation reported fil_system.sys_space.latch and ibuf_pessimistic_insert_mutex

    XMLWordPrintable

Details

    Description

      While testing MDEV-33383, mleich provided an rr replay trace for a lock order inversion in ibuf_remove_free_page(), which was invoked by ibuf_free_excess_pages() and doing the following:

      10.5 1d74927c58cea438c135a95886a9224405819a96 ibuf0ibuf.cc

      1913
      	log_free_check();
      1914
       
      1915
      	mtr_start(&mtr);
      1916
      	/* Acquire the fsp latch before the ibuf header, obeying the latching
      1917
      	order */
      1918
       
      1919
      	mtr_x_lock_space(fil_system.sys_space, &mtr);
      1920
      	header_page = ibuf_header_page_get(&mtr);
      1921
       
      1922
      	/* Prevent pessimistic inserts to insert buffer trees for a while */
      1923
      	ibuf_enter(&mtr);
      1924
      	mutex_enter(&ibuf_pessimistic_insert_mutex);
      

      At the start of this code, the thread is not holding any latches. The complaint is about lock order inversion between ibuf_pessimistic_insert_mutex and fil_system.sys_space.latch. Starting with 10.6 there will be no such warning, because the debug code was removed as part of MDEV-24142 and MDEV-21452. As far as I can tell, the function ibuf_insert_low() will acquire ibuf_pessimistic_insert_mutex first, and then descend into the tree (btr_pcur_open() for performing the multi-page insert.

      However, because this is a change buffer operation, it looks like any new pages would have been preallocated, to be delivered by btr_page_alloc_for_ibuf(). Normal page allocation would acquire fil_system.sys_space.latch, which would be a genuine lock order inversion.

      The crash report looked like this:

      10.5-MDEV-33383 cbb55ca3ce41d71907a9e0ee66767e829416925e

      #5  0x000055abe43f1aeb in ut_dbg_assertion_failed (
          expr=expr@entry=0x55abe49d8df0 "latches->empty() || level == SYNC_LEVEL_VARYING || level == SYNC_NO_ORDER_CHECK || latches->back().get_level() == SYNC_NO_ORDER_CHECK || latches->back().m_latch->get_level() == SYNC_LEVEL_VARYING || l"..., file=file@entry=0x55abe49d80e0 "/data/Server/10.5-MDEV-33383/storage/innobase/sync/sync0debug.cc", line=line@entry=215) at /data/Server/10.5-MDEV-33383/storage/innobase/ut/ut0dbg.cc:60
      #6  0x000055abe43aad95 in LatchDebug::lock_validate (this=this@entry=0x55abe66dcc00, latch=latch@entry=0x7f70c00619b0, level=SYNC_IBUF_PESS_INSERT_MUTEX) at /data/Server/10.5-MDEV-33383/storage/innobase/sync/sync0debug.cc:215
      #7  0x000055abe43a781d in sync_check_lock_validate (latch=latch@entry=0x7f70c00619b0) at /data/Server/10.5-MDEV-33383/storage/innobase/sync/sync0debug.cc:1038
      #8  0x000055abe41acfb1 in MutexDebug<TTASEventMutex<GenericPolicy> >::enter (this=this@entry=0x55abe591f830 <ibuf_pessimistic_insert_mutex+16>, mutex=...,
          filename=filename@entry=0x55abe499c290 "/data/Server/10.5-MDEV-33383/storage/innobase/ibuf/ibuf0ibuf.cc", line=line@entry=1924) at /data/Server/10.5-MDEV-33383/storage/innobase/include/sync0types.h:383
      #9  0x000055abe41ad6fe in PolicyMutex<TTASEventMutex<GenericPolicy> >::enter (this=0x55abe591f820 <ibuf_pessimistic_insert_mutex>, n_spins=30, n_delay=4,
          name=0x55abe499c290 "/data/Server/10.5-MDEV-33383/storage/innobase/ibuf/ibuf0ibuf.cc", line=1924) at /data/Server/10.5-MDEV-33383/storage/innobase/include/ib0mutex.h:588
      #10 0x000055abe420ea50 in ibuf_remove_free_page () at /data/Server/10.5-MDEV-33383/storage/innobase/ibuf/ibuf0ibuf.cc:1924
      #11 0x000055abe420f445 in ibuf_free_excess_pages () at /data/Server/10.5-MDEV-33383/storage/innobase/ibuf/ibuf0ibuf.cc:2026
      #12 0x000055abe42f6850 in row_ins_sec_index_entry (index=index@entry=0x7f70ac0bfce0, entry=entry@entry=0x7f70ac066578, thr=thr@entry=0x7f70ac0c4820, check_foreign=check_foreign@entry=true)
          at /data/Server/10.5-MDEV-33383/storage/innobase/row/row0ins.cc:3374
      #13 0x000055abe42f6982 in row_ins_index_entry (index=0x7f70ac0bfce0, entry=0x7f70ac066578, thr=thr@entry=0x7f70ac0c4820) at /data/Server/10.5-MDEV-33383/storage/innobase/row/row0ins.cc:3414
      #14 0x000055abe42f6ae5 in row_ins_index_entry_step (node=node@entry=0x7f70ac0c44f8, thr=thr@entry=0x7f70ac0c4820) at /data/Server/10.5-MDEV-33383/storage/innobase/row/row0ins.cc:3580
      #15 0x000055abe42f6bd5 in row_ins (node=node@entry=0x7f70ac0c44f8, thr=thr@entry=0x7f70ac0c4820) at /data/Server/10.5-MDEV-33383/storage/innobase/row/row0ins.cc:3717
      #16 0x000055abe42f76f5 in row_ins_step (thr=thr@entry=0x7f70ac0c4820) at /data/Server/10.5-MDEV-33383/storage/innobase/row/row0ins.cc:3860
      #17 0x000055abe431a539 in row_insert_for_mysql (mysql_rec=mysql_rec@entry=0x7f70ac0c0318 "\377ilfq", ' ' <repeats 56 times>, "\021)", prebuilt=0x7f70ac0c3f70, ins_mode=<optimized out>)
          at /data/Server/10.5-MDEV-33383/storage/innobase/row/row0mysql.cc:1394
      #18 0x000055abe4199038 in ha_innobase::write_row (this=0x7f70ac0c3730, record=0x7f70ac0c0318 "\377ilfq", ' ' <repeats 56 times>, "\021)") at /data/Server/10.5-MDEV-33383/storage/innobase/handler/ha_innodb.cc:7685
      

      To me, this would seem to be a bogus report about a potential hang.

      The change buffer was disabled by default in MDEV-27734.

      Attachments

        Issue Links

          Activity

            People

              debarun Debarun Banerjee
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.