Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-28454

Latching order violation (and hang) in ibuf_insert_low

    XMLWordPrintable

    Details

      Description

      A hang was found during the stress test of a 10.6-based branch:

      10.6 03e703fd74b43ffd27e858e03e9d147029de086e

      (gdb) thread apply 10 bt
       
      Thread 10 (Thread 0x7f4307d58700 (LWP 3829108)):
      #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
      #1  0x0000564bfa889c5f in srw_mutex_impl<true>::wait (this=0x564bfd654080 <lock_sys+64>, lk=2147483650) at /data/Server/bb-10.6-thiru/storage/innobase/sync/srw_lock.cc:239
      #2  0x0000564bfa8894a9 in srw_mutex_impl<true>::wait_and_lock (this=0x564bfd654080 <lock_sys+64>) at /data/Server/bb-10.6-thiru/storage/innobase/sync/srw_lock.cc:326
      #3  0x0000564bfa47a617 in srw_mutex_impl<true>::wr_lock (this=0x564bfd654080 <lock_sys+64>) at /data/Server/bb-10.6-thiru/storage/innobase/include/srw_lock.h:134
      #4  0x0000564bfa889b62 in ssux_lock_impl<true>::rd_wait (this=0x564bfd654080 <lock_sys+64>) at /data/Server/bb-10.6-thiru/storage/innobase/sync/srw_lock.cc:392
      #5  0x0000564bfa5872d7 in ssux_lock_impl<true>::rd_lock (this=0x564bfd654080 <lock_sys+64>) at /data/Server/bb-10.6-thiru/storage/innobase/include/srw_lock.h:247
      #6  0x0000564bfa5be834 in lock_sys_t::rd_lock (this=0x564bfd654040 <lock_sys>) at /data/Server/bb-10.6-thiru/storage/innobase/include/lock0lock.h:774
      #7  0x0000564bfa58f843 in LockGuard::LockGuard (this=0x7f4307d51df0, hash=@0x564bfd654098: {static LATCH = 1, static ELEMENTS_PER_LATCH = 7, static EMPTY_SLOTS_PER_LATCH = 0, n_cells = 6599, array = 0x630000020400}, id=<error reading variable: Cannot access memory at address 0xfffffffffffffdc0>) at /data/Server/bb-10.6-thiru/storage/innobase/lock/lock0lock.cc:197
      #8  0x0000564bfa5783eb in ibuf_insert_low (mode=36, op=IBUF_OP_INSERT, no_counter=0, entry=0x61600b1ca6d8, entry_size=13, index=0x616007a1bb08, page_id={m_id = 98784247875}, zip_size=0, thr=0x621000b114d8) at /data/Server/bb-10.6-thiru/storage/innobase/ibuf/ibuf0ibuf.cc:3303
      #9  0x0000564bfa579b8a in ibuf_insert (op=IBUF_OP_INSERT, entry=0x61600b1ca6d8, index=0x616007a1bb08, page_id={m_id = 98784247875}, zip_size=0, thr=0x621000b114d8) at /data/Server/bb-10.6-thiru/storage/innobase/ibuf/ibuf0ibuf.cc:3566
      (gdb) thread apply 8 bt
      Thread 8 (Thread 0x7f43060f7700 (LWP 3829133)):
      #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
      #1  0x0000564bfa889c5f in srw_mutex_impl<true>::wait (this=0x7f433bcb1398, lk=2147483650) at /data/Server/bb-10.6-thiru/storage/innobase/sync/srw_lock.cc:239
      #2  0x0000564bfa8894a9 in srw_mutex_impl<true>::wait_and_lock (this=0x7f433bcb1398) at /data/Server/bb-10.6-thiru/storage/innobase/sync/srw_lock.cc:326
      #3  0x0000564bfa47a617 in srw_mutex_impl<true>::wr_lock (this=0x7f433bcb1398) at /data/Server/bb-10.6-thiru/storage/innobase/include/srw_lock.h:134
      #4  0x0000564bfa889b62 in ssux_lock_impl<true>::rd_wait (this=0x7f433bcb1398) at /data/Server/bb-10.6-thiru/storage/innobase/sync/srw_lock.cc:392
      #5  0x0000564bfa5872d7 in ssux_lock_impl<true>::rd_lock (this=0x7f433bcb1398) at /data/Server/bb-10.6-thiru/storage/innobase/include/srw_lock.h:247
      #6  0x0000564bfa65bb9a in sux_lock<ssux_lock_impl<true> >::s_lock (this=0x7f433bcb1398) at /data/Server/bb-10.6-thiru/storage/innobase/include/sux_lock.h:362
      #7  0x0000564bfa651bcd in mtr_t::page_lock (this=0x7f43060ef570, block=0x7f433bcb1380, rw_latch=1) at /data/Server/bb-10.6-thiru/storage/innobase/mtr/mtr0mtr.cc:1240
      #8  0x0000564bfa9dcadf in buf_page_get_low (page_id={m_id = 4}, zip_size=0, rw_latch=1, guess=0x7f433bcb1380, mode=10, mtr=0x7f43060ef570, err=0x7f43060ee150, allow_ibuf_merge=false) at /data/Server/bb-10.6-thiru/storage/innobase/buf/buf0buf.cc:2974
      #9  0x0000564bfa9dd366 in buf_page_get_gen (page_id={m_id = 4}, zip_size=0, rw_latch=1, guess=0x7f433bcb1380, mode=10, mtr=0x7f43060ef570, err=0x7f43060ee150, allow_ibuf_merge=false) at /data/Server/bb-10.6-thiru/storage/innobase/buf/buf0buf.cc:3045
      #10 0x0000564bfa9779f4 in btr_cur_search_to_nth_level_func (index=0x61600003f408, level=0, tuple=0x61600aa1af08, mode=PAGE_CUR_GE, latch_mode=2, cursor=0x7f43060ef400, ahi_latch=0x0, mtr=0x7f43060ef570, autoinc=0) at /data/Server/bb-10.6-thiru/storage/innobase/btr/btr0cur.cc:1602
      #11 0x0000564bfa9a91d8 in btr_pcur_open_low (index=0x61600003f408, level=0, tuple=0x61600aa1af08, mode=PAGE_CUR_GE, latch_mode=2, cursor=0x7f43060ef400, autoinc=0, mtr=0x7f43060ef570) at /data/Server/bb-10.6-thiru/storage/innobase/include/btr0pcur.inl:369
      #12 0x0000564bfa9ad9e4 in btr_pcur_open_on_user_rec (index=0x61600003f408, tuple=0x61600aa1af08, mode=PAGE_CUR_GE, latch_mode=2, cursor=0x7f43060ef400, mtr=0x7f43060ef570) at /data/Server/bb-10.6-thiru/storage/innobase/btr/btr0pcur.cc:660
      #13 0x0000564bfa580d78 in ibuf_delete_for_discarded_space (space=42) at /data/Server/bb-10.6-thiru/storage/innobase/ibuf/ibuf0ibuf.cc:4477
      #14 0x0000564bfaaf3fcd in fil_delete_tablespace (id=42) at /data/Server/bb-10.6-thiru/storage/innobase/fil/fil0fil.cc:1733
      #15 0x0000564bfaadff45 in trx_t::commit (this=0x7f433ca85740, deleted=std::vector of length 0, capacity 0) at /data/Server/bb-10.6-thiru/storage/innobase/dict/drop.cc:271
      #16 0x0000564bfa441bba in ha_innobase::delete_table (this=0x62b000168c38, name=0x7f43060f21f0 "./test/B") at /data/Server/bb-10.6-thiru/storage/innobase/handler/ha_innodb.cc:13717
      

      Thread 8 is holding exclusive lock_sys.latch while waiting for a page latch on the change buffer root page (page 4 in the system tablespace). Thread 10 is waiting for a shared lock_sys.latch while holding a write latch on page 4.

      The acquisition of lock_sys.latch in ibuf_insert_low() is part of a check to see if change buffering is safe to be attempted. We should use a trylock operation there and refuse change buffering if the latches are not available without waiting.

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              marko Marko Mäkelä
              Reporter:
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.