MariaDB Server / MDEV-26781

InnoDB hangs on SPATIAL INDEX when using SUX_LOCK_GENERIC

    Details

      Description

      The test innodb_gis.rtree_purge hangs easily when InnoDB is built with the fallback SUX_LOCK_GENERIC implementation, which is the only option on operating systems for which we lack a futex-like interface. To reproduce:

      cmake -DCMAKE_CXX_FLAGS=-DSUX_LOCK_GENERIC .
      cmake --build .
      ./mtr --parallel=auto --repeat=10 innodb_gis.rtree_purge
      

      This appears to be a deadlock involving DML and purge. An INSERT statement is waiting for an exclusive latch:

      #9  mtr_t::x_lock (this=0x7f4e5212b720, file=0x55d1d4ee3728 "/mariadb/10.6/storage/innobase/btr/btr0cur.cc", line=1461, lock=0x7f4de40f7808) at /mariadb/10.6/storage/innobase/include/mtr0mtr.h:240
      #10 0x000055d1d49e7e9d in btr_cur_search_to_nth_level_func (index=index@entry=0x7f4de40f7698, level=level@entry=0, tuple=tuple@entry=0x7f4de41a19b8, mode=mode@entry=PAGE_CUR_RTREE_INSERT, latch_mode=<optimized out>, latch_mode@entry=33, cursor=cursor@entry=0x7f4e5212b420, ahi_latch=0x0, mtr=0x7f4e5212b720, autoinc=0) at /mariadb/10.6/storage/innobase/btr/btr0cur.cc:1461
      #33 0x000055d1d40d1aad in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7f4de4000d48, packet=0x7f4e5212d400 "\377\377\377\377", packet@entry=0x7f4de4107f59 "insert into t select @p,@p from seq_1_to_130", packet_length=packet_length@entry=44, blocking=blocking@entry=true) at /mariadb/10.6/sql/sql_parse.cc:1896
      

      At the same time, a purge task is attempting to acquire a shared latch on the same lock object (lock=0x7f4de40f7808 in both stack traces):

      #9  0x000055d1d49e7ae1 in mtr_t::s_lock (lock=0x7f4de40f7808, line=1505, file=0x55d1d4ee3728 "/mariadb/10.6/storage/innobase/btr/btr0cur.cc", this=0x7f4e427fb170) at /mariadb/10.6/storage/innobase/include/mtr0mtr.h:229
      #10 btr_cur_search_to_nth_level_func (index=index@entry=0x7f4de40f7698, level=level@entry=0, tuple=tuple@entry=0x7f4df8004b18, mode=mode@entry=PAGE_CUR_RTREE_LOCATE, latch_mode=<optimized out>, latch_mode@entry=2, cursor=cursor@entry=0x7f4e427faee0, ahi_latch=0x0, mtr=0x7f4e427fb170, autoinc=0) at /mariadb/10.6/storage/innobase/btr/btr0cur.cc:1505
      #11 0x000055d1d4b045a4 in rtr_pcur_open (index=index@entry=0x7f4de40f7698, tuple=tuple@entry=0x7f4df8004b18, mode=mode@entry=PAGE_CUR_RTREE_LOCATE, latch_mode=latch_mode@entry=2, cursor=cursor@entry=0x7f4e427faee0, mtr=mtr@entry=0x7f4e427fb170) at /mariadb/10.6/storage/innobase/gis/gis0sea.cc:574
      #12 0x000055d1d491e118 in row_search_index_entry (index=index@entry=0x7f4de40f7698, entry=entry@entry=0x7f4df8004b18, mode=mode@entry=2, pcur=pcur@entry=0x7f4e427faee0, mtr=mtr@entry=0x7f4e427fb170) at /mariadb/10.6/storage/innobase/row/row0row.cc:1300
      #13 0x000055d1d490de0c in row_purge_remove_sec_if_poss_leaf (node=node@entry=0x55d1d693c068, index=index@entry=0x7f4de40f7698, entry=entry@entry=0x7f4df8004b18) at /mariadb/10.6/storage/innobase/row/row0purge.cc:524
      

      With the futex-based implementation, this works fine. That is, MariaDB running on Linux, OpenBSD, or Microsoft Windows should not be affected.

      An easy fix could be to compose ssux_lock out of two std::atomic fields (as in MDEV-25404) also when SUX_LOCK_GENERIC is used.
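To illustrate the general idea of a latch composed of two std::atomic fields, here is a minimal sketch. It is not the InnoDB code and not the MDEV-25404 implementation: the class name two_word_sux_lock, its member names, the WRITER flag value, and demo_two_word_lock() are all hypothetical, and waiting is done by yielding in a loop, where a real implementation would block on a futex or, with SUX_LOCK_GENERIC, presumably on a mutex and condition variable. One word serializes update and exclusive requests; the other counts shared holders and carries a flag that keeps new readers out while an exclusive request is pending.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical sketch of a shared/update/exclusive latch built from two atomics.
// All operations use the default sequentially consistent memory order.
class two_word_sux_lock
{
  static constexpr uint32_t WRITER = 1U << 31;  // flag bit in "readers" (assumed layout)
  std::atomic<uint32_t> writer{0};   // 0 = free, 1 = held by an update or exclusive holder
  std::atomic<uint32_t> readers{0};  // number of shared holders, plus the WRITER flag

  void writer_acquire()
  {
    uint32_t expected = 0;
    while (!writer.compare_exchange_weak(expected, 1))
    {
      expected = 0;                  // a failed CAS overwrites "expected"; reset it
      std::this_thread::yield();
    }
  }

public:
  // Update mode: excludes other update/exclusive holders, admits readers.
  void u_lock() { writer_acquire(); }
  void u_unlock() { writer.store(0); }

  // Exclusive mode: take the writer word, block new readers, drain old ones.
  void wr_lock()
  {
    writer_acquire();
    readers.fetch_or(WRITER);
    while (readers.load() != WRITER)
      std::this_thread::yield();
  }
  void wr_unlock()
  {
    readers.fetch_and(~WRITER);      // readers == WRITER here, so this yields 0
    writer.store(0);
  }

  // Shared mode: increment the count only while the WRITER flag is clear.
  bool rd_lock_try()
  {
    uint32_t r = readers.load();
    while (!(r & WRITER))
      if (readers.compare_exchange_weak(r, r + 1))
        return true;
    return false;
  }
  void rd_lock()
  {
    while (!rd_lock_try())
      std::this_thread::yield();
  }
  void rd_unlock() { readers.fetch_sub(1); }
};

// Hypothetical demo: 4 threads increment a plain counter under the exclusive
// latch while 2 threads read it under the shared latch.
inline uint64_t demo_two_word_lock()
{
  two_word_sux_lock latch;
  uint64_t counter = 0;

  latch.wr_lock();
  bool got_shared = latch.rd_lock_try();  // must fail while exclusively held
  assert(!got_shared);
  (void) got_shared;
  latch.wr_unlock();

  std::vector<std::thread> threads;
  for (int t = 0; t < 4; t++)
    threads.emplace_back([&] {
      for (int i = 0; i < 10000; i++)
      {
        latch.wr_lock();
        counter++;
        latch.wr_unlock();
      }
    });
  for (int t = 0; t < 2; t++)
    threads.emplace_back([&] {
      for (int i = 0; i < 10000; i++)
      {
        latch.rd_lock();
        uint64_t seen = counter;     // read under the shared latch
        (void) seen;
        latch.rd_unlock();
      }
    });
  for (auto &th : threads)
    th.join();
  return counter;
}
```

Because a pending exclusive request sets the WRITER flag before waiting for readers to drain, a shared request that arrives later backs off instead of extending the wait. Whether that property is what resolves the hang reported here is not established by this sketch.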

              People

              Assignee:
              marko Marko Mäkelä
              Reporter:
              marko Marko Mäkelä