Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33543

Server hang caused by InnoDB change buffer

Details

    Description

      The problem was found on
         origin/10.6-MDEV-33508 208f3ee34381f7c8a79d114f610c42bb53a5f394 2024-02-21T11:35:06+02:00
      Per Marko the official 10.6 would suffer from that too.
       
      Scenario:
      1. Start the server and generate some initial data.
      2. 33 "worker" run concurrent some DML mix
      3. Some RQG component checks periodic the processlist.
      pluto:/data/results/1708610110/HANG1/rqg.log
      Line 1322 (2024-02-22T16:08:28)  9 workers active, the other have disconnected
             They all run some INSERT IGNORE INTO ... (one row only).
      Line 1797 (2024-02-22T16:12:39) 9 workers active
             They all run the same statement like above.
             Only the time was going up by ~ 250s.
       
      State  somewhere between 16:12:39 and 16:16 
      pluto:/data/results/1708610110/HANG1$ gdb -c ./1/data/gcore.2353714 /data/Server_bin/10.6-MDEV-33508_debug_Og/bin/mariadbd
      State  somewhere between 16:16 and 16:19
      pluto:/data/results/1708610110/HANG1$ gdb -c ./1/data/core.2353714 /data/Server_bin/10.6-MDEV-33508_debug_Og/bin/mariadbd
       
      RQG
      ===
      # git clone https://github.com/mleich1/rqg --branch <pick the right branch> RQG
      #
      # GIT_SHOW: HEAD -> master, origin/master, origin/HEAD 24efdf20d844da7e26bf02b494515a503ef8e474 2024-01-18T19:39:19+01:00
      # rqg.pl  : Version 4.4.0 (2023-08)
      #
      # $RQG_HOME/rqg.pl \
      # --gendata=conf/mariadb/oltp.zz \
      # --max_gd_duration=900 \
      # --grammar=conf/mariadb/oltp.yy \
      # --mysqld=--transaction-isolation=REPEATABLE-READ \
      # --validator=SelectStability \
      # --mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \
      # --mysqld=--loose-idle_write_transaction_timeout=0 \
      # --mysqld=--loose-idle_transaction_timeout=0 \
      # --mysqld=--loose-idle_readonly_transaction_timeout=0 \
      # --mysqld=--connect_timeout=60 \
      # --mysqld=--interactive_timeout=28800 \
      # --mysqld=--slave_net_timeout=60 \
      # --mysqld=--net_read_timeout=30 \
      # --mysqld=--net_write_timeout=60 \
      # --mysqld=--loose-table_lock_wait_timeout=50 \
      # --mysqld=--wait_timeout=28800 \
      # --no_mask \
      # --queries=10000000 \
      # --seed=random \
      # --reporters=None \
      # --reporters=ErrorLog \
      # --reporters=Deadlock \
      # --validators=None \
      # --mysqld=--log_output=none \
      # --mysqld=--log_bin_trust_function_creators=1 \
      # --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
      # --engine=InnoDB \
      # --restart_timeout=240 \
      # --mysqld=--plugin-load-add=file_key_management.so \
      # --mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
      # --mysqld=--plugin-load-add=provider_lzo.so \
      # --mysqld=--plugin-load-add=provider_bzip2.so \
      # --mysqld=--plugin-load-add=provider_lzma.so \
      # --mysqld=--plugin-load-add=provider_snappy.so \
      # --mysqld=--plugin-load-add=provider_lz4.so \
      # --mysqld=--loose-innodb_compression_level=1 \
      # --duration=300 \
      # --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
      # --mysqld=--lock-wait-timeout=86400 \
      # --mysqld=--innodb-lock-wait-timeout=50 \    
      # --mysqld=--sql_mode=STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION \
      # --mysqld=--loose-innodb_read_only_compressed=OFF \
      # --mysqld=--innodb_stats_persistent=off \
      # --mysqld=--innodb_adaptive_hash_index=off \
      # --mysqld=--innodb_random_read_ahead=OFF \
      # --mysqld=--log-bin \
      # --mysqld=--sync-binlog=1 \
      # --mysqld=--loose-innodb_evict_tables_on_commit_debug=on \
      # --threads=33 \
      # --mysqld=--innodb_use_native_aio=1 \
      # --mysqld=--innodb_undo_log_truncate=OFF \
      # --mysqld=--loose_innodb_change_buffering=all \     # <============
      # --mysqld=--innodb_rollback_on_timeout=OFF \
      # --mysqld=--innodb_page_size=16K \
      # --mysqld=--innodb-buffer-pool-size=10M \
      # --vardir_type=slow \
      # <local settings>
      
      

      Attachments

        Issue Links

          Activity

            There is a deadlock between two threads that are buffering an insert at the same time:

            ssh pluto
            gdb -c /data/results/1708610110/HANG1/1/data/gcore.2353714 /data/Server_bin/10.6-MDEV-33508_debug_Og/bin/mariadbd
            

            10.6-MDEV-33508 208f3ee34381f7c8a79d114f610c42bb53a5f394

            Thread 12 (Thread 0x7f4cc99ed640 (LWP 2363768)):
            …
            #6  sux_lock<ssux_lock_impl<true> >::s_lock (this=this@entry=0x7f4cdc564a68) at /data/Server/10.6-MDEV-33508/storage/innobase/include/sux_lock.h:362
            #7  0x00005555dd5575f2 in buf_page_get_low (page_id=..., page_id@entry=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=8, guess=guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x7f4cc99ea070, err=0x7f4cc99e99ec, allow_ibuf_merge=false) at /data/Server/10.6-MDEV-33508/storage/innobase/buf/buf0buf.cc:2577
            #8  0x00005555dd559c06 in buf_page_get_gen (page_id=page_id@entry=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=8, guess=guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x7f4cc99ea070, err=0x7f4cc99e99ec, allow_ibuf_merge=false) at /data/Server/10.6-MDEV-33508/storage/innobase/buf/buf0buf.cc:2927
            #9  0x00005555dd52a694 in btr_latch_prev (block=block@entry=0x7f4cdc55f430, page_id=..., page_id@entry=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=RW_X_LATCH, mtr=mtr@entry=0x7f4cc99ea070, err=err@entry=0x7f4cc99e99ec) at /data/Server/10.6-MDEV-33508/storage/innobase/btr/btr0cur.cc:959
            #10 0x00005555dd52d03a in btr_cur_t::search_leaf (this=this@entry=0x7f4cc99e9f90, tuple=tuple@entry=0x7f4c9404f7b0, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=BTR_MODIFY_PREV, mtr=mtr@entry=0x7f4cc99ea070) at /data/Server/10.6-MDEV-33508/storage/innobase/btr/btr0cur.cc:1375
            #11 0x00005555dd331f5a in btr_pcur_open (mtr=0x7f4cc99ea070, cursor=0x7f4cc99e9f90, latch_mode=BTR_MODIFY_PREV, mode=PAGE_CUR_LE, tuple=0x7f4c9404f7b0) at /data/Server/10.6-MDEV-33508/storage/innobase/include/btr0pcur.h:431
            #12 ibuf_insert_low (mode=mode@entry=BTR_MODIFY_PREV, op=op@entry=IBUF_OP_INSERT, no_counter=no_counter@entry=0, entry=entry@entry=0x7f4c8c01a5a8, entry_size=entry_size@entry=13, index=index@entry=0x7f4ca4084070, page_id=..., zip_size=0, thr=0x7f4c8801d038) at /data/Server/10.6-MDEV-33508/storage/innobase/ibuf/ibuf0ibuf.cc:3195
            …
            Thread 27 (Thread 0x7f4cc9588640 (LWP 2363877)):
            …
            #5  sux_lock<ssux_lock_impl<true> >::x_lock_upgraded (this=0x7f4cdc55f448) at /data/Server/10.6-MDEV-33508/storage/innobase/include/sux_lock.h:428
            #6  buf_page_get_low (page_id=..., page_id@entry=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=<optimized out>, guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x7f4cc9585070, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.6-MDEV-33508/storage/innobase/buf/buf0buf.cc:2882
            #7  0x00005555dd559c06 in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=2, guess=guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x7f4cc9585070, err=0x7f4cc95849ec, allow_ibuf_merge=false) at /data/Server/10.6-MDEV-33508/storage/innobase/buf/buf0buf.cc:2927
            #8  0x00005555dd4ee074 in btr_block_get (index=..., page=387, mode=mode@entry=RW_X_LATCH, merge=merge@entry=false, mtr=mtr@entry=0x7f4cc9585070, err=err@entry=0x7f4cc95849ec, first=0x0) at /data/Server/10.6-MDEV-33508/storage/innobase/btr/btr0btr.cc:230
            #9  0x00005555dd52ca7e in btr_cur_t::search_leaf (this=this@entry=0x7f4cc9584f90, tuple=tuple@entry=0x7f4c50050f20, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=BTR_MODIFY_PREV, mtr=mtr@entry=0x7f4cc9585070) at /data/Server/10.6-MDEV-33508/storage/innobase/btr/btr0cur.cc:1378
            #10 0x00005555dd331f5a in btr_pcur_open (mtr=0x7f4cc9585070, cursor=0x7f4cc9584f90, latch_mode=BTR_MODIFY_PREV, mode=PAGE_CUR_LE, tuple=0x7f4c50050f20) at /data/Server/10.6-MDEV-33508/storage/innobase/include/btr0pcur.h:431
            #11 ibuf_insert_low (mode=mode@entry=BTR_MODIFY_PREV, op=op@entry=IBUF_OP_INSERT, no_counter=no_counter@entry=0, entry=entry@entry=0x7f4c340242a8, entry_size=entry_size@entry=13, index=index@entry=0x7f4ca407e4a0, page_id=..., zip_size=0, thr=0x7f4c340358f8) at /data/Server/10.6-MDEV-33508/storage/innobase/ibuf/ibuf0ibuf.cc:3195
            #12 0x00005555dd3336f1 in ibuf_insert (op=op@entry=IBUF_OP_INSERT, entry=entry@entry=0x7f4c340242a8, index=0x7f4ca407e4a0, page_id=page_id@entry=..., zip_size=zip_size@entry=0, thr=0x7f4c340358f8) at /data/Server/10.6-MDEV-33508/storage/innobase/ibuf/ibuf0ibuf.cc:3520
            

            Thread 12 is waiting for a latch on the change buffer page 0x182 in the system tablespace, which is held by Thread 27, which in turn is waiting for a latch on page 0x183, which is held by Thread 12.

            marko Marko Mäkelä added a comment - There is a deadlock between two threads that are buffering an insert at the same time: ssh pluto gdb -c /data/results/1708610110/HANG1/1/data/gcore.2353714 /data/Server_bin/10.6-MDEV-33508_debug_Og/bin/mariadbd 10.6-MDEV-33508 208f3ee34381f7c8a79d114f610c42bb53a5f394 Thread 12 (Thread 0x7f4cc99ed640 (LWP 2363768)): … #6 sux_lock<ssux_lock_impl<true> >::s_lock (this=this@entry=0x7f4cdc564a68) at /data/Server/10.6-MDEV-33508/storage/innobase/include/sux_lock.h:362 #7 0x00005555dd5575f2 in buf_page_get_low (page_id=..., page_id@entry=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=8, guess=guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x7f4cc99ea070, err=0x7f4cc99e99ec, allow_ibuf_merge=false) at /data/Server/10.6-MDEV-33508/storage/innobase/buf/buf0buf.cc:2577 #8 0x00005555dd559c06 in buf_page_get_gen (page_id=page_id@entry=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=8, guess=guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x7f4cc99ea070, err=0x7f4cc99e99ec, allow_ibuf_merge=false) at /data/Server/10.6-MDEV-33508/storage/innobase/buf/buf0buf.cc:2927 #9 0x00005555dd52a694 in btr_latch_prev (block=block@entry=0x7f4cdc55f430, page_id=..., page_id@entry=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=RW_X_LATCH, mtr=mtr@entry=0x7f4cc99ea070, err=err@entry=0x7f4cc99e99ec) at /data/Server/10.6-MDEV-33508/storage/innobase/btr/btr0cur.cc:959 #10 0x00005555dd52d03a in btr_cur_t::search_leaf (this=this@entry=0x7f4cc99e9f90, tuple=tuple@entry=0x7f4c9404f7b0, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=BTR_MODIFY_PREV, mtr=mtr@entry=0x7f4cc99ea070) at /data/Server/10.6-MDEV-33508/storage/innobase/btr/btr0cur.cc:1375 #11 0x00005555dd331f5a in btr_pcur_open (mtr=0x7f4cc99ea070, cursor=0x7f4cc99e9f90, latch_mode=BTR_MODIFY_PREV, mode=PAGE_CUR_LE, tuple=0x7f4c9404f7b0) at /data/Server/10.6-MDEV-33508/storage/innobase/include/btr0pcur.h:431 #12 ibuf_insert_low (mode=mode@entry=BTR_MODIFY_PREV, op=op@entry=IBUF_OP_INSERT, no_counter=no_counter@entry=0, entry=entry@entry=0x7f4c8c01a5a8, entry_size=entry_size@entry=13, index=index@entry=0x7f4ca4084070, page_id=..., zip_size=0, thr=0x7f4c8801d038) at /data/Server/10.6-MDEV-33508/storage/innobase/ibuf/ibuf0ibuf.cc:3195 … Thread 27 (Thread 0x7f4cc9588640 (LWP 2363877)): … #5 sux_lock<ssux_lock_impl<true> >::x_lock_upgraded (this=0x7f4cdc55f448) at /data/Server/10.6-MDEV-33508/storage/innobase/include/sux_lock.h:428 #6 buf_page_get_low (page_id=..., page_id@entry=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=<optimized out>, guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x7f4cc9585070, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.6-MDEV-33508/storage/innobase/buf/buf0buf.cc:2882 #7 0x00005555dd559c06 in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=2, guess=guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x7f4cc9585070, err=0x7f4cc95849ec, allow_ibuf_merge=false) at /data/Server/10.6-MDEV-33508/storage/innobase/buf/buf0buf.cc:2927 #8 0x00005555dd4ee074 in btr_block_get (index=..., page=387, mode=mode@entry=RW_X_LATCH, merge=merge@entry=false, mtr=mtr@entry=0x7f4cc9585070, err=err@entry=0x7f4cc95849ec, first=0x0) at /data/Server/10.6-MDEV-33508/storage/innobase/btr/btr0btr.cc:230 #9 0x00005555dd52ca7e in btr_cur_t::search_leaf (this=this@entry=0x7f4cc9584f90, tuple=tuple@entry=0x7f4c50050f20, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=BTR_MODIFY_PREV, mtr=mtr@entry=0x7f4cc9585070) at /data/Server/10.6-MDEV-33508/storage/innobase/btr/btr0cur.cc:1378 #10 0x00005555dd331f5a in btr_pcur_open (mtr=0x7f4cc9585070, cursor=0x7f4cc9584f90, latch_mode=BTR_MODIFY_PREV, mode=PAGE_CUR_LE, tuple=0x7f4c50050f20) at /data/Server/10.6-MDEV-33508/storage/innobase/include/btr0pcur.h:431 #11 ibuf_insert_low (mode=mode@entry=BTR_MODIFY_PREV, op=op@entry=IBUF_OP_INSERT, no_counter=no_counter@entry=0, entry=entry@entry=0x7f4c340242a8, entry_size=entry_size@entry=13, index=index@entry=0x7f4ca407e4a0, page_id=..., zip_size=0, thr=0x7f4c340358f8) at /data/Server/10.6-MDEV-33508/storage/innobase/ibuf/ibuf0ibuf.cc:3195 #12 0x00005555dd3336f1 in ibuf_insert (op=op@entry=IBUF_OP_INSERT, entry=entry@entry=0x7f4c340242a8, index=0x7f4ca407e4a0, page_id=page_id@entry=..., zip_size=zip_size@entry=0, thr=0x7f4c340358f8) at /data/Server/10.6-MDEV-33508/storage/innobase/ibuf/ibuf0ibuf.cc:3520 Thread 12 is waiting for a latch on the change buffer page 0x182 in the system tablespace, which is held by Thread 27, which in turn is waiting for a latch on page 0x183, which is held by Thread 12.

            I could analyze the issue and get to the root cause.

            When getting a page (buf_page_get_gen) with no latch option (RW_NO_LATCH), the caller is not expected to follow the B-tree latching order. However in buf_page_get_low we try to acquire shared page latch unconditionally to wait for a page that is being loaded by another thread concurrently. In general it could lead to latch order violation and deadlock.

            Currently it affects the change buffer insert path btr_latch_prev() which tries to load the previous page out of order with RW_NO_LATCH and two concurrent inserts into IBUF tree cause deadlock. This problem is introduced in 10.6 by MDEV-27058.

            The scenario can be precisely repeated by an mtr test.

            1. Have a IBUF tree with 2 leaf pages. left (L) and right (R).

            2. Evict the pages from buffer pool.

            3. Have two insert statements.

            • T1: Inserting into the left page L of IBUF tree
            • T2: Inserting into the right page R of IBUF tree

            4. Block T1 after fixing the left page for read without completing the IO.

            • L is now READ fixed and X latched by T1.

            5. Let T2 latch the right page and try to load the left page out of order with RW_NO_LATCH.

            • R is now X latched T2
            • T2 now waits for S latch on L

            6. Now unblock T1. T1 tries to lock the right page R in B-tree order.

            • T1 now waits for X latch on R

            It completes the deadlock chain T1 -> T2 ->T1

            I have the precise mtr test that repeats the issue. The test has several dependencies on concurrent background activity and IBUF tree structure and hence would be very hard to include in out regular test suite and maintain. I am attaching the test here for any developer to try a run and understand the issue.

            debarun Debarun Banerjee added a comment - I could analyze the issue and get to the root cause. When getting a page (buf_page_get_gen) with no latch option (RW_NO_LATCH), the caller is not expected to follow the B-tree latching order. However in buf_page_get_low we try to acquire shared page latch unconditionally to wait for a page that is being loaded by another thread concurrently. In general it could lead to latch order violation and deadlock. Currently it affects the change buffer insert path btr_latch_prev() which tries to load the previous page out of order with RW_NO_LATCH and two concurrent inserts into IBUF tree cause deadlock. This problem is introduced in 10.6 by MDEV-27058 . The scenario can be precisely repeated by an mtr test. 1. Have a IBUF tree with 2 leaf pages. left (L) and right (R). 2. Evict the pages from buffer pool. 3. Have two insert statements. T1: Inserting into the left page L of IBUF tree T2: Inserting into the right page R of IBUF tree 4. Block T1 after fixing the left page for read without completing the IO. L is now READ fixed and X latched by T1. 5. Let T2 latch the right page and try to load the left page out of order with RW_NO_LATCH. R is now X latched T2 T2 now waits for S latch on L 6. Now unblock T1. T1 tries to lock the right page R in B-tree order. T1 now waits for X latch on R It completes the deadlock chain T1 -> T2 ->T1 I have the precise mtr test that repeats the issue. The test has several dependencies on concurrent background activity and IBUF tree structure and hence would be very hard to include in out regular test suite and maintain. I am attaching the test here for any developer to try a run and understand the issue.
            debarun Debarun Banerjee added a comment - Please review. https://github.com/MariaDB/server/pull/3234

            The change that caused this problem was in in 10.6.6 by https://jira.mariadb.org/browse/MDEV-27058

            monty Michael Widenius added a comment - The change that caused this problem was in in 10.6.6 by https://jira.mariadb.org/browse/MDEV-27058

            People

              debarun Debarun Banerjee
              mleich Matthias Leich
              Votes:
              1 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.