Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-19022

InnoDB fails to clean up useless B-tree pages

Details

    Description

      The test case of MDEV-14126 demonstrates that InnoDB can end up with an index tree where a non-leaf page has only one child page. Such internal nodes are totally unused and should be removed when the B-tree is being shrunk in the first place. The functions btr_discard_only_page_on_level(), btr_lift_page_up(), or btr_compress() or their callers may have to be revised.

      Attachments

        Issue Links

          Activity

            The following patch seems to demonstrate the problem:

            diff --git a/storage/innobase/include/page0page.ic b/storage/innobase/include/page0page.ic
            index 224e5b08985..8cf93c0e7be 100644
            --- a/storage/innobase/include/page0page.ic
            +++ b/storage/innobase/include/page0page.ic
            @@ -174,6 +174,9 @@ page_header_set_field(
             	ut_ad(page);
             	ut_ad(field <= PAGE_N_RECS);
             	ut_ad(field != PAGE_N_RECS || val);
            +	ut_ad(field != PAGE_N_RECS || val > 1
            +	      || !mach_read_from_2(PAGE_N_RECS + PAGE_HEADER + page)
            +	      || page_is_leaf(page));
             	ut_ad(field == PAGE_N_HEAP || val < srv_page_size);
             	ut_ad(field != PAGE_N_HEAP || (val & 0x7fff) < srv_page_size);
             
            

            According to the stack trace, we are performing a pessimistic delete when rolling back an insert of a row that caused a duplicate key error, while executing a long statement

            INSERT /*! IGNORE */ INTO t1 VALUES  \n('2016', 'butterfly', 'domestic', NULL, 'd', LOAD_FILE('/mariadb/earth15kb.jpg'), …
            

            10.3

            #4  0x00005555561fb123 in page_header_set_field (page=0x7ffff0838000 "", 
                page_zip=0x7ffff0324750, field=16, val=1)
                at /mariadb/10.3/storage/innobase/include/page0page.ic:177
            #5  0x0000555556207234 in page_zip_dir_delete (page_zip=0x7ffff0324750, 
                rec=0x7ffff083809a "\200", index=0x7fff8c0729d8, offsets=0x7fff8d6323b8, 
                free=0x0) at /mariadb/10.3/storage/innobase/page/page0zip.cc:4531
            #6  0x00005555561e51da in page_mem_free (page=0x7ffff0838000 "", 
                page_zip=0x7ffff0324750, rec=0x7ffff083809a "\200", index=0x7fff8c0729d8, 
                offsets=0x7fff8d6323b8)
                at /mariadb/10.3/storage/innobase/include/page0page.ic:1048
            #7  0x00005555561ec4f0 in page_cur_delete_rec (cursor=0x7ffff00502f8, 
                index=0x7fff8c0729d8, offsets=0x7fff8d6323b8, mtr=0x7ffff0050920)
                at /mariadb/10.3/storage/innobase/page/page0cur.cc:2421
            #8  0x0000555556375b86 in btr_cur_pessimistic_delete (err=0x7ffff00502e4, 
                has_reserved_extents=1, cursor=0x7ffff00502f0, flags=16, rollback=false, 
                mtr=0x7ffff0050920) at /mariadb/10.3/storage/innobase/btr/btr0cur.cc:5833
            #9  0x000055555634badc in btr_node_ptr_delete (index=0x7fff8c0729d8, 
                block=0x7ffff0331fe0, mtr=0x7ffff0050920)
                at /mariadb/10.3/storage/innobase/btr/btr0btr.cc:3490
            #10 0x000055555634ef72 in btr_discard_page (cursor=0x7fff8cebc3d8, 
                mtr=0x7ffff0050920) at /mariadb/10.3/storage/innobase/btr/btr0btr.cc:4429
            #11 0x00005555563757fc in btr_cur_pessimistic_delete (err=0x7ffff00508b8, 
                has_reserved_extents=0, cursor=0x7fff8cebc3d8, flags=0, rollback=true, 
                mtr=0x7ffff0050920) at /mariadb/10.3/storage/innobase/btr/btr0cur.cc:5749
            #12 0x00005555564b05e4 in row_undo_ins_remove_clust_rec (node=0x7fff8cebc368)
                at /mariadb/10.3/storage/innobase/row/row0uins.cc:210
            #13 0x00005555564b17a8 in row_undo_ins (node=0x7fff8cebc368, 
                thr=0x7fff8c07b498) at /mariadb/10.3/storage/innobase/row/row0uins.cc:612
            #14 0x00005555562bd832 in row_undo (node=0x7fff8cebc368, thr=0x7fff8c07b498)
                at /mariadb/10.3/storage/innobase/row/row0undo.cc:301
            #15 0x00005555562bda94 in row_undo_step (thr=0x7fff8c07b498)
                at /mariadb/10.3/storage/innobase/row/row0undo.cc:360
            #16 0x0000555556214543 in que_thr_step (thr=0x7fff8c07b498)
                at /mariadb/10.3/storage/innobase/que/que0que.cc:1040
            #17 0x0000555556214797 in que_run_threads_low (thr=0x7fff8c07b498) at /mariadb/10.3/storage/innobase/que/que0que.cc:1104
            #18 0x0000555556214989 in que_run_threads (thr=0x7fff8c07b498) at /mariadb/10.3/storage/innobase/que/que0que.cc:1144
            #19 0x000055555631b550 in trx_rollback_to_savepoint_low (trx=0x7ffff0b4b0f0, savept=0x7ffff0051530) at /mariadb/10.3/storage/innobase/trx/trx0roll.cc:141
            #20 0x000055555631b905 in trx_rollback_to_savepoint (trx=0x7ffff0b4b0f0, savept=0x7ffff0051530) at /mariadb/10.3/storage/innobase/trx/trx0roll.cc:189
            #21 0x000055555626b2e5 in row_mysql_handle_errors (new_err=0x7ffff0051528, trx=0x7ffff0b4b0f0, thr=0x7fff8c2efdb0, savept=0x7ffff0051530) at /mariadb/10.3/storage/innobase/row/row0mysql.cc:735
            #22 0x000055555626d1a1 in row_insert_for_mysql (mysql_rec=0x7fff8c047828 "\b", prebuilt=0x7fff8c2edb18, ins_mode=ROW_INS_NORMAL) at /mariadb/10.3/storage/innobase/row/row0mysql.cc:1473
            #23 0x0000555556116534 in ha_innobase::write_row (this=0x7fff8c04e6f0, record=0x7fff8c047828 "\b") at /mariadb/10.3/storage/innobase/handler/ha_innodb.cc:8086
            

            An examination of the code suggests that btr_cur_pessimistic_delete() is not going to shrink the tree, although we should:

            			btr_discard_page(cursor, mtr);
             
            			ret = TRUE;
            			goto return_after_reservations;
            

            Ultimately, the mini-transaction will be committed in row_undo_ins_remove_clust_rec(), leaving the index tree in a corrupted state (with an internal page that does not have multiple child pages).

            marko Marko Mäkelä added a comment - The following patch seems to demonstrate the problem: diff --git a/storage/innobase/include/page0page.ic b/storage/innobase/include/page0page.ic index 224e5b08985..8cf93c0e7be 100644 --- a/storage/innobase/include/page0page.ic +++ b/storage/innobase/include/page0page.ic @@ -174,6 +174,9 @@ page_header_set_field( ut_ad(page); ut_ad(field <= PAGE_N_RECS); ut_ad(field != PAGE_N_RECS || val); + ut_ad(field != PAGE_N_RECS || val > 1 + || !mach_read_from_2(PAGE_N_RECS + PAGE_HEADER + page) + || page_is_leaf(page)); ut_ad(field == PAGE_N_HEAP || val < srv_page_size); ut_ad(field != PAGE_N_HEAP || (val & 0x7fff) < srv_page_size); According to the stack trace, we are performing a pessimistic delete when rolling back an insert of a row that caused a duplicate key error, while executing a long statement INSERT /*! IGNORE */ INTO t1 VALUES \n( '2016' , 'butterfly' , 'domestic' , NULL , 'd' , LOAD_FILE( '/mariadb/earth15kb.jpg' ), … 10.3 #4 0x00005555561fb123 in page_header_set_field (page=0x7ffff0838000 "", page_zip=0x7ffff0324750, field=16, val=1) at /mariadb/10.3/storage/innobase/include/page0page.ic:177 #5 0x0000555556207234 in page_zip_dir_delete (page_zip=0x7ffff0324750, rec=0x7ffff083809a "\200", index=0x7fff8c0729d8, offsets=0x7fff8d6323b8, free=0x0) at /mariadb/10.3/storage/innobase/page/page0zip.cc:4531 #6 0x00005555561e51da in page_mem_free (page=0x7ffff0838000 "", page_zip=0x7ffff0324750, rec=0x7ffff083809a "\200", index=0x7fff8c0729d8, offsets=0x7fff8d6323b8) at /mariadb/10.3/storage/innobase/include/page0page.ic:1048 #7 0x00005555561ec4f0 in page_cur_delete_rec (cursor=0x7ffff00502f8, index=0x7fff8c0729d8, offsets=0x7fff8d6323b8, mtr=0x7ffff0050920) at /mariadb/10.3/storage/innobase/page/page0cur.cc:2421 #8 0x0000555556375b86 in btr_cur_pessimistic_delete (err=0x7ffff00502e4, has_reserved_extents=1, cursor=0x7ffff00502f0, flags=16, rollback=false, mtr=0x7ffff0050920) at /mariadb/10.3/storage/innobase/btr/btr0cur.cc:5833 #9 0x000055555634badc in btr_node_ptr_delete (index=0x7fff8c0729d8, block=0x7ffff0331fe0, mtr=0x7ffff0050920) at /mariadb/10.3/storage/innobase/btr/btr0btr.cc:3490 #10 0x000055555634ef72 in btr_discard_page (cursor=0x7fff8cebc3d8, mtr=0x7ffff0050920) at /mariadb/10.3/storage/innobase/btr/btr0btr.cc:4429 #11 0x00005555563757fc in btr_cur_pessimistic_delete (err=0x7ffff00508b8, has_reserved_extents=0, cursor=0x7fff8cebc3d8, flags=0, rollback=true, mtr=0x7ffff0050920) at /mariadb/10.3/storage/innobase/btr/btr0cur.cc:5749 #12 0x00005555564b05e4 in row_undo_ins_remove_clust_rec (node=0x7fff8cebc368) at /mariadb/10.3/storage/innobase/row/row0uins.cc:210 #13 0x00005555564b17a8 in row_undo_ins (node=0x7fff8cebc368, thr=0x7fff8c07b498) at /mariadb/10.3/storage/innobase/row/row0uins.cc:612 #14 0x00005555562bd832 in row_undo (node=0x7fff8cebc368, thr=0x7fff8c07b498) at /mariadb/10.3/storage/innobase/row/row0undo.cc:301 #15 0x00005555562bda94 in row_undo_step (thr=0x7fff8c07b498) at /mariadb/10.3/storage/innobase/row/row0undo.cc:360 #16 0x0000555556214543 in que_thr_step (thr=0x7fff8c07b498) at /mariadb/10.3/storage/innobase/que/que0que.cc:1040 #17 0x0000555556214797 in que_run_threads_low (thr=0x7fff8c07b498) at /mariadb/10.3/storage/innobase/que/que0que.cc:1104 #18 0x0000555556214989 in que_run_threads (thr=0x7fff8c07b498) at /mariadb/10.3/storage/innobase/que/que0que.cc:1144 #19 0x000055555631b550 in trx_rollback_to_savepoint_low (trx=0x7ffff0b4b0f0, savept=0x7ffff0051530) at /mariadb/10.3/storage/innobase/trx/trx0roll.cc:141 #20 0x000055555631b905 in trx_rollback_to_savepoint (trx=0x7ffff0b4b0f0, savept=0x7ffff0051530) at /mariadb/10.3/storage/innobase/trx/trx0roll.cc:189 #21 0x000055555626b2e5 in row_mysql_handle_errors (new_err=0x7ffff0051528, trx=0x7ffff0b4b0f0, thr=0x7fff8c2efdb0, savept=0x7ffff0051530) at /mariadb/10.3/storage/innobase/row/row0mysql.cc:735 #22 0x000055555626d1a1 in row_insert_for_mysql (mysql_rec=0x7fff8c047828 "\b", prebuilt=0x7fff8c2edb18, ins_mode=ROW_INS_NORMAL) at /mariadb/10.3/storage/innobase/row/row0mysql.cc:1473 #23 0x0000555556116534 in ha_innobase::write_row (this=0x7fff8c04e6f0, record=0x7fff8c047828 "\b") at /mariadb/10.3/storage/innobase/handler/ha_innodb.cc:8086 An examination of the code suggests that btr_cur_pessimistic_delete() is not going to shrink the tree, although we should: btr_discard_page(cursor, mtr);   ret = TRUE; goto return_after_reservations; Ultimately, the mini-transaction will be committed in row_undo_ins_remove_clust_rec() , leaving the index tree in a corrupted state (with an internal page that does not have multiple child pages).

            The scenario occurs on the 3rd invocation of row_undo_ins_remove_clust_rec() in the MDEV-14126 test case. At this point, the clustered index tree consists of 3 pages:

            • page 3 (the root page) pointing to pages 69 and 70
            • page 69 containing 1 record
            • page 70 likewise containing the 1 record that is being rolled back

            After the rollback, we should end up with page 3 being the only index page, containing the entire contents of the table. But, btr_discard_page() fails to invoke btr_lift_page_up() even though after it has invoked btr_node_ptr_delete() on the parent page, there would be only 1 record left in the parent page.

            marko Marko Mäkelä added a comment - The scenario occurs on the 3rd invocation of row_undo_ins_remove_clust_rec() in the MDEV-14126 test case. At this point, the clustered index tree consists of 3 pages: page 3 (the root page) pointing to pages 69 and 70 page 69 containing 1 record page 70 likewise containing the 1 record that is being rolled back After the rollback, we should end up with page 3 being the only index page, containing the entire contents of the table. But, btr_discard_page() fails to invoke btr_lift_page_up() even though after it has invoked btr_node_ptr_delete() on the parent page, there would be only 1 record left in the parent page.

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.