MariaDB Server / MDEV-14126

Assertion `page_get_page_no(page) == index->page' failed in btr_pcur_store_position

Details

    Description

      10.3 b23a1096956c21df037bd851494f11509b5514dd

      mysqld: /home/travis/src/storage/innobase/btr/btr0pcur.cc:138: void btr_pcur_store_position(btr_pcur_t*, mtr_t*): Assertion `page_get_page_no(page) == index->page' failed.
      171025 4:07:11 [ERROR] mysqld got signal 6 ;
       
      # 2017-10-25T04:07:24 [24840] #3 0x00007fe5b8626ca2 in __GI___assert_fail (assertion=0x100e7b17828 "page_get_page_no(page) == index->page", file=0x100e7b17650 "/home/travis/src/storage/innobase/btr/btr0pcur.cc", line=138, function=0x100e7b17d00 <btr_pcur_store_position(btr_pcur_t*, mtr_t*)::__PRETTY_FUNCTION__> "void btr_pcur_store_position(btr_pcur_t*, mtr_t*)") at assert.c:101
      # 2017-10-25T04:07:24 [24840] #4 0x00000100e755d467 in btr_pcur_store_position (cursor=0x7fe54c5e4600, mtr=0x7fe5b51ca3a0) at /home/travis/src/storage/innobase/btr/btr0pcur.cc:138
      # 2017-10-25T04:07:24 [24840] #5 0x00000100e7478473 in row_search_mvcc (buf=0x7fe54c64e468 "\377\377\377", mode=PAGE_CUR_G, prebuilt=0x7fe54c5e4428, match_mode=0, direction=0) at /home/travis/src/storage/innobase/row/row0sel.cc:5611
      # 2017-10-25T04:07:24 [24840] #6 0x00000100e72df877 in ha_innobase::index_read (this=0x7fe54c3ec3e8, buf=0x7fe54c64e468 "\377\377\377", key_ptr=0x0, key_len=0, find_flag=HA_READ_AFTER_KEY) at /home/travis/src/storage/innobase/handler/ha_innodb.cc:9599
      # 2017-10-25T04:07:24 [24840] #7 0x00000100e72e0c88 in ha_innobase::index_first (this=0x7fe54c3ec3e8, buf=0x7fe54c64e468 "\377\377\377") at /home/travis/src/storage/innobase/handler/ha_innodb.cc:10037
      # 2017-10-25T04:07:24 [24840] #8 0x00000100e72e0f08 in ha_innobase::rnd_next (this=0x7fe54c3ec3e8, buf=0x7fe54c64e468 "\377\377\377") at /home/travis/src/storage/innobase/handler/ha_innodb.cc:10133
      # 2017-10-25T04:07:24 [24840] #9 0x00000100e6fffdec in handler::ha_rnd_next (this=0x7fe54c3ec3e8, buf=0x7fe54c64e468 "\377\377\377") at /home/travis/src/sql/handler.cc:2593
      # 2017-10-25T04:07:24 [24840] #10 0x00000100e7787a4d in ha_partition::rnd_next (this=0x7fe54c07a188, buf=0x7fe54c64e468 "\377\377\377") at /home/travis/src/sql/ha_partition.cc:4947
      # 2017-10-25T04:07:24 [24840] #11 0x00000100e6fffdec in handler::ha_rnd_next (this=0x7fe54c07a188, buf=0x7fe54c64e468 "\377\377\377") at /home/travis/src/sql/handler.cc:2593
      # 2017-10-25T04:07:24 [24840] #12 0x00000100e7172bda in rr_sequential (info=0x7fe54c02d2c0) at /home/travis/src/sql/records.cc:485
      # 2017-10-25T04:07:24 [24840] #13 0x00000100e6cab289 in READ_RECORD::read_record (this=0x7fe54c02d2c0) at /home/travis/src/sql/records.h:73
      # 2017-10-25T04:07:24 [24840] #14 0x00000100e6da7773 in join_init_read_record (tab=0x7fe54c02d1f8) at /home/travis/src/sql/sql_select.cc:19793
      # 2017-10-25T04:07:24 [24840] #15 0x00000100e6da553c in sub_select (join=0x7fe54c02c220, join_tab=0x7fe54c02d1f8, end_of_records=false) at /home/travis/src/sql/sql_select.cc:18868
      # 2017-10-25T04:07:24 [24840] #16 0x00000100e6da4b0a in do_select (join=0x7fe54c02c220, procedure=0x0) at /home/travis/src/sql/sql_select.cc:18411
      # 2017-10-25T04:07:24 [24840] #17 0x00000100e6d7d70e in JOIN::exec_inner (this=0x7fe54c02c220) at /home/travis/src/sql/sql_select.cc:3548
      # 2017-10-25T04:07:24 [24840] #18 0x00000100e6d7cbae in JOIN::exec (this=0x7fe54c02c220) at /home/travis/src/sql/sql_select.cc:3343
      # 2017-10-25T04:07:24 [24840] #19 0x00000100e6d7ddc2 in mysql_select (thd=0x7fe54c0151a0, tables=0x7fe54c02bb20, wild_num=0, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=551903562496, result=0x7fe54c02c200, unit=0x7fe54c018e60, select_lex=0x7fe54c019598) at /home/travis/src/sql/sql_select.cc:3743
      # 2017-10-25T04:07:24 [24840] #20 0x00000100e6d722da in handle_select (thd=0x7fe54c0151a0, lex=0x7fe54c018d98, result=0x7fe54c02c200, setup_tables_done_option=0) at /home/travis/src/sql/sql_select.cc:378
      # 2017-10-25T04:07:24 [24840] #21 0x00000100e6d3d844 in execute_sqlcom_select (thd=0x7fe54c0151a0, all_tables=0x7fe54c02bb20) at /home/travis/src/sql/sql_parse.cc:6467
      # 2017-10-25T04:07:24 [24840] #22 0x00000100e6d33da0 in mysql_execute_command (thd=0x7fe54c0151a0) at /home/travis/src/sql/sql_parse.cc:3731
      # 2017-10-25T04:07:24 [24840] #23 0x00000100e6d4112b in mysql_parse (thd=0x7fe54c0151a0, rawbuf=0x7fe54c02b798 "SELECT `col_set_ucs2` FROM `table100_innodb_key_pk_parts_2_int_autoinc` /* QNO 342 CON_ID 15 */", length=95, parser_state=0x7fe5b51cc620, is_com_multi=false, is_next_command=false) at /home/travis/src/sql/sql_parse.cc:7921
      # 2017-10-25T04:07:24 [24840] #24 0x00000100e6d2e903 in dispatch_command (command=COM_QUERY, thd=0x7fe54c0151a0, packet=0x7fe54c0604a1 "", packet_length=96, is_com_multi=false, is_next_command=false) at /home/travis/src/sql/sql_parse.cc:1819
      # 2017-10-25T04:07:24 [24840] #25 0x00000100e6d2d36d in do_command (thd=0x7fe54c0151a0) at /home/travis/src/sql/sql_parse.cc:1370
      # 2017-10-25T04:07:24 [24840] #26 0x00000100e6e84c52 in do_handle_one_connection (connect=0x100eac79450) at /home/travis/src/sql/sql_connect.cc:1418
      # 2017-10-25T04:07:24 [24840] #27 0x00000100e6e849df in handle_one_connection (arg=0x100eac79450) at /home/travis/src/sql/sql_connect.cc:1324
      # 2017-10-25T04:07:24 [24840] #28 0x00007fe5b91e8184 in start_thread (arg=0x7fe5b51cd700) at pthread_create.c:312
      # 2017-10-25T04:07:24 [24840] #29 0x00007fe5b86f4ffd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
      


      Later update:

      The test case is attached; the steps to reproduce are described in the comment.

      Not reproducible on 10.2.
      No obvious problems on a non-debug build.

      Attachments

        1. dt.7z (8.05 MB)
        2. earth11k.jpg (11 kB)
        3. earth15kb.jpg (16 kB)
        4. earth1886kb.jpg (1.84 MB)
        5. earth215kb.jpg (211 kB)
        6. earth2kb.jpg (2 kB)
        7. earth579kb.jpg (579 kB)
        8. earth5kb.jpg (5 kB)
        9. earth81kb.jpg (81 kB)
        10. mdev14126.test (857 kB)

          Activity

            I cannot reproduce the crash on an ASAN-instrumented executable, when setting a conditional breakpoint in gdb, or when disabling purge by setting --innodb-force-recovery=2. This supports the hypothesis that a race condition between purge and DML is involved.

            Comment by Marko Mäkelä (marko).

            This is a very elusive bug. The test case involves ROW_FORMAT=COMPRESSED and a lot of BLOB page creation and deletion. The uncompressed page frames of the compressed pages are being evicted. The empty non-leaf page (page number 69 or 70, depending on the run) is reused several times for writing BLOBs. If I add too much instrumentation, the bug stops reproducing.

            Comment by Marko Mäkelä (marko).

            Finally, I nailed down the cause. It turns out that the predicate page_is_root() holds for every page that has no siblings, not only for the index root page. InnoDB sometimes leaves behind a B-tree in which a parent page has only 1 child page; such a child page has no siblings, so page_is_root() wrongly holds for it. With the following patch, the assertion will fail at the time when the corruption is about to occur:

            diff --git a/storage/innobase/btr/btr0cur.cc b/storage/innobase/btr/btr0cur.cc
            index c8951f7e01c..71e3d0f2494 100644
            --- a/storage/innobase/btr/btr0cur.cc
            +++ b/storage/innobase/btr/btr0cur.cc
            @@ -5460,6 +5460,7 @@ btr_cur_optimistic_delete_func(
             			  && page_get_n_recs(block->frame) == 1
             			  + (cursor->index->is_instant()
             			     && !rec_is_metadata(rec, cursor->index)))) {
            +		ut_ad(block->page.id.page_no() == cursor->index->page);
             		/* The whole index (and table) becomes logically empty.
             		Empty the whole page. That is, if we are deleting the
             		only user record, also delete the metadata record
            

            The predicate page_is_root() was introduced by me in MySQL 5.7.6 and merged to MariaDB 10.2.2.

            I think that independently of fixing this bug, we must find out which operation caused the leaf page to lose its siblings. Because such degenerate non-branching B-trees can theoretically exist in any InnoDB data files, we must remove the predicate page_is_root() and instead compare the page number to the index root page number.

            Comment by Marko Mäkelä (marko).
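The difference between the two checks discussed in the comment above can be illustrated with a minimal sketch. It is a simplified model and not the InnoDB code: the types `Page` and `Index`, the constant `NO_PAGE`, and the functions `has_no_siblings()` and `is_index_root()` are hypothetical names invented for this illustration. The first function stands in for a sibling-based test of the kind the comment attributes to page_is_root(); the second compares the page number to the index root page number, as the comment proposes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; not the InnoDB definitions.
constexpr std::uint32_t NO_PAGE = 0xFFFFFFFF;  // marks "no sibling on this side"

struct Page {
    std::uint32_t page_no;  // number of this page in the tablespace
    std::uint32_t prev;     // left sibling on the same level, or NO_PAGE
    std::uint32_t next;     // right sibling on the same level, or NO_PAGE
};

struct Index {
    std::uint32_t root_page_no;  // page number of the index root
};

// Sibling-based test: true for ANY page without siblings,
// including the only child of a parent page.
bool has_no_siblings(const Page& page) {
    return page.prev == NO_PAGE && page.next == NO_PAGE;
}

// Page-number test: true only for the actual root of the index.
bool is_index_root(const Page& page, const Index& index) {
    return page.page_no == index.root_page_no;
}
```

With an assumed root page number of 3 and a sibling-less page 69 (the page number mentioned earlier in this report), `has_no_siblings()` holds for both pages, while `is_index_root()` holds only for page 3. The root page number 3 is an arbitrary choice for the example.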

            I pushed a fix and additional assertions to bb-10.3-marko for testing. The assertions once caused a crash in the test innodb.innodb-change-buffer-recovery, because a change buffer merge was emptying a leaf page. So, there is hope for finding more InnoDB corruption bugs. If these assertions are failing too often, then we can for now take the HEAD^ version (and HEAD^^2 instead of HEAD^2 for the 10.2 version).

            Additional assertions will be needed for catching MDEV-19022, using the test case from this report. We should assert that we never leave a leaf page that only carries 1 record and has no sibling pages, except if the leaf page is the root page.

            Comment by Marko Mäkelä (marko).
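The invariant proposed in the comment above (no leaf page may be left with only 1 record and no sibling pages, unless it is the root page) could be expressed roughly as follows. This is only a sketch under stated assumptions: `LeafPage`, `NO_SIBLING`, and `leaf_state_allowed()` are hypothetical names, and the actual assertions pushed to bb-10.3-marko are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model for illustration only; not the InnoDB definitions.
constexpr std::uint32_t NO_SIBLING = 0xFFFFFFFF;

struct LeafPage {
    std::uint32_t page_no;  // number of this leaf page
    std::uint32_t prev;     // left sibling, or NO_SIBLING
    std::uint32_t next;     // right sibling, or NO_SIBLING
    std::uint32_t n_recs;   // number of user records on the page
};

// Returns true if the leaf page is in a state that the proposed
// invariant permits: either it is the root page, or it is not a
// sibling-less page that carries exactly one record.
bool leaf_state_allowed(const LeafPage& leaf, std::uint32_t root_page_no) {
    const bool lone = leaf.prev == NO_SIBLING && leaf.next == NO_SIBLING;
    const bool single_record = leaf.n_recs == 1;
    return leaf.page_no == root_page_no || !(lone && single_record);
}
```

A debug assertion built on such a check would be evaluated after an operation that removes records or pages, so that the offending operation is caught at the point where it leaves the tree in the degenerate state.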

            The test innodb.innodb_bug14676111 demonstrates that we can indeed have internal B-tree pages that only have 1 child page pointer. In the following case the predicate page_is_root() would still work, because all pages except the root do have siblings:

            #current tree form
            #    (1, 5)
            #  (1, 4) (5)
            #(1, 3) (4) (5)
            

            However, in the following case, page_is_root() would wrongly hold on the useless second-level page:

            #deleting 1 record of 2 records don't cause merge artificially.
            #current tree form
            #      (1)
            #    (1)
            #  (1, 3) <- lift up this level next, when deleting node ptr
            #(1, 2) (3) <- merged next
            

            In MDEV-19022 we shall fix the logic so that the useless page will be discarded from the tree.

            Comment by Marko Mäkelä (marko).

            People

              marko Marko Mäkelä
              elenst Elena Stepanova