MariaDB Server / MDEV-14126

Assertion `page_get_page_no(page) == index->page' failed in btr_pcur_store_position

Details

    Description

      10.3 b23a1096956c21df037bd851494f11509b5514dd

      mysqld: /home/travis/src/storage/innobase/btr/btr0pcur.cc:138: void btr_pcur_store_position(btr_pcur_t*, mtr_t*): Assertion `page_get_page_no(page) == index->page' failed.
      171025 4:07:11 [ERROR] mysqld got signal 6 ;
       
      # 2017-10-25T04:07:24 [24840] #3 0x00007fe5b8626ca2 in __GI___assert_fail (assertion=0x100e7b17828 "page_get_page_no(page) == index->page", file=0x100e7b17650 "/home/travis/src/storage/innobase/btr/btr0pcur.cc", line=138, function=0x100e7b17d00 <btr_pcur_store_position(btr_pcur_t*, mtr_t*)::__PRETTY_FUNCTION__> "void btr_pcur_store_position(btr_pcur_t*, mtr_t*)") at assert.c:101
      # 2017-10-25T04:07:24 [24840] #4 0x00000100e755d467 in btr_pcur_store_position (cursor=0x7fe54c5e4600, mtr=0x7fe5b51ca3a0) at /home/travis/src/storage/innobase/btr/btr0pcur.cc:138
      # 2017-10-25T04:07:24 [24840] #5 0x00000100e7478473 in row_search_mvcc (buf=0x7fe54c64e468 "\377\377\377", mode=PAGE_CUR_G, prebuilt=0x7fe54c5e4428, match_mode=0, direction=0) at /home/travis/src/storage/innobase/row/row0sel.cc:5611
      # 2017-10-25T04:07:24 [24840] #6 0x00000100e72df877 in ha_innobase::index_read (this=0x7fe54c3ec3e8, buf=0x7fe54c64e468 "\377\377\377", key_ptr=0x0, key_len=0, find_flag=HA_READ_AFTER_KEY) at /home/travis/src/storage/innobase/handler/ha_innodb.cc:9599
      # 2017-10-25T04:07:24 [24840] #7 0x00000100e72e0c88 in ha_innobase::index_first (this=0x7fe54c3ec3e8, buf=0x7fe54c64e468 "\377\377\377") at /home/travis/src/storage/innobase/handler/ha_innodb.cc:10037
      # 2017-10-25T04:07:24 [24840] #8 0x00000100e72e0f08 in ha_innobase::rnd_next (this=0x7fe54c3ec3e8, buf=0x7fe54c64e468 "\377\377\377") at /home/travis/src/storage/innobase/handler/ha_innodb.cc:10133
      # 2017-10-25T04:07:24 [24840] #9 0x00000100e6fffdec in handler::ha_rnd_next (this=0x7fe54c3ec3e8, buf=0x7fe54c64e468 "\377\377\377") at /home/travis/src/sql/handler.cc:2593
      # 2017-10-25T04:07:24 [24840] #10 0x00000100e7787a4d in ha_partition::rnd_next (this=0x7fe54c07a188, buf=0x7fe54c64e468 "\377\377\377") at /home/travis/src/sql/ha_partition.cc:4947
      # 2017-10-25T04:07:24 [24840] #11 0x00000100e6fffdec in handler::ha_rnd_next (this=0x7fe54c07a188, buf=0x7fe54c64e468 "\377\377\377") at /home/travis/src/sql/handler.cc:2593
      # 2017-10-25T04:07:24 [24840] #12 0x00000100e7172bda in rr_sequential (info=0x7fe54c02d2c0) at /home/travis/src/sql/records.cc:485
      # 2017-10-25T04:07:24 [24840] #13 0x00000100e6cab289 in READ_RECORD::read_record (this=0x7fe54c02d2c0) at /home/travis/src/sql/records.h:73
      # 2017-10-25T04:07:24 [24840] #14 0x00000100e6da7773 in join_init_read_record (tab=0x7fe54c02d1f8) at /home/travis/src/sql/sql_select.cc:19793
      # 2017-10-25T04:07:24 [24840] #15 0x00000100e6da553c in sub_select (join=0x7fe54c02c220, join_tab=0x7fe54c02d1f8, end_of_records=false) at /home/travis/src/sql/sql_select.cc:18868
      # 2017-10-25T04:07:24 [24840] #16 0x00000100e6da4b0a in do_select (join=0x7fe54c02c220, procedure=0x0) at /home/travis/src/sql/sql_select.cc:18411
      # 2017-10-25T04:07:24 [24840] #17 0x00000100e6d7d70e in JOIN::exec_inner (this=0x7fe54c02c220) at /home/travis/src/sql/sql_select.cc:3548
      # 2017-10-25T04:07:24 [24840] #18 0x00000100e6d7cbae in JOIN::exec (this=0x7fe54c02c220) at /home/travis/src/sql/sql_select.cc:3343
      # 2017-10-25T04:07:24 [24840] #19 0x00000100e6d7ddc2 in mysql_select (thd=0x7fe54c0151a0, tables=0x7fe54c02bb20, wild_num=0, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=551903562496, result=0x7fe54c02c200, unit=0x7fe54c018e60, select_lex=0x7fe54c019598) at /home/travis/src/sql/sql_select.cc:3743
      # 2017-10-25T04:07:24 [24840] #20 0x00000100e6d722da in handle_select (thd=0x7fe54c0151a0, lex=0x7fe54c018d98, result=0x7fe54c02c200, setup_tables_done_option=0) at /home/travis/src/sql/sql_select.cc:378
      # 2017-10-25T04:07:24 [24840] #21 0x00000100e6d3d844 in execute_sqlcom_select (thd=0x7fe54c0151a0, all_tables=0x7fe54c02bb20) at /home/travis/src/sql/sql_parse.cc:6467
      # 2017-10-25T04:07:24 [24840] #22 0x00000100e6d33da0 in mysql_execute_command (thd=0x7fe54c0151a0) at /home/travis/src/sql/sql_parse.cc:3731
      # 2017-10-25T04:07:24 [24840] #23 0x00000100e6d4112b in mysql_parse (thd=0x7fe54c0151a0, rawbuf=0x7fe54c02b798 "SELECT `col_set_ucs2` FROM `table100_innodb_key_pk_parts_2_int_autoinc` /* QNO 342 CON_ID 15 */", length=95, parser_state=0x7fe5b51cc620, is_com_multi=false, is_next_command=false) at /home/travis/src/sql/sql_parse.cc:7921
      # 2017-10-25T04:07:24 [24840] #24 0x00000100e6d2e903 in dispatch_command (command=COM_QUERY, thd=0x7fe54c0151a0, packet=0x7fe54c0604a1 "", packet_length=96, is_com_multi=false, is_next_command=false) at /home/travis/src/sql/sql_parse.cc:1819
      # 2017-10-25T04:07:24 [24840] #25 0x00000100e6d2d36d in do_command (thd=0x7fe54c0151a0) at /home/travis/src/sql/sql_parse.cc:1370
      # 2017-10-25T04:07:24 [24840] #26 0x00000100e6e84c52 in do_handle_one_connection (connect=0x100eac79450) at /home/travis/src/sql/sql_connect.cc:1418
      # 2017-10-25T04:07:24 [24840] #27 0x00000100e6e849df in handle_one_connection (arg=0x100eac79450) at /home/travis/src/sql/sql_connect.cc:1324
      # 2017-10-25T04:07:24 [24840] #28 0x00007fe5b91e8184 in start_thread (arg=0x7fe5b51cd700) at pthread_create.c:312
      # 2017-10-25T04:07:24 [24840] #29 0x00007fe5b86f4ffd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
      


      Later update:

      The test case is attached; steps to reproduce are described in the comment.

      Not reproducible on 10.2.
      No obvious problems on a non-debug build.

      Attachments

        1. dt.7z
          8.05 MB
          Alice Sherepa
        2. earth11k.jpg
          11 kB
          Elena Stepanova
        3. earth15kb.jpg
          16 kB
          Elena Stepanova
        4. earth1886kb.jpg
          1.84 MB
          Elena Stepanova
        5. earth215kb.jpg
          211 kB
          Elena Stepanova
        6. earth2kb.jpg
          2 kB
          Elena Stepanova
        7. earth579kb.jpg
          579 kB
          Elena Stepanova
        8. earth5kb.jpg
          5 kB
          Elena Stepanova
        9. earth81kb.jpg
          81 kB
          Elena Stepanova
        10. mdev14126.test
          857 kB
          Elena Stepanova

          Activity

            I cannot reproduce the crash on an ASAN-instrumented executable, when setting a conditional breakpoint in gdb, or when disabling purge by setting --innodb-force-recovery=2. This supports the hypothesis that a race condition between purge and DML is involved.

            marko Marko Mäkelä added the comment above.

            This is a very elusive bug. The test case involves ROW_FORMAT=COMPRESSED and a lot of BLOB page creation and deletion. The uncompressed page frames of the compressed pages are being evicted. The empty non-leaf page (page number 69 or 70, depending on the run) is reused several times for writing BLOBs. If I add too much instrumentation, the bug stops reproducing.

            marko Marko Mäkelä added the comment above.

            Finally, I nailed down the cause. It turns out that the predicate page_is_root() holds for all pages that have no siblings. InnoDB does sometimes leave behind a B-tree where a parent page has only 1 child page. With the following patch, the assertion will fail at the time when the corruption is about to occur:

            diff --git a/storage/innobase/btr/btr0cur.cc b/storage/innobase/btr/btr0cur.cc
            index c8951f7e01c..71e3d0f2494 100644
            --- a/storage/innobase/btr/btr0cur.cc
            +++ b/storage/innobase/btr/btr0cur.cc
            @@ -5460,6 +5460,7 @@ btr_cur_optimistic_delete_func(
             			  && page_get_n_recs(block->frame) == 1
             			  + (cursor->index->is_instant()
             			     && !rec_is_metadata(rec, cursor->index)))) {
            +		ut_ad(block->page.id.page_no() == cursor->index->page);
             		/* The whole index (and table) becomes logically empty.
             		Empty the whole page. That is, if we are deleting the
             		only user record, also delete the metadata record
            

            The predicate page_is_root() was introduced by me in MySQL 5.7.6 and merged to MariaDB 10.2.2.

            I think that independently of fixing this bug, we must find out which operation caused the leaf page to lose its siblings. Because such degenerate non-branching B-trees can theoretically exist in any InnoDB data files, we must remove the predicate page_is_root() and instead compare the page number to the index root page number.

            marko Marko Mäkelä added the comment above.
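
The difference between the two root checks discussed in the comment above can be illustrated with a minimal sketch. This is not InnoDB code: `Page`, `Index`, `NO_PAGE`, and both function names are hypothetical stand-ins, and the root page number 3 is an arbitrary value chosen for the illustration. It only assumes, as the comment states, that the sibling-based predicate holds for every page without siblings.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins for InnoDB structures (not the real API).
constexpr std::uint32_t NO_PAGE = 0xFFFFFFFF;  // marks "no sibling" in this sketch

struct Page {
    std::uint32_t page_no;
    std::uint32_t prev;  // left sibling page number, or NO_PAGE
    std::uint32_t next;  // right sibling page number, or NO_PAGE
};

struct Index {
    std::uint32_t root_page_no;
};

// Sibling-based heuristic: true for ANY page that has no siblings,
// which includes the sole child of a parent page.
bool looks_like_root_by_siblings(const Page& p) {
    return p.prev == NO_PAGE && p.next == NO_PAGE;
}

// Page-number comparison: true only for the actual root page of the index.
bool is_root_by_page_no(const Page& p, const Index& index) {
    return p.page_no == index.root_page_no;
}

// Usage: a sole child page (number 69 here) fools the heuristic,
// but not the page-number comparison.
void demo_sole_child() {
    const Index index{3};  // assumed root page number, for illustration only
    const Page root{3, NO_PAGE, NO_PAGE};
    const Page sole_child{69, NO_PAGE, NO_PAGE};

    assert(looks_like_root_by_siblings(root));
    assert(looks_like_root_by_siblings(sole_child));  // false positive
    assert(is_root_by_page_no(root, index));
    assert(!is_root_by_page_no(sole_child, index));
}
```

The page-number comparison needs access to the index metadata, whereas the sibling check only needs the page itself; that is presumably why the cheaper check looked attractive, but only the comparison stays correct in a degenerate tree.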

            I pushed a fix and additional assertions to bb-10.3-marko for testing. The assertions once caused a crash in the test innodb.innodb-change-buffer-recovery, because a change buffer merge was emptying a leaf page. So, there is hope for finding more InnoDB corruption bugs. If these assertions are failing too often, then we can for now take the HEAD^ version (and HEAD^^2 instead of HEAD^2 for the 10.2 version).

            Additional assertions will be needed for catching MDEV-19022, using the test case from this report. We should assert that we never leave a leaf page that only carries 1 record and has no sibling pages, except if the leaf page is the root page.

            marko Marko Mäkelä added the comment above.
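
The invariant proposed in the comment above (no non-root leaf page is left with 1 record and no siblings) could be expressed roughly as follows. This is a sketch under stated assumptions, not the assertion that was actually pushed: `LeafState`, `is_degenerate_leaf`, and `check_leaf_invariant` are hypothetical names, and the real code would read these values from the page and the index.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of a leaf page after an operation (not an InnoDB type).
struct LeafState {
    std::uint32_t page_no;
    std::uint32_t n_recs;  // user records left on the page
    bool has_siblings;     // true if a left or right sibling page exists
};

// True when the page is in the state the comment wants to rule out:
// a leaf that is not the root, has no siblings, and carries exactly 1 record.
bool is_degenerate_leaf(const LeafState& leaf, std::uint32_t root_page_no) {
    return leaf.page_no != root_page_no
        && !leaf.has_siblings
        && leaf.n_recs == 1;
}

// Debug-style check that a caller could run after modifying a leaf page.
void check_leaf_invariant(const LeafState& leaf, std::uint32_t root_page_no) {
    assert(!is_degenerate_leaf(leaf, root_page_no));
}
```

Keeping the predicate separate from the assertion makes it possible to log or count violations in non-debug builds instead of aborting.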

            The test innodb.innodb_bug14676111 demonstrates that we can indeed have internal B-tree pages that only have 1 child page pointer. In the following case the predicate page_is_root() would still work, because all pages except the root do have siblings:

            #current tree form
            #    (1, 5)
            #  (1, 4) (5)
            #(1, 3) (4) (5)
            

            However, in the following case, page_is_root() would wrongly hold on the useless second-level page:

            #deleting 1 record of 2 records don't cause merge artificially.
            #current tree form
            #      (1)
            #    (1)
            #  (1, 3) <- lift up this level next, when deleting node ptr
            #(1, 2) (3) <- merged next
            

            In MDEV-19022 we shall fix the logic so that the useless page will be discarded from the tree.

            marko Marko Mäkelä added the comment above.
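
The two tree diagrams in the comment above can be compared with a small model. The sketch assumes that each parenthesized group in a diagram is one page, so the first tree has 1, 2, and 3 pages per level and the second has 1, 1, 1, and 2. The function name is hypothetical. A level below the root that consists of a single page contains a page without siblings, which is exactly where a sibling-based root check would give a false positive.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// pages_per_level[0] is the root level; the last entry is the leaf level.
// Returns the levels below the root that consist of exactly one page,
// i.e. levels whose page has no siblings although it is not the root.
std::vector<std::size_t> sole_page_levels_below_root(
    const std::vector<std::size_t>& pages_per_level) {
    std::vector<std::size_t> result;
    for (std::size_t level = 1; level < pages_per_level.size(); ++level) {
        if (pages_per_level[level] == 1) {
            result.push_back(level);
        }
    }
    return result;
}

// Usage with the two diagrams, under the one-group-per-page reading.
void check_diagrams() {
    // First tree: (1, 5) / (1, 4) (5) / (1, 3) (4) (5)
    assert(sole_page_levels_below_root({1, 2, 3}).empty());

    // Second tree: (1) / (1) / (1, 3) / (1, 2) (3)
    const std::vector<std::size_t> expected{1, 2};
    assert(sole_page_levels_below_root({1, 1, 1, 2}) == expected);
}
```

Under this reading, the second tree has sibling-free pages on both the second and the third level; the comment singles out the second-level page because it is the one that carries only a single child pointer.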

            People

              marko Marko Mäkelä
              elenst Elena Stepanova