Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13542

Crashing on a corrupted page is unhelpful

Details

    Description

      Since the very first version, InnoDB was aborting the whole MySQL or MariaDB server when trying to access a corrupted page.

      Earlier, small steps have been taken to improve MariaDB stability, such as MDEV-12253.

      We must enable error handling for corrupted pages instead of allowing the server to crash. This is not only about detecting a checksum failure when reading a corrupted page (like suggested in MySQL Bug#10132 Crashing the server on corrupt InnoDB page is unhelpful, filed on April 25, 2005), but also about performing some operations on corrupted pages whose checksum appears valid.

      In some background operations, such as the purge of committed transaction history, errors may be reported to the server error log only, or silently ignored. While executing SQL, errors should be reported back to the client, and the transaction could be rolled back.

      Attachments

        Issue Links

          Activity

            MDEV-23344 reports a crash when trying to access a page that does not exist in the tablespace.

            marko Marko Mäkelä added a comment - MDEV-23344 reports a crash when trying to access a page that does not exist in the tablespace.

            When it comes to reading corrupted pages into the buffer pool, we seem to already do the right thing with regard to buf_page_is_corrupted() and buf_page_t::read_complete(). One relevant change was MDEV-19541.

            It looks like the main part of the fix would be to check that error results from buf_page_get_gen() will be propagated from the caller.

            I would exclude that in change buffer merge from this. We have seen crashes in btr_page_reorganize() that were invoked from ibuf_insert_to_index_page_low() during ibuf_merge_or_delete_for_page(). Due to the design of the change buffer, such crashes can also occur during CHECK TABLE, where some extra attention has been spent to avoid crashes due to corruption. We already disabled the change buffer by default (MDEV-27734) and deprecated it (MDEV-27735) due to corruption that we are unable to reproduce in our internal testing.

            marko Marko Mäkelä added a comment - When it comes to reading corrupted pages into the buffer pool, we seem to already do the right thing with regard to buf_page_is_corrupted() and buf_page_t::read_complete() . One relevant change was MDEV-19541 . It looks like the main part of the fix would be to check that error results from buf_page_get_gen() will be propagated from the caller. I would exclude that in change buffer merge from this. We have seen crashes in btr_page_reorganize() that were invoked from ibuf_insert_to_index_page_low() during ibuf_merge_or_delete_for_page() . Due to the design of the change buffer, such crashes can also occur during CHECK TABLE , where some extra attention has been spent to avoid crashes due to corruption. We already disabled the change buffer by default ( MDEV-27734 ) and deprecated it ( MDEV-27735 ) due to corruption that we are unable to reproduce in our internal testing.

            It looks like we already have a large part of these robustness fixes present in 10.5 (added by MDEV-15528 and related changes) or 10.6 (added in MDEV-25506 and changes related to atomic DDL). I already went through all direct callers of buf_page_get_gen(), and there is not that much to be fixed. I still have to check all callers of some 12 functions and then implement some fault injection in buf_page_get_gen() so that we can run some stress tests.

            I would declare that those linked open bugs that are related to some other forms of corruption (say, a garbage BLOB pointer or DB_ROLL_PTR, or an incorrect key specified for decrypting an encrypted page) can be out of the scope of this fix. This fix should mainly be about handling failures to read a page.

            marko Marko Mäkelä added a comment - It looks like we already have a large part of these robustness fixes present in 10.5 (added by MDEV-15528 and related changes) or 10.6 (added in MDEV-25506 and changes related to atomic DDL). I already went through all direct callers of buf_page_get_gen() , and there is not that much to be fixed. I still have to check all callers of some 12 functions and then implement some fault injection in buf_page_get_gen() so that we can run some stress tests. I would declare that those linked open bugs that are related to some other forms of corruption (say, a garbage BLOB pointer or DB_ROLL_PTR , or an incorrect key specified for decrypting an encrypted page) can be out of the scope of this fix. This fix should mainly be about handling failures to read a page.

            The RQG testing on the development tree
                 origin/bb-10.6-MDEV-13542 dca6937a30eeee56f46fc47ac85656243227e4d2 2022-05-16T16:31:15+03:00
            showed frequent
            - SEGV's with a backtrace like in the description on top of MDEV-25257
            - cases where the server error log contains the message '[ERROR] mysqld got signal 11' but the server process
               did not disappear even after more than 300s waiting --> no core file nor rr trace
            After applying Thiru's initial patch for MDEV-25257
                diff --git a/storage/innobase/dict/dict0load.cc b/storage/innobase/dict/dict0load.cc
            index 20916c2c96e..2c3d48b9573 100644
            --- a/storage/innobase/dict/dict0load.cc
            +++ b/storage/innobase/dict/dict0load.cc
            @@ -1383,6 +1383,7 @@ dict_load_columns(
                                    the flag is set before the table is created. */
                                    if (table->fts == NULL) {
                                            table->fts = fts_create(table);
            +                               table->fts->cache = fts_cache_create(table);
                                    }
                                    ut_a(table->fts->doc_col == ULINT_UNDEFINED);
             to the tree bb-10.6-MDEV-13542 both bad effects were no more seen again in testing.
             
            Now some new dominant problem showed up.
            [rr 3532891 209362]mysqld: /data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc:2682: buf_block_t* buf_page_get_low(page_id_t, ulint, ulint, buf_block_t*, ulint, mtr_t*, dberr_t*, bool): Assertion `mode == 11 || mode == 12 || block->zip_size() == zip_size' failed.
            (rr) bt
            #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
            #1  0x00007fcf37482859 in __GI_abort () at abort.c:79
            #2  0x00007fcf37482729 in __assert_fail_base (fmt=0x7fcf37618588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55996a841da0 "mode == 11 || mode == 12 || block->zip_size() == zip_size", 
                file=0x55996a83d680 "/data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc", line=2682, function=<optimized out>) at assert.c:92
            #3  0x00007fcf37493f36 in __GI___assert_fail (assertion=0x55996a841da0 "mode == 11 || mode == 12 || block->zip_size() == zip_size", file=0x55996a83d680 "/data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc", 
                line=2682, function=0x55996a841880 "buf_block_t* buf_page_get_low(page_id_t, ulint, ulint, buf_block_t*, ulint, mtr_t*, dberr_t*, bool)") at assert.c:101
            #4  0x00005599696fa35f in buf_page_get_low (page_id=..., zip_size=0, rw_latch=8, guess=0x0, mode=16, mtr=0x7fcf1a61aec0, err=0x7fcf1a61a770, allow_ibuf_merge=false)
                at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc:2682
            #5  0x00005599696fbaad in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=8, guess=0x0, mode=16, mtr=0x7fcf1a61aec0, err=0x7fcf1a61a770, allow_ibuf_merge=false)
                at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc:2984
            #6  0x00005599696ae711 in btr_cur_pess_upd_restore_supremum (block=0x7fcf2aea57e0, rec=0x7fcf2b49e781 "", mtr=0x7fcf1a61aec0) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/btr/btr0cur.cc:4782
            #7  0x00005599696b1f88 in btr_cur_pessimistic_update (flags=10, cursor=0x61100005f180, offsets=0x7fcf1a61aac0, offsets_heap=0x7fcf1a61abc0, entry_heap=0x619000bf9080, big_rec=0x7fcf1a61aaa0, update=0x62100026f3f0, cmpl_info=0, 
                thr=0x62100026f750, trx_id=1719, mtr=0x7fcf1a61aec0) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/btr/btr0cur.cc:5243
            #8  0x000055996956ff1b in row_upd_clust_rec (flags=0, node=0x62100026f2c0, index=0x616002f91a08, offsets=0x619000bf9850, offsets_heap=0x7fcf1a61abc0, thr=0x62100026f750, mtr=0x7fcf1a61aec0)
                at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2456
            #9  0x0000559969571812 in row_upd_clust_step (node=0x62100026f2c0, thr=0x62100026f750) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2729
            #10 0x0000559969571f69 in row_upd (node=0x62100026f2c0, thr=0x62100026f750) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2792
            #11 0x0000559969572a6b in row_upd_step (thr=0x62100026f750) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2934
            #12 0x00005599694d46e5 in row_update_for_mysql (prebuilt=0x62100026e988) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0mysql.cc:1691
            #13 0x000055996914b046 in ha_innobase::update_row (this=0x61d0007f94b8, old_row=0x61a0003c80f8 <incomplete sequence \375>, new_row=0x61a0003c7eb8 "\265")
                at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/handler/ha_innodb.cc:8668
            #14 0x00005599688687dd in handler::ha_update_row (this=0x61d0007f94b8, old_data=0x61a0003c80f8 <incomplete sequence \375>, new_data=0x61a0003c7eb8 "\265") at /data/Server/bb-10.6-MDEV-13542G/sql/handler.cc:7602
            #15 0x00005599683484c9 in mysql_update (thd=0x62b00009a218, table_list=0x62b0000a13f8, fields=..., values=..., conds=0x0, order_num=0, order=0x0, limit=18446744073709551602, ignore=false, found_return=0x7fcf1a61c2f0, 
                updated_return=0x7fcf1a61c310) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_update.cc:1087
            #16 0x000055996806794f in mysql_execute_command (thd=0x62b00009a218, is_called_from_prepared_stmt=false) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:4423
            #17 0x000055996807fd40 in mysql_parse (thd=0x62b00009a218, rawbuf=0x62b0000a1238 "UPDATE t2 SET col_text = REPEAT(SUBSTR(CAST( 2361 AS CHAR),1,1), @fill_amount)  /* E_R Thread2 QNO 1139 CON_ID 16 */", length=116, 
                parser_state=0x7fcf1a61cb20) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:8045
            #18 0x0000559968058133 in dispatch_command (command=COM_QUERY, thd=0x62b00009a218, packet=0x629000c53219 " UPDATE t2 SET col_text = REPEAT(SUBSTR(CAST( 2361 AS CHAR),1,1), @fill_amount)  /* E_R Thread2 QNO 1139 CON_ID 16 */ ", 
                packet_length=118, blocking=true) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:1912
            #19 0x0000559968055369 in do_command (thd=0x62b00009a218, blocking=true) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:1409
            #20 0x0000559968458c53 in do_handle_one_connection (connect=0x608000002eb8, put_in_cache=true) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_connect.cc:1418
            #21 0x00005599684584df in handle_one_connection (arg=0x608000002838) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_connect.cc:1312
            #22 0x00007fcf379ac609 in start_thread (arg=<optimized out>) at pthread_create.c:477
            #23 0x00007fcf3757f293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
            (rr)
            pluto:/data/results/1652901598/TBR-1503/dev/shm/rqg/1652901598/57/1/rr
             
            This problem disappeared after applying Thiru's cursor_restore_supremum.patch.
            

            mleich Matthias Leich added a comment - The RQG testing on the development tree origin/bb-10.6-MDEV-13542 dca6937a30eeee56f46fc47ac85656243227e4d2 2022-05-16T16:31:15+03:00 showed frequent - SEGV's with a backtrace like in the description on top of MDEV-25257 - cases where the server error log contains the message '[ERROR] mysqld got signal 11' but the server process did not disappear even after more than 300s waiting --> no core file nor rr trace After applying Thiru's initial patch for MDEV-25257 diff --git a/storage/innobase/dict/dict0load.cc b/storage/innobase/dict/dict0load.cc index 20916c2c96e..2c3d48b9573 100644 --- a/storage/innobase/dict/dict0load.cc +++ b/storage/innobase/dict/dict0load.cc @@ -1383,6 +1383,7 @@ dict_load_columns( the flag is set before the table is created. */ if (table->fts == NULL) { table->fts = fts_create(table); + table->fts->cache = fts_cache_create(table); } ut_a(table->fts->doc_col == ULINT_UNDEFINED); to the tree bb-10.6-MDEV-13542 both bad effects were no more seen again in testing.   Now some new dominant problem showed up. [rr 3532891 209362]mysqld: /data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc:2682: buf_block_t* buf_page_get_low(page_id_t, ulint, ulint, buf_block_t*, ulint, mtr_t*, dberr_t*, bool): Assertion `mode == 11 || mode == 12 || block->zip_size() == zip_size' failed. (rr) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x00007fcf37482859 in __GI_abort () at abort.c:79 #2 0x00007fcf37482729 in __assert_fail_base (fmt=0x7fcf37618588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55996a841da0 "mode == 11 || mode == 12 || block->zip_size() == zip_size", file=0x55996a83d680 "/data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc", line=2682, function=<optimized out>) at assert.c:92 #3 0x00007fcf37493f36 in __GI___assert_fail (assertion=0x55996a841da0 "mode == 11 || mode == 12 || block->zip_size() == zip_size", file=0x55996a83d680 "/data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc", line=2682, function=0x55996a841880 "buf_block_t* buf_page_get_low(page_id_t, ulint, ulint, buf_block_t*, ulint, mtr_t*, dberr_t*, bool)") at assert.c:101 #4 0x00005599696fa35f in buf_page_get_low (page_id=..., zip_size=0, rw_latch=8, guess=0x0, mode=16, mtr=0x7fcf1a61aec0, err=0x7fcf1a61a770, allow_ibuf_merge=false) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc:2682 #5 0x00005599696fbaad in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=8, guess=0x0, mode=16, mtr=0x7fcf1a61aec0, err=0x7fcf1a61a770, allow_ibuf_merge=false) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc:2984 #6 0x00005599696ae711 in btr_cur_pess_upd_restore_supremum (block=0x7fcf2aea57e0, rec=0x7fcf2b49e781 "", mtr=0x7fcf1a61aec0) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/btr/btr0cur.cc:4782 #7 0x00005599696b1f88 in btr_cur_pessimistic_update (flags=10, cursor=0x61100005f180, offsets=0x7fcf1a61aac0, offsets_heap=0x7fcf1a61abc0, entry_heap=0x619000bf9080, big_rec=0x7fcf1a61aaa0, update=0x62100026f3f0, cmpl_info=0, thr=0x62100026f750, trx_id=1719, mtr=0x7fcf1a61aec0) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/btr/btr0cur.cc:5243 #8 0x000055996956ff1b in row_upd_clust_rec (flags=0, node=0x62100026f2c0, index=0x616002f91a08, offsets=0x619000bf9850, offsets_heap=0x7fcf1a61abc0, thr=0x62100026f750, mtr=0x7fcf1a61aec0) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2456 #9 0x0000559969571812 in row_upd_clust_step (node=0x62100026f2c0, thr=0x62100026f750) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2729 #10 0x0000559969571f69 in row_upd (node=0x62100026f2c0, thr=0x62100026f750) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2792 #11 0x0000559969572a6b in row_upd_step (thr=0x62100026f750) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2934 #12 0x00005599694d46e5 in row_update_for_mysql (prebuilt=0x62100026e988) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0mysql.cc:1691 #13 0x000055996914b046 in ha_innobase::update_row (this=0x61d0007f94b8, old_row=0x61a0003c80f8 <incomplete sequence \375>, new_row=0x61a0003c7eb8 "\265") at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/handler/ha_innodb.cc:8668 #14 0x00005599688687dd in handler::ha_update_row (this=0x61d0007f94b8, old_data=0x61a0003c80f8 <incomplete sequence \375>, new_data=0x61a0003c7eb8 "\265") at /data/Server/bb-10.6-MDEV-13542G/sql/handler.cc:7602 #15 0x00005599683484c9 in mysql_update (thd=0x62b00009a218, table_list=0x62b0000a13f8, fields=..., values=..., conds=0x0, order_num=0, order=0x0, limit=18446744073709551602, ignore=false, found_return=0x7fcf1a61c2f0, updated_return=0x7fcf1a61c310) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_update.cc:1087 #16 0x000055996806794f in mysql_execute_command (thd=0x62b00009a218, is_called_from_prepared_stmt=false) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:4423 #17 0x000055996807fd40 in mysql_parse (thd=0x62b00009a218, rawbuf=0x62b0000a1238 "UPDATE t2 SET col_text = REPEAT(SUBSTR(CAST( 2361 AS CHAR),1,1), @fill_amount) /* E_R Thread2 QNO 1139 CON_ID 16 */", length=116, parser_state=0x7fcf1a61cb20) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:8045 #18 0x0000559968058133 in dispatch_command (command=COM_QUERY, thd=0x62b00009a218, packet=0x629000c53219 " UPDATE t2 SET col_text = REPEAT(SUBSTR(CAST( 2361 AS CHAR),1,1), @fill_amount) /* E_R Thread2 QNO 1139 CON_ID 16 */ ", packet_length=118, blocking=true) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:1912 #19 0x0000559968055369 in do_command (thd=0x62b00009a218, blocking=true) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:1409 #20 0x0000559968458c53 in do_handle_one_connection (connect=0x608000002eb8, put_in_cache=true) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_connect.cc:1418 #21 0x00005599684584df in handle_one_connection (arg=0x608000002838) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_connect.cc:1312 #22 0x00007fcf379ac609 in start_thread (arg=<optimized out>) at pthread_create.c:477 #23 0x00007fcf3757f293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 (rr) pluto:/data/results/1652901598/TBR-1503/dev/shm/rqg/1652901598/57/1/rr   This problem disappeared after applying Thiru's cursor_restore_supremum.patch.

            thiru, thank you for cursor_restore_supremum.patch. I hope that my simpler solution works as well.

            Even today, the fault injection exposed a couple new crashes, which I fixed:

            ./mtr --parallel=auto --big-test --force --max-test-fail=0 --mysqld=--debug=d,intermittent_read_failure
            

            My main effort today was to prevent crashes on change buffer merge failure (motivated by MDEV-28349) as well as on page split or merge. Such corruption should not be a direct consequence of a page checksum mismatch, but something else (say, MDEV-28312, a bug in Galera snapshot transfer).

            marko Marko Mäkelä added a comment - thiru , thank you for cursor_restore_supremum.patch . I hope that my simpler solution works as well. Even today, the fault injection exposed a couple new crashes, which I fixed: ./mtr --parallel=auto --big-test --force --max-test-fail=0 --mysqld=--debug=d,intermittent_read_failure My main effort today was to prevent crashes on change buffer merge failure (motivated by MDEV-28349 ) as well as on page split or merge. Such corruption should not be a direct consequence of a page checksum mismatch, but something else (say, MDEV-28312 , a bug in Galera snapshot transfer).

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              7 Vote for this issue
              Watchers:
              25 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.