Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-30009

InnoDB shutdown hangs when the change buffer is corrupted

Details

    Description

      It looks like MDEV-20934 was reintroduced in MariaDB 10.5, possibly due to MDEV-19514 or some changes.

      It turns out that as part of fixing MDEV-23399 (MariaDB 10.5.7), I actually disarmed the regression test innodb.ibuf_not_empty by making the slow shutdown step run with innodb_force_recovery=2, which will prevent a change buffer merge. The following effectively reverts that change:

      diff --git a/mysql-test/suite/innodb/t/ibuf_not_empty.test b/mysql-test/suite/innodb/t/ibuf_not_empty.test
      index 207427d9ab8..f5203a337bf 100644
      --- a/mysql-test/suite/innodb/t/ibuf_not_empty.test
      +++ b/mysql-test/suite/innodb/t/ibuf_not_empty.test
      @@ -106,7 +106,7 @@ truncate(FILE, $ps * $pages);
       close(FILE) || die "Unable to close $file";
       EOF
       
      ---let $restart_parameters=
      +--let $restart_parameters=--innodb-force-recovery=0
       --source include/start_mysqld.inc
       SET GLOBAL innodb_fast_shutdown=0;
       --source include/restart_mysqld.inc
      

      With that, we will observe a hang. The following includes the progress reporting of MDEV-29984:

      10.6 ae6ebafd819d48c965d2615fc78f1f950e0fbf40

      ...
      2022-11-14 16:40:19 0 [ERROR] InnoDB: Trying to read 4096 bytes at 69632 outside the bounds of the file: ./test/t1.ibd
      2022-11-14 16:40:19 0 [Note] /dev/shm/10.6m/sql/mariadbd (initiated by: root[root] @ localhost []): Normal shutdown
      2022-11-14 16:40:19 0 [Note] InnoDB: FTS optimize thread exiting.
      2022-11-14 16:40:19 0 [Note] InnoDB: to purge 9 transactions
      2022-11-14 16:40:19 0 [Note] InnoDB: Starting shutdown...
      2022-11-14 16:40:19 0 [Note] InnoDB: Dumping buffer pool(s) to /dev/shm/10.6m/mysql-test/var/mysqld.1/data/ib_buffer_pool
      2022-11-14 16:40:19 0 [Note] InnoDB: Buffer pool(s) dump completed at 221114 16:40:19
      2022-11-14 16:40:34 0 [Note] Completing change buffer merge; 409 page reads initiated; 1 change buffer pages remain
      2022-11-14 16:40:49 0 [Note] Completing change buffer merge; 409 page reads initiated; 1 change buffer pages remain
      2022-11-14 16:41:04 0 [Note] Completing change buffer merge; 409 page reads initiated; 1 change buffer pages remain
      2022-11-14 16:41:19 0 [Note] Completing change buffer merge; 409 page reads initiated; 1 change buffer pages remain
      221114 16:41:20 [ERROR] mysqld got signal 6 ;
      

      On 10.5, this modified test will cause a crash when the purge of transaction history is attempting to access a corrupted page, which was intentionally corrupted by the test.

      I believe that the case that made me file MDEV-29905 is actually about this type of a shutdown hang.

      Attachments

        Issue Links

          Activity

            Patch is in bb-10.6-MDEV-30009

            thiru Thirunarayanan Balathandayuthapani added a comment - Patch is in bb-10.6- MDEV-30009

            Thank you, this is a step to the right direction. I think that we need to enable the logic also in release builds. Furthermore, I think that we must ensure that the purge of transaction history will not add more entries to the change buffer during a slow shutdown.

            marko Marko Mäkelä added a comment - Thank you, this is a step to the right direction. I think that we need to enable the logic also in release builds. Furthermore, I think that we must ensure that the purge of transaction history will not add more entries to the change buffer during a slow shutdown.

            In an rr replay trace that was produced with innodb_change_buffering_debug=1, it turns out that we must remove the change buffer records in the if (!bitmap_bits) code path of ibuf_merge_or_delete_for_page(), to avoid causing corruption. We got an rr replay trace where changes had been buffered for a 3-field index, the index was dropped, created as a 2-field index and stored in the same page. During ibuf_read_merge_pages(), we reset the bitmap bits but skipped the call to ibuf_delete_recs(). Finally, during a change buffer merge, we attempted to merge bogus 3-field records to the 2-field index page and hit an assertion in page_validate() when misinterpreting the records of the new 2-field index page as 3-field ones:

            mysqld: /data/Server/bb-10.6-MDEV-30009/storage/innobase/rem/rem0rec.cc:443: void rec_init_offsets_comp_ordinary(const rec_t*, const dict_index_t*, rec_offs*, ulint, const dict_col_t::def_t*, rec_leaf_format) [with bool mblob = false; bool redundant_temp = false; rec_t = unsigned char; rec_offs = short unsigned int; ulint = long unsigned int]: Assertion `index->is_primary()' failed.
            

            This corruption occurred on an .ibd file. While ALTER TABLE…DROP/ADD INDEX was involved in this case, I believe that the corruption is possible even for DML, when lots of records are being deleted.

            In MDEV-20934 some code to remove change buffer records was added to this code path, but only in the case when shutdown is running. It was refactored on a merge to 10.5 due to MDEV-19514. The code was later revised in MDEV-23839.

            To avoid reintroducing the corruption MDEV-25783, we must thread carefully here. I think that outside shutdown, that code path should only delete the change buffer entries if (block) (in which case we are holding an exclusive latch on the index page). MDEV-25783 was actually caused by some code in ibuf_read_merge_pages() and not ibuf_merge_or_delete_for_page().

            marko Marko Mäkelä added a comment - In an rr replay trace that was produced with innodb_change_buffering_debug=1 , it turns out that we must remove the change buffer records in the if (!bitmap_bits) code path of ibuf_merge_or_delete_for_page() , to avoid causing corruption. We got an rr replay trace where changes had been buffered for a 3-field index, the index was dropped, created as a 2-field index and stored in the same page. During ibuf_read_merge_pages() , we reset the bitmap bits but skipped the call to ibuf_delete_recs() . Finally, during a change buffer merge, we attempted to merge bogus 3-field records to the 2-field index page and hit an assertion in page_validate() when misinterpreting the records of the new 2-field index page as 3-field ones: mysqld: /data/Server/bb-10.6-MDEV-30009/storage/innobase/rem/rem0rec.cc:443: void rec_init_offsets_comp_ordinary(const rec_t*, const dict_index_t*, rec_offs*, ulint, const dict_col_t::def_t*, rec_leaf_format) [with bool mblob = false; bool redundant_temp = false; rec_t = unsigned char; rec_offs = short unsigned int; ulint = long unsigned int]: Assertion `index->is_primary()' failed. This corruption occurred on an .ibd file. While ALTER TABLE…DROP/ADD INDEX was involved in this case, I believe that the corruption is possible even for DML, when lots of records are being deleted. In MDEV-20934 some code to remove change buffer records was added to this code path, but only in the case when shutdown is running. It was refactored on a merge to 10.5 due to MDEV-19514 . The code was later revised in MDEV-23839 . To avoid reintroducing the corruption MDEV-25783 , we must thread carefully here. I think that outside shutdown, that code path should only delete the change buffer entries if (block) (in which case we are holding an exclusive latch on the index page). MDEV-25783 was actually caused by some code in ibuf_read_merge_pages() and not ibuf_merge_or_delete_for_page() .
            marko Marko Mäkelä added a comment - This bug was featured in https://fosdem.org/2023/schedule/event/innodb_change_buffer/

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.