MariaDB Server / MDEV-20934

Infinite loop on innodb_fast_shutdown=0 with inconsistent change buffer

Details

    Description

      Due to a past data corruption bug (such as MySQL Bug #69122, "InnoDB doesn't redo-log insert buffer merge operation if it is done in-place"), it seems possible that the InnoDB change buffer ends up containing entries even though the change buffer bitmap pages in the .ibd files indicate that no buffered changes exist.

      The logic on slow shutdown would proceed as follows:

      • ibuf_merge_pages() calls btr_pcur_open_at_rnd_pos(), which finds a change buffer leaf page.
      • Page numbers are read from the change buffer records.
      • Page read requests are posted.
      • On read completion, ibuf_merge_or_delete_for_page() is invoked.
      • Alas, the bitmap page in the .ibd file says that there are no buffered changes, so nothing is done.
      • Because the ‘orphan’ records for the page were not deleted from the change buffer, the change buffer never becomes empty and the shutdown keeps looping.

      To fix this, I think that we should change the following code in ibuf_merge_or_delete_for_page():

      			if (!bitmap_bits) {
      				/* No inserts buffered for this page */
       
      				fil_space_release(space);
      				return;
      			}
      

      Before returning, we should check if slow shutdown is in progress. If yes, we should attempt to delete any change buffer entries for page_id. We should not try this during normal operation, because it would cause a lot of unnecessary work.
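
The proposal above can be illustrated with a toy model. This is only a sketch under stated assumptions, not InnoDB code: `ToyChangeBuffer`, `toy_merge_or_delete_for_page()` and `toy_slow_shutdown()` are hypothetical names, the change buffer is reduced to a map from page number to record count, and the bitmap is reduced to a set of page numbers. The round limit exists only so that the hang can be observed without actually looping forever.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// Toy stand-ins for InnoDB structures; every name here is hypothetical.
struct ToyChangeBuffer {
    // page number -> number of buffered records in the change buffer tree
    std::map<uint32_t, int> records;
    // pages whose bitmap bit says "buffered changes exist"
    std::set<uint32_t> bitmap_buffered;
};

// Models the merge step for one page, including the early-return branch.
// purge_orphans stands for "slow shutdown is in progress" in the proposal.
// Returns true if any change buffer records were removed.
bool toy_merge_or_delete_for_page(ToyChangeBuffer& ibuf, uint32_t page_no,
                                  bool purge_orphans)
{
    if (ibuf.bitmap_buffered.count(page_no) == 0) {
        // The bitmap says that no inserts are buffered for this page.
        if (!purge_orphans) {
            return false;  // normal operation: avoid the unnecessary work
        }
        // Proposed behaviour: delete any orphan records for the page.
        return ibuf.records.erase(page_no) > 0;
    }
    // Bitmap bit is set: "merge" the records and clear the bit.
    ibuf.bitmap_buffered.erase(page_no);
    return ibuf.records.erase(page_no) > 0;
}

// Models the slow-shutdown loop: merge until the change buffer is empty.
// Returns the number of rounds needed, or -1 if the buffer is still
// non-empty after max_rounds (the toy equivalent of the hang).
int toy_slow_shutdown(ToyChangeBuffer& ibuf, bool purge_orphans,
                      int max_rounds)
{
    for (int round = 0; round < max_rounds; ++round) {
        if (ibuf.records.empty()) {
            return round;
        }
        const uint32_t page_no = ibuf.records.begin()->first;
        toy_merge_or_delete_for_page(ibuf, page_no, purge_orphans);
    }
    return ibuf.records.empty() ? max_rounds : -1;
}
```

With orphan records for a page and `purge_orphans` set to false, `toy_slow_shutdown()` reports -1 because the early return never removes anything. With `purge_orphans` set to true, the same state drains in one round. Passing the flag only during shutdown mirrors the point that the extra deletion attempt would be wasted work during normal operation.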

          Activity

            I found a likely cause of the scenario that caused change buffer merge to hang. In ibuf_insert_low() we update the change buffer bitmap in a separate mini-transaction, ahead of writing the data to the change buffer:

            	/* Set the bitmap bit denoting that the insert buffer contains
            	buffered entries for this index page, if the bit is not set yet */
             
            	old_bit_value = ibuf_bitmap_page_get_bits(bitmap_page, page_no,
            					IBUF_BITMAP_BUFFERED, &bitmap_mtr);
            	if (!old_bit_value) {
            		ibuf_bitmap_page_set_bits(bitmap_page, page_no,
            				IBUF_BITMAP_BUFFERED, TRUE, &bitmap_mtr);
            	}
             
            	mtr_commit(&bitmap_mtr);
            

            The above was introduced with the initial commit of InnoDB into MySQL 3.23.34. If the server is killed or a backup finishes between the logical time of the commit of bitmap_mtr and the subsequent mini-transaction commit that inserts the record into the change buffer, then the bitmap page will indicate that buffered changes exist for a page, although none might actually exist.

            I do not think that this non-atomicity can be fixed, so the change buffer merge will have to deal with this situation.

            Comment by Marko Mäkelä (marko)
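
The window described in the comment above can be sketched with a toy model. This is an illustration under assumptions, not the actual ibuf_insert_low() logic: `ToyIbufState`, `CrashPoint`, `toy_ibuf_insert()` and `toy_consistent()` are hypothetical names, and each of the two mini-transactions is reduced to a single durable update.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// Toy durable state; every name here is hypothetical.
struct ToyIbufState {
    std::set<uint32_t> bitmap_buffered;  // pages with the "buffered" bit set
    std::map<uint32_t, int> records;     // page number -> buffered records
};

// Where the simulated kill (or end of backup) happens.
enum class CrashPoint { None, AfterBitmapCommit };

// Models a buffered insert as two separately committed steps.
void toy_ibuf_insert(ToyIbufState& s, uint32_t page_no, CrashPoint crash)
{
    // Step 1 (stands for bitmap_mtr): set the bit and commit.
    s.bitmap_buffered.insert(page_no);

    if (crash == CrashPoint::AfterBitmapCommit) {
        return;  // the second mini-transaction never becomes durable
    }

    // Step 2 (stands for the later mini-transaction): insert the record.
    ++s.records[page_no];
}

// True if the bitmap bit and the presence of records agree for the page.
bool toy_consistent(const ToyIbufState& s, uint32_t page_no)
{
    const bool bit_set = s.bitmap_buffered.count(page_no) != 0;
    const auto it = s.records.find(page_no);
    const bool has_records = it != s.records.end() && it->second > 0;
    return bit_set == has_records;
}
```

In this model, an insert that completes both steps leaves the page consistent, while an interruption after the first step leaves the bit set with no record behind it. Since the two steps are separate commits, any code that reads both structures has to tolerate the mismatch rather than assume it cannot happen.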

            Thanks a lot for the update, Marko.

            Does that update also refer to https://jira.mariadb.org/browse/MDEV-22340 ?

            Thanks

            Comment by Bernardo Perez

            I think that MDEV-24449 is a rather likely cause of corrupting not only the change buffer, but also the system tablespace and any secondary index leaf page in user tables.

            Comment by Marko Mäkelä (marko)

            Unfortunately, an attempt to fix this corruption caused further corruption related to the change buffer in MariaDB Server 10.5 or later; see MDEV-25783.

            We recently reproduced this type of scenario in house, and we are working on a better fix.

            Comment by Marko Mäkelä (marko)

            I filed MDEV-30009 for a 10.5 regression where the slow shutdown hangs when this type of corruption is present.

            Comment by Marko Mäkelä (marko)

            People

              marko Marko Mäkelä
              marko Marko Mäkelä