MDEV-34750

SET GLOBAL innodb_log_file_size is not crash safe

Details

    Description

      MDEV-33894 included an inadvertent change that causes SET GLOBAL innodb_log_file_size to write incorrect data to the log file that is being resized (ib_logfile101). This does not affect the memory-mapped log implementation on 64-bit Linux systems (used when the log is stored in a file system mounted with mount -o dax, or in /dev/shm). As far as I can tell, the fix should be this simple:

      diff --git a/storage/innobase/log/log0log.cc b/storage/innobase/log/log0log.cc
      index 93f8db6bfc0..84219dbc830 100644
      --- a/storage/innobase/log/log0log.cc
      +++ b/storage/innobase/log/log0log.cc
      @@ -804,7 +804,7 @@ void log_t::resize_write_buf(size_t length) noexcept
         }
       
         ut_a(os_file_write_func(IORequestWrite, "ib_logfile101", resize_log.m_file,
      -                          buf, offset, length) == DB_SUCCESS);
      +                          resize_flush_buf, offset, length) == DB_SUCCESS);
       }
       
       /** Write buf to ib_logfile0.
      

          Activity

            Marko Mäkelä (marko) added a comment:

            Thanks to https://rr-project.org, I determined that the correct buffer to write to ib_logfile101 is log_sys.resize_buf as it was at the start of log_t::write_buf(). Also, log_t::resize_start() must always use the current LSN as the resizing checkpoint target.

            Occasional debug assertion failures, or messages about the log sequence number being in the future, are possible because crash recovery failed to flag an error when the log cannot be scanned past the FILE_CHECKPOINT record.

            With all of these fixed, I am still observing occasional failures of the revised test innodb.log_file_size_online. I will continue debugging.
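
            The point about using log_sys.resize_buf "as it was at the start of log_t::write_buf()" can be illustrated with a toy model. This is only a sketch, not MariaDB code: ToyResizeLog, its members, and the assumption that the active buffer is switched while the write is being prepared are all hypothetical. It shows why the buffer has to be captured before the switch.

            ```cpp
            #include <array>
            #include <string>

            // Toy model only; none of these names exist in MariaDB.
            struct ToyResizeLog
            {
              std::array<std::string, 2> bufs; // two alternating buffers
              int active = 0;                  // index of the buffer being filled

              void append(const std::string &rec) { bufs[active] += rec; }

              // Returns the bytes that should go to the new log file.
              std::string write_buf()
              {
                const int snapshot = active; // the buffer as it was at the start
                active ^= 1;                 // assumed switch to the other buffer
                bufs[active].clear();        // the new active buffer starts empty
                return bufs[snapshot];       // write what was filled before the switch
              }
            };
            ```

            Returning bufs[active] after the switch would hand back the freshly cleared buffer instead of the log records that were appended, which is the kind of mistake the snapshot avoids in this model.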

            Marko Mäkelä (marko) added a comment:

            The remaining failures occurred because my fix was rounding the latest LSN down to a multiple of log_sys.write_size, while it should have rounded it up. After this fix, SET GLOBAL innodb_log_file_size=… is more likely to be blocked until there have been further writes to InnoDB tables, because an entire new log block will have to be filled. In the worst case, this will require the LSN to grow by 4095 bytes.

            I filed MDEV-34802 for the lapse in recovery that allows InnoDB to start up on a corrupted log. It also affects older versions.
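
            The rounding difference and the 4095-byte worst case can be checked with a little arithmetic. The helpers below are hypothetical and not part of InnoDB; they assume that the block size (standing in for log_sys.write_size) is a power of two and treat the LSN as a plain byte offset, which simplifies how the real log maps LSNs to file offsets.

            ```cpp
            #include <cstdint>

            // Hypothetical helpers; block_size stands in for log_sys.write_size
            // and is assumed to be a power of two.
            constexpr std::uint64_t round_down(std::uint64_t lsn, std::uint64_t block_size)
            {
              return lsn & ~(block_size - 1);
            }

            constexpr std::uint64_t round_up(std::uint64_t lsn, std::uint64_t block_size)
            {
              return (lsn + block_size - 1) & ~(block_size - 1);
            }

            // Bytes of further log writes needed before the LSN reaches the
            // rounded-up target.
            constexpr std::uint64_t bytes_until_target(std::uint64_t lsn,
                                                       std::uint64_t block_size)
            {
              return round_up(lsn, block_size) - lsn;
            }
            ```

            With a 4096-byte block, an LSN of 4097 rounds down to 4096 and up to 8192, so 4095 more bytes must be written before the target is reached. An LSN that is already a multiple of the block size needs none.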

            Matthias Leich (mleich) added a comment:

            origin/bb-10.11-MDEV-34802-MDEV-34750 bfaffdff6324953b46b92b04e68d405d33875d64 2024-08-28T07:51:32+03:00
            behaved well in RQG testing. No new problems.

            Marko Mäkelä (marko) added a comment:

            To avoid a regression where SET GLOBAL innodb_log_file_size would never finish, I revised the fix so that redundant FILE_CHECKPOINT records may be written in order to reach the resize target LSN.
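
            As a rough illustration of how padding lets an idle server reach the target, the sketch below counts how many fixed-size records would have to be appended for the LSN to reach a target. The function name and the record_size parameter are hypothetical; the actual size of a FILE_CHECKPOINT record and the way the revised fix emits the records are not stated in this report.

            ```cpp
            #include <cstdint>

            // Hypothetical helper: number of padding records of record_size bytes
            // that must be appended for lsn to reach at least target_lsn.
            // record_size must be nonzero.
            constexpr std::uint64_t padding_records_needed(std::uint64_t lsn,
                                                           std::uint64_t target_lsn,
                                                           std::uint64_t record_size)
            {
              return lsn >= target_lsn
                ? 0
                : (target_lsn - lsn + record_size - 1) / record_size;
            }
            ```

            For example, with an assumed 16-byte record, closing a 4095-byte gap takes 256 records, while an LSN that is already at the target needs none.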

            Debarun Banerjee (debarun) added a comment:

            Thanks marko, the patch looks good.

            People

              marko Marko Mäkelä