MariaDB Server / MDEV-38595

InnoDB doublewrite buffer creation generates unnecessary log


Details

    • Not for Release Notes

    Description

      The function buf_dblwr_t::create() uses several mini-transactions to initialize the InnoDB doublewrite buffer. This would break crash recovery and backup with the upcoming innodb_log_archive=ON format (MDEV-37949) when the log is replayed from the very start:

      MDEV-37949

      2025-12-19 14:02:20 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=12288
      2025-12-19 14:02:20 0 [Note] InnoDB: End of log at LSN=56790
      2025-12-19 14:02:20 0 [Note] InnoDB: To recover: 331 pages
      mariadbd: /mariadb/server/storage/innobase/log/log0recv.cc:3848: bool recv_sys_t::apply_batch(uint32_t, fil_space_t*&, buf_block_t*&, bool): Assertion `!buf_dblwr.is_inside(pages_it->first)' failed.
      

      A debug assertion would fail during recovery, because buf_dblwr_t::create() actually emits log records for initializing the pages of the doublewrite buffer. This is completely unnecessary and would generate about 3·192=576 bytes of unnecessary log during the server bootstrap.

      The solution is to not write any log for initializing any pages of the doublewrite buffer, only for allocating them in fseg_alloc_free_page_general(). The allocation metadata does need to be persistently written.
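The distinction between logging the allocation and logging the page initialization can be illustrated with a toy model. This is only a sketch: ToyRecord, ToyLog and toy_create_doublewrite() are hypothetical names that do not exist in MariaDB, and the model counts records rather than producing real redo log.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: none of these types exist in MariaDB.
enum class ToyRecord { ALLOCATE_PAGE, INIT_PAGE };

struct ToyLog {
  std::vector<ToyRecord> records;

  // Number of records of the given kind.
  std::size_t count(ToyRecord kind) const {
    std::size_t n = 0;
    for (ToyRecord r : records)
      if (r == kind)
        n++;
    return n;
  }
};

// Allocate and initialize n_pages pages of a pretend doublewrite buffer.
// log_init=true models the behaviour described in this report
// (initialization is logged); log_init=false models the proposed fix
// (only the allocation is logged).
inline ToyLog toy_create_doublewrite(std::uint32_t n_pages, bool log_init) {
  ToyLog log;
  for (std::uint32_t i = 0; i < n_pages; i++) {
    // The allocation metadata must be written persistently in both modes.
    log.records.push_back(ToyRecord::ALLOCATE_PAGE);
    if (log_init)
      log.records.push_back(ToyRecord::INIT_PAGE);
  }
  assert(log.count(ToyRecord::ALLOCATE_PAGE) == n_pages);
  return log;
}
```

In this model, toy_create_doublewrite(n, false) yields the same ALLOCATE_PAGE records as toy_create_doublewrite(n, true) but no INIT_PAGE records, so a replay from the very start would never be asked to apply a record to a doublewrite buffer page.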

      Furthermore, the following code should really have been removed in MDEV-24142 or as part of other refactoring in MariaDB Server 10.6:

      -    if (((i + 1) & 15) == 0) {
      -      /* rw_locks can only be recursively x-locked 2048 times. (on 32
      -      bit platforms, (lint) 0 - (X_LOCK_DECR * 2049) is no longer a
      -      negative number, and thus lock_word becomes like a shared lock).
      -      For 4k page size this loop will lock the fseg header too many
      -      times. Since this code is not done while any other threads are
      -      active, restart the MTR occasionally. */
      -      mtr.commit();
      -      mtr.start();
      -      trx_sys_block= buf_dblwr_trx_sys_get(&mtr);
      -      fseg_header= TRX_SYS_DOUBLEWRITE + TRX_SYS_DOUBLEWRITE_FSEG +
      -        trx_sys_block->page.frame;
      -    }
      

      The recursion limit was increased to 65535 in MDEV-24142. I checked that during the revised buf_dblwr_t::create(), the only mtr_t::commit() that comprises several objects is that of the main mini-transaction (the bulk of it is related to allocating single pages). It has 293 entries, as follows:

      • MTR_MEMO_PAGE_X_MODIFY on page 5 in the system tablespace (the TRX_SYS page)
      • MTR_MEMO_SPACE_X_LOCK on the system tablespace (fil_system.sys_space)
      • MTR_MEMO_PAGE_SX_MODIFY on page 0 in the system tablespace (the allocation metadata)
      • duplicated MTR_MEMO_PAGE_SX_FIX or MTR_MEMO_PAGE_SX_MODIFY entries for pages 0 and 5

      Yes, the duplicated entries could have been removed in the spirit of MDEV-29835 or MDEV-35125, but they are not an issue, because we are nowhere close to the maximum recursion limit of 65535.
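The effect of the removed periodic restart on the recursion depth can be sketched with a small stand-alone helper. This is a toy model under stated assumptions: toy_peak_depth() is a hypothetical function, not MariaDB code, and the number of recursive lock acquisitions per loop iteration is a free parameter, because this report does not state it.

```cpp
#include <cstdint>

// Hypothetical helper, not part of MariaDB: the peak recursive lock depth
// reached by a loop that acquires `per_iteration` recursive locks in each
// pass and releases all of them every `restart_every` passes
// (restart_every == 0 means that the loop never restarts).
inline std::uint32_t toy_peak_depth(std::uint32_t iterations,
                                    std::uint32_t per_iteration,
                                    std::uint32_t restart_every) {
  std::uint32_t depth = 0, peak = 0;
  for (std::uint32_t i = 0; i < iterations; i++) {
    depth += per_iteration;
    if (depth > peak)
      peak = depth;
    // Same condition as ((i + 1) & 15) == 0 when restart_every == 16.
    if (restart_every && (i + 1) % restart_every == 0)
      depth = 0;  // models mtr.commit(); mtr.start();
  }
  return peak;
}
```

With a restart every 16 passes the peak stays at 16 times the per-pass count; without a restart it grows with the number of passes. The 293 entries reported above remain far below 65535 either way, which is why the restart is no longer needed.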

            People

              Marko Mäkelä (marko)
              Thirunarayanan Balathandayuthapani
              Saahil Alam