Details
- Bug
- Status: Closed
- Critical
- Resolution: Fixed
- 10.6, 10.11, 11.4, 11.8
- None
- Not for Release Notes
Description
The function buf_dblwr_t::create() is using several mini-transactions to initialize the InnoDB doublewrite buffer. This would break crash recovery and backup when using the upcoming innodb_log_archive=ON format (MDEV-37949) and replaying the log from the very start:
2025-12-19 14:02:20 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=12288
2025-12-19 14:02:20 0 [Note] InnoDB: End of log at LSN=56790
2025-12-19 14:02:20 0 [Note] InnoDB: To recover: 331 pages
mariadbd: /mariadb/server/storage/innobase/log/log0recv.cc:3848: bool recv_sys_t::apply_batch(uint32_t, fil_space_t*&, buf_block_t*&, bool): Assertion `!buf_dblwr.is_inside(pages_it->first)' failed.
A debug assertion would fail during recovery, because buf_dblwr_t::create() is actually emitting log records for initializing the pages of the doublewrite buffer. This is completely unnecessary and would generate about 3·192=576 bytes of unnecessary log during the server bootstrap.
The solution is to not write any log for initializing any pages of the doublewrite buffer, only for allocating them in fseg_alloc_free_page_general(). The allocation metadata does need to be persistently written.
Furthermore, the following code should really have been removed in MDEV-24142 or as part of other refactoring in MariaDB Server 10.6:
-    if (((i + 1) & 15) == 0) {
-      /* rw_locks can only be recursively x-locked 2048 times. (on 32
-      bit platforms, (lint) 0 - (X_LOCK_DECR * 2049) is no longer a
-      negative number, and thus lock_word becomes like a shared lock).
-      For 4k page size this loop will lock the fseg header too many
-      times. Since this code is not done while any other threads are
-      active, restart the MTR occasionally. */
-      mtr.commit();
-      mtr.start();
-      trx_sys_block= buf_dblwr_trx_sys_get(&mtr);
-      fseg_header= TRX_SYS_DOUBLEWRITE + TRX_SYS_DOUBLEWRITE_FSEG +
-        trx_sys_block->page.frame;
-    }
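The arithmetic in the removed comment can be reproduced with a minimal sketch. It assumes X_LOCK_DECR was 0x00100000, a value taken from older InnoDB sources and used here only because it matches the stated limit of 2048; the helper names wrap_to_int32(), lock_word_after() and restarts_mtr() are hypothetical and do not exist in the server.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: older InnoDB sources defined X_LOCK_DECR as 0x00100000.
// It is used here only because it reproduces the limit in the comment.
constexpr int64_t X_LOCK_DECR = 0x00100000;

// Reduce a 64-bit value to a signed 32-bit two's complement value,
// mimicking what a 32-bit lock word would hold. (Hypothetical helper.)
constexpr int32_t wrap_to_int32(int64_t v)
{
  // The int64_t -> uint32_t conversion is a reduction modulo 2^32.
  return static_cast<uint32_t>(v) <= static_cast<uint32_t>(INT32_MAX)
    ? static_cast<int32_t>(static_cast<uint32_t>(v))
    : static_cast<int32_t>(
        static_cast<int64_t>(static_cast<uint32_t>(v)) - (int64_t{1} << 32));
}

// "0 - X_LOCK_DECR * n" as seen through a 32-bit signed integer.
constexpr int32_t lock_word_after(int64_t n)
{
  return wrap_to_int32(0 - X_LOCK_DECR * n);
}

// The removed code restarted the mini-transaction whenever this was true,
// that is, on every 16th loop iteration (i = 15, 31, 47, ...).
constexpr bool restarts_mtr(unsigned i)
{
  return ((i + 1) & 15) == 0;
}

// Runtime check of the claims made in the removed comment.
inline void check_lock_word_wrap()
{
  assert(lock_word_after(2048) < 0);  // still looks exclusively locked
  assert(lock_word_after(2049) > 0);  // wrapped around to a positive value
  assert(restarts_mtr(15) && !restarts_mtr(16));
}
```

Under that assumption, the 2049th recursive x-lock would make the 32-bit word positive, which is why the old code restarted the mini-transaction every 16 iterations.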
The recursion limit was increased to 65535 in MDEV-24142. I checked that the only mtr_t::commit() in the revised buf_dblwr_t::create() that covers several objects is the main mini-transaction, the bulk of which is related to allocating single pages. It has 293 entries, as follows:
- MTR_MEMO_PAGE_X_MODIFY on page 5 in the system tablespace (the TRX_SYS page)
- MTR_MEMO_SPACE_X_LOCK on the system tablespace (fil_system.sys_space)
- MTR_MEMO_PAGE_SX_MODIFY on page 0 in the system tablespace (the allocation metadata)
- duplicated MTR_MEMO_PAGE_SX_FIX or MTR_MEMO_PAGE_SX_MODIFY entries for pages 0 and 5
Yes, the duplicated entries could have been removed in the spirit of MDEV-29835 or MDEV-35125, but they are not an issue, because we are nowhere close to the maximum recursion limit of 65535.
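The margin can be illustrated with a toy model, sketched here under loud assumptions: the memo is reduced to a list of page numbers, the latch type is ignored, and the split of the 293 entries between pages 0 and 5 is hypothetical, because only the total is given above. The names toy_memo, max_entries_per_page() and make_toy_memo() are illustrative and are not part of mtr_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Recursion limit stated above (raised in MDEV-24142).
constexpr std::size_t MAX_RECURSION = 65535;

// Toy stand-in for a mini-transaction memo: one page number per entry.
// The real mtr_t memo also records the latch type; that is omitted here.
using toy_memo = std::vector<uint32_t>;

// Highest number of memo entries that refer to any single page.
inline std::size_t max_entries_per_page(const toy_memo &memo)
{
  std::map<uint32_t, std::size_t> count;
  std::size_t worst = 0;
  for (uint32_t page_no : memo) {
    const std::size_t n = ++count[page_no];
    if (n > worst)
      worst = n;
  }
  return worst;
}

// Build a toy memo of 'total' entries alternating between pages 0 and 5.
// HYPOTHETICAL split: the description only states the total of 293.
inline toy_memo make_toy_memo(std::size_t total)
{
  toy_memo memo;
  memo.reserve(total);
  for (std::size_t i = 0; i < total; i++)
    memo.push_back(i % 2 ? 5 : 0);
  return memo;
}
```

Even in the worst case, where all 293 entries referred to the same page, the count would stay far below 65535, so the exact split does not affect the conclusion.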
Issue Links
- blocks
  - MDEV-31956 SSD based InnoDB buffer pool extension (In Progress)
  - MDEV-37949 Implement innodb_log_archive (In Progress)