[MDEV-15528] Avoid writing freed InnoDB pages Created: 2018-03-09  Updated: 2023-12-22  Resolved: 2020-03-10

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.1.3, 10.2.0, 10.3.0
Fix Version/s: 10.5.2

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Thirunarayanan Balathandayuthapani
Resolution: Fixed Votes: 0
Labels: performance

Issue Links:
Blocks
blocks MDEV-12227 Defer writes to the InnoDB temporary ... Closed
blocks MDEV-17596 Assertion `block->page.flush_observer... Closed
blocks MDEV-18724 Replace buf_block_t::mutex with more ... Closed
PartOf
Problem/Incident
causes MDEV-22096 Mariabackup copied too old page or to... Closed
causes MDEV-22097 Not applying DELETE_ROW_FORMAT_REDUND... Closed
causes MDEV-22103 INNODB_ENCRYPTION_NUM_KEY_REQUESTS is... Closed
causes MDEV-22139 fseg_free_page_low() fails to write F... Closed
causes MDEV-22169 Recovery fails after failing to inser... Closed
causes MDEV-22495 Assertion mode == 16 || mode == 12 ||... Closed
causes MDEV-22710 Assertion `mode == 16 || mode == 12 |... Closed
causes MDEV-23252 Assertion failure 'req_type.is_dblwr_... Closed
causes MDEV-24695 Encryption is modifying a freed page Closed
causes MDEV-27500 buf_page_free() fails to drop the ada... Closed
causes MDEV-27985 buf_flush_freed_pages() causes InnoDB... Closed
causes MDEV-30404 Inconsistent updates of PAGE_MAX_TRX_... Closed
causes MDEV-30438 innodb.undo_truncate,4k fails when --... Closed
causes MDEV-32552 Write-ahead logging is broken for fre... Closed
Relates
relates to MDEV-11696 Page Compression Has No Effect on Tab... Closed
relates to MDEV-16526 Overhaul the InnoDB page flushing Closed
relates to MDEV-17380 innodb_flush_neighbors=ON should be i... Closed
relates to MDEV-21952 ibdata1 file size growing in MariaDB Closed
relates to MDEV-23973 Change buffer corruption when realloc... Closed
relates to MDEV-24569 Assertion `mach_read_from_4(frame + 4... Closed
relates to MDEV-25026 Various code paths are accessing free... Closed
relates to MDEV-28699 Shrink temporary tablespaces without ... Closed
relates to MDEV-31816 buf_LRU_free_page() does not preserve... Closed
relates to MDEV-33112 innodb_undo_log_truncate=ON is blocki... Closed
relates to MDEV-8139 Fix scrubbing Closed
relates to MDEV-11068 Review which innodb_compression_algor... Closed
relates to MDEV-12226 Avoid writes of freed (garbage) pages... Closed
relates to MDEV-12699 Improve crash recovery of corrupted d... Closed
relates to MDEV-15949 InnoDB: Failing assertion: space->n_p... Closed
relates to MDEV-16796 TRUNCATE TABLE slowdown with innodb_f... Closed
relates to MDEV-18698 Show InnoDB's internal background thr... Open
relates to MDEV-20813 Assertion `!srv_safe_truncate || !new... Closed
relates to MDEV-22190 After IMPORT: InnoDB: Record 126 is a... Closed
relates to MDEV-22839 ROW_FORMAT=COMPRESSED vs PAGE_COMPRES... Open
relates to MDEV-24780 [FATAL] InnoDB: Trying to read page n... Closed

 Description   

When an InnoDB data file page is freed, its contents become garbage, and any storage allocated for it in the data file is wasted.

MariaDB 10.4 introduced an InnoDB redo log record, MLOG_INIT_FREE_PAGE, for marking pages as freed. In MDEV-12353 (MariaDB 10.5.2), that record was replaced with FREE_PAGE. The record can be treated as a no-op, or we can punch a hole for page_compressed=1 tables.

If innodb_immediate_scrub_data_uncompressed is set, we should initialize the page with zeros. This will replace some of the non-working scrubbing logic (MDEV-8139). The scrubbing will be fixed further in MDEV-8139.

The following parameters will be deprecated and ignored, and the problematic ‘background scrubbing’ code will be removed:

  • innodb-background-scrub-data-uncompressed
  • innodb-background-scrub-data-compressed
  • innodb-background-scrub-data-interval
  • innodb-background-scrub-data-check-interval

For page_compressed tables, the freed page will be hole-punched.
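
As a rough illustration of the two on-disk treatments named in this description (punching a hole, or initializing the page with zeros), the following Linux-only sketch uses fallocate(2) and pwrite(2) directly. It is not InnoDB code: the helper names zero_fill_page(), punch_hole_page() and release_freed_page(), and the page_no and page_size parameters, are hypothetical, and the fallback order is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>
#include <fcntl.h>         // open(), fallocate() (glibc with _GNU_SOURCE)
#include <linux/falloc.h>  // FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE
#include <unistd.h>        // pwrite(), pread(), close(), unlink()

// Hypothetical helper: overwrite one page with zeros.
bool zero_fill_page(int fd, uint64_t page_no, size_t page_size) {
  assert(page_size > 0);
  const std::vector<char> zeros(page_size, 0);
  const off_t offset = static_cast<off_t>(page_no * page_size);
  size_t done = 0;
  while (done < page_size) {
    const ssize_t n = pwrite(fd, zeros.data() + done, page_size - done,
                             offset + static_cast<off_t>(done));
    if (n <= 0) return false;  // treat short/failed writes as an error
    done += static_cast<size_t>(n);
  }
  return true;
}

// Hypothetical helper: deallocate the storage of one page while keeping
// the file size unchanged. Fails on file systems without hole punching.
bool punch_hole_page(int fd, uint64_t page_no, size_t page_size) {
  assert(page_size > 0);
  const off_t offset = static_cast<off_t>(page_no * page_size);
  return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, offset,
                   static_cast<off_t>(page_size)) == 0;
}

// Assumed policy for this sketch only: try to punch a hole first and
// fall back to zero-filling if the file system refuses.
bool release_freed_page(int fd, uint64_t page_no, size_t page_size) {
  return punch_hole_page(fd, page_no, page_size) ||
         zero_fill_page(fd, page_no, page_size);
}
```

Either way, a later read of the freed page range returns zeros; only the hole-punching path also gives the storage back to the file system.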



 Comments   
Comment by Marko Mäkelä [ 2019-02-08 ]

MariaDB 10.4.3 will introduce an MLOG_INIT_FREE_PAGE record for this purpose. This allows this bug to be fixed in MariaDB 10.4 later without breaking crash-downgrade to earlier MariaDB 10.4 versions.

In MDEV-12699, MLOG_INIT_FREE_PAGE should be handled in a similar way as the existing record MLOG_INIT_FILE_PAGE2.

Comment by Marko Mäkelä [ 2019-04-08 ]

I think that this is realistic to do in the 10.4 time frame. Assigning to myself, because this is closely related to MDEV-12699, which I am working on.

Comment by Marko Mäkelä [ 2019-04-18 ]

When scrubbing is not enabled and the page is not on SSD and the file is not page_compressed, we can discard the page from the buf_pool->flush_list once the MLOG_INIT_FREE_PAGE record has been written. For scrubbing, it is kind of mandatory to initialize the page.

For non-compressed tables on SSD, we will have to evaluate whether punching holes is going to improve or hurt performance on average.
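
The condition in this comment can be written down as a small predicate. This is only a sketch of the stated rule: FreedPageState and can_discard_from_flush_list() are hypothetical names, not InnoDB identifiers, and the SSD case simply returns false here because the comment leaves it open pending evaluation.

```cpp
#include <cassert>

// Hypothetical summary of what is known about a freed page.
struct FreedPageState {
  bool init_free_page_logged;  // MLOG_INIT_FREE_PAGE has been written
  bool scrubbing_enabled;      // scrubbing requires initializing the page
  bool on_ssd;                 // the data file resides on SSD
  bool page_compressed;        // the file uses page_compressed
};

// True if the page may be dropped from the flush list without being
// written, following the rule stated in the comment above.
bool can_discard_from_flush_list(const FreedPageState& s) {
  return s.init_free_page_logged && !s.scrubbing_enabled && !s.on_ssd &&
         !s.page_compressed;
}

// Usage example: only the first case qualifies for a discard.
void example_usage() {
  assert(can_discard_from_flush_list({true, false, false, false}));
  assert(!can_discard_from_flush_list({false, false, false, false}));
  assert(!can_discard_from_flush_list({true, true, false, false}));
  assert(!can_discard_from_flush_list({true, false, true, false}));
  assert(!can_discard_from_flush_list({true, false, false, true}));
}
```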

Comment by Marko Mäkelä [ 2020-01-27 ]

As part of this, we should simplify the freeing of B-tree root pages:

diff --git a/storage/innobase/btr/btr0btr.cc b/storage/innobase/btr/btr0btr.cc
index 753ed9b077c..21609edb7f3 100644
--- a/storage/innobase/btr/btr0btr.cc
+++ b/storage/innobase/btr/btr0btr.cc
@@ -961,12 +961,9 @@ constexpr index_id_t	BTR_FREED_INDEX_ID = 0;
 
 /** Free a B-tree root page. btr_free_but_not_root() must already
 have been called.
-In a persistent tablespace, the caller must invoke fsp_init_file_page()
-before mtr.commit().
 @param[in,out]	block		index root page
-@param[in,out]	mtr		mini-transaction
-@param[in]	invalidate	whether to invalidate PAGE_INDEX_ID */
-static void btr_free_root(buf_block_t *block, mtr_t *mtr, bool invalidate)
+@param[in,out]	mtr		mini-transaction */
+static void btr_free_root(buf_block_t *block, mtr_t *mtr)
 {
   ut_ad(mtr_memo_contains_flagged(mtr, block,
 				  MTR_MEMO_PAGE_X_FIX | MTR_MEMO_PAGE_SX_FIX));

(Obviously, the code inside if (invalidate) should also be removed and the callers adjusted.)
We should ensure that MLOG_INIT_FREE_PAGE records will be emitted by fseg_free_step(), and thus no log needs to be written in this function.

Comment by Marko Mäkelä [ 2020-03-09 ]

This looks OK to me, after addressing my feedback for commit 1 of 2. We will address the following later, ideally before the 10.5 GA release:

  • clean up btr_free_root() as I suggested in my previous comment
  • address the FIXME comments in buf_page_free() by introducing a data structure that covers freed page ranges also for pages that are not in the buffer pool, and enable the scrubbing tests (MDEV-8139)
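
One possible shape for the freed-page-range structure mentioned in the last item is sketched here, under assumptions: it keeps disjoint closed ranges of page numbers in a sorted map, independently of whether the pages are in the buffer pool. The class name FreedPageRanges and its methods are hypothetical and are not taken from the MariaDB source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical container: disjoint, non-adjacent closed ranges
// [first, last] of freed page numbers, keyed by the first page.
class FreedPageRanges {
 public:
  // Record page_no as freed, merging with neighbouring ranges.
  void add(uint32_t page_no) {
    uint32_t first = page_no, last = page_no;
    auto next = ranges_.upper_bound(page_no);  // first range after page_no
    if (next != ranges_.begin()) {
      auto prev = std::prev(next);
      if (prev->second >= page_no) return;  // already covered
      if (prev->second + 1 == page_no) {    // extends prev to the right
        first = prev->first;
        ranges_.erase(prev);
      }
    }
    if (next != ranges_.end() && last != UINT32_MAX &&
        next->first == last + 1) {          // touches the following range
      last = next->second;
      ranges_.erase(next);
    }
    ranges_[first] = last;
  }

  // Forget page_no (for example, after the page has been reallocated),
  // splitting the covering range if necessary.
  void remove(uint32_t page_no) {
    auto next = ranges_.upper_bound(page_no);
    if (next == ranges_.begin()) return;
    auto it = std::prev(next);
    if (it->second < page_no) return;  // not covered
    const uint32_t first = it->first, last = it->second;
    ranges_.erase(it);
    if (first < page_no) ranges_[first] = page_no - 1;
    if (page_no < last) ranges_[page_no + 1] = last;
  }

  bool contains(uint32_t page_no) const {
    auto next = ranges_.upper_bound(page_no);
    return next != ranges_.begin() && std::prev(next)->second >= page_no;
  }

  std::size_t range_count() const { return ranges_.size(); }

 private:
  std::map<uint32_t, uint32_t> ranges_;  // first page -> last page
};

// Usage example: pages 5 and 7 become one range once page 6 is freed.
void example_usage() {
  FreedPageRanges freed;
  freed.add(5);
  freed.add(7);
  freed.add(6);
  assert(freed.range_count() == 1 && freed.contains(6));
}
```

A range-based representation keeps the bookkeeping small when many consecutive pages are freed at once, for example when a whole segment is dropped.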

Comment by Thirunarayanan Balathandayuthapani [ 2020-03-10 ]

commit a5584b13d1e04f38b843602413669591aa65c359 (HEAD -> 10.5, origin/10.5, 10.5-work)
Author: Thirunarayanan Balathandayuthapani <thiru@mariadb.com>
Date:   Tue Mar 10 10:47:24 2020 +0530
 
commit a35b4ae89871d8184f04abb112c385481d557dbb
Author: Thirunarayanan Balathandayuthapani <thiru@mariadb.com>
Date:   Mon Mar 9 13:25:33 2020 +0530
 
    MDEV-15528 Punch holes when pages are freed

Patch has been pushed to 10.5. Thanks to marko and @matthias.leich for testing and reviewing it.

Generated at Thu Feb 08 08:22:00 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.