[MDEV-25026] Various code paths are accessing freed pages Created: 2021-03-02  Updated: 2021-03-02  Resolved: 2021-03-02

Status: Closed
Project: MariaDB Server
Component/s: Encryption, Storage Engine - InnoDB
Affects Version/s: 10.5
Fix Version/s: 10.5.10

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: MSAN, corruption

Issue Links:
Blocks
Relates
relates to MDEV-15528 Avoid writing freed InnoDB pages Closed
relates to MDEV-24695 Encryption is modifying a freed page Closed

 Description   

The test case encryption.innodb_encrypt_freed is failing in MemorySanitizer builds:

CURRENT_TEST: encryption.innodb_encrypt_freed
mysqltest: At line 59: query 'drop table t1, t2' failed: 2013: Lost connection to MySQL server during query
...
#9  0x0000561a7ce5dd99 in __msan_warning_noreturn () at mysqld.cc:4359
#10 0x0000561a7ff4c9a1 in fil_crypt_rotate_page (key_state=<optimized out>, state=<optimized out>) at fil/fil0crypt.cc:1804
#11 fil_crypt_rotate_pages (key_state=<optimized out>, state=<optimized out>) at fil/fil0crypt.cc:1927
#12 fil_crypt_thread () at fil/fil0crypt.cc:2107

This failure was initially observed on a 10.6-based branch. It has two causes: the page contents were compared before the page status was checked, and recovery failed to flag the block as freed:

diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
index 67f8a2b5277..7c9d8af0859 100644
--- a/storage/innobase/log/log0recv.cc
+++ b/storage/innobase/log/log0recv.cc
@@ -2421,6 +2421,7 @@ static void recv_recover_page(buf_block_t* block, mtr_t& mtr,
 		any buffered changes. */
 		init->created = false;
 		ut_ad(!mtr.has_modifications());
+		block->page.status = buf_page_t::FREED;
 	}
 
 	/* Make sure that committing mtr does not change the modification

Several other places in the code use the BUF_GET_POSSIBLY_FREED mode without properly checking the page status afterwards. Once fil_crypt_rotate_page() is fixed, both uses of BUF_GET_IF_IN_POOL are safe.

Other affected functions are the following:

  • btr_cur_optimistic_latch_leaves() (when acquiring a latch on the preceding page)
  • fil_crypt_read_crypt_data(), fil_crypt_start_encrypting_space(), fil_crypt_flush_space() (in addition to fil_crypt_rotate_page())
  • xdes_get_descriptor_const() (might affect change buffer merges and encryption)
  • lock_rec_block_validate() (in debug builds only)
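
The pattern that these functions need can be illustrated with a minimal, self-contained sketch. It is not InnoDB code: toy_block, page_status, get_possibly_freed() and page_needs_rotation() are hypothetical stand-ins for buf_block_t, the page status field, a BUF_GET_POSSIBLY_FREED lookup and the rotation check. The only point is the ordering: the status is tested before any byte of the page frame is read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; the real types are buf_page_t and buf_block_t.
enum class page_status : std::uint8_t { NORMAL, FREED };

struct toy_block {
  page_status status;
  std::vector<std::uint8_t> frame;  // page bytes; meaningless once FREED
};

// Hypothetical lookup that, like a BUF_GET_POSSIBLY_FREED request, may hand
// back a block whose page has already been freed. Returns nullptr if the
// page is not in the (toy) pool.
inline const toy_block *get_possibly_freed(const std::vector<toy_block> &pool,
                                           std::size_t page_no) {
  return page_no < pool.size() ? &pool[page_no] : nullptr;
}

// Decide whether a page should be re-encrypted with the current key version.
// Assumption for this sketch: byte 0 of the frame holds the key version.
// The status check comes first, so the bytes of a freed page are never read.
inline bool page_needs_rotation(const toy_block *block,
                                std::uint8_t current_key_version) {
  if (block == nullptr || block->status == page_status::FREED) {
    return false;  // nothing to rotate; do not touch block->frame
  }
  assert(!block->frame.empty());
  return block->frame[0] != current_key_version;
}
```

Swapping the two steps in page_needs_rotation(), that is, reading frame[0] before testing the status, would reproduce in miniature the kind of access that MemorySanitizer reported in fil_crypt_rotate_page().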

All in all, these race conditions mostly affect encryption. The impact outside encryption is unclear but probably minimal. If encryption writes a bogus page, crash recovery could fail with symptoms similar to those of MDEV-24695. The issue was 'caused' by MDEV-15528 and does not affect major versions earlier than 10.5.

