MariaDB Server / MDEV-25026

Various code paths are accessing freed pages

Details

    Description

      The test case encryption.innodb_encrypt_freed is failing in MemorySanitizer builds:

      CURRENT_TEST: encryption.innodb_encrypt_freed
      mysqltest: At line 59: query 'drop table t1, t2' failed: 2013: Lost connection to MySQL server during query
      ...
      #9  0x0000561a7ce5dd99 in __msan_warning_noreturn () at mysqld.cc:4359
      #10 0x0000561a7ff4c9a1 in fil_crypt_rotate_page (key_state=<optimized out>, state=<optimized out>) at fil/fil0crypt.cc:1804
      #11 fil_crypt_rotate_pages (key_state=<optimized out>, state=<optimized out>) at fil/fil0crypt.cc:1927
      #12 fil_crypt_thread () at fil/fil0crypt.cc:2107
      

      This failure was initially observed on a 10.6-based branch. The cause is twofold. First, the page contents were compared before the page status was checked. Second, recovery failed to flag the block as freed:

      diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
      index 67f8a2b5277..7c9d8af0859 100644
      --- a/storage/innobase/log/log0recv.cc
      +++ b/storage/innobase/log/log0recv.cc
      @@ -2421,6 +2421,7 @@ static void recv_recover_page(buf_block_t* block, mtr_t& mtr,
       		any buffered changes. */
       		init->created = false;
       		ut_ad(!mtr.has_modifications());
      +		block->page.status = buf_page_t::FREED;
       	}
       
       	/* Make sure that committing mtr does not change the modification
      

      A few other places in the code use the BUF_GET_POSSIBLY_FREED mode without checking afterwards whether the returned page had been freed. After fixing fil_crypt_rotate_page(), both uses of BUF_GET_IF_IN_POOL are safe.

      Other affected functions are the following:

      • btr_cur_optimistic_latch_leaves() (when acquiring a latch on the preceding page)
      • fil_crypt_read_crypt_data(), fil_crypt_start_encrypting_space(), fil_crypt_flush_space() (in addition to fil_crypt_rotate_page())
      • xdes_get_descriptor_const() (might affect change buffer merges and encryption)
      • lock_rec_block_validate() (in debug builds only)
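
      The pattern that these call sites need can be sketched roughly as follows. This is a simplified, self-contained model and not the actual InnoDB code: Block, PageStatus, recover_freed_page(), needs_rotation() and the key version offset are all hypothetical stand-ins chosen for illustration. It only shows the two halves of the fix: recovery flags the block as freed, and a reader checks the status before looking at the page contents.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for illustration; these are not the InnoDB types.
enum class PageStatus { NORMAL, FREED };

struct Block {
  PageStatus status = PageStatus::NORMAL;
  std::array<std::uint8_t, 16> frame{};  // contents are meaningless once FREED
};

// Models the recovery side: a page that was freed according to the log
// is flagged, so that later readers can detect it.
void recover_freed_page(Block& block) {
  block.status = PageStatus::FREED;
}

// Models the reader side: consult the status before touching the frame.
// Returns true only if the page is live and carries an outdated key version.
bool needs_rotation(const Block& block, std::uint8_t current_key_version) {
  if (block.status == PageStatus::FREED) {
    return false;  // never inspect the contents of a freed page
  }
  constexpr std::size_t KEY_VERSION_OFFSET = 0;  // assumed layout, illustration only
  assert(KEY_VERSION_OFFSET < block.frame.size());
  return block.frame[KEY_VERSION_OFFSET] != current_key_version;
}
```

      In this model, a caller that obtained a block in a "possibly freed" mode would call needs_rotation() and skip the page when it returns false. Reversing the order of the two checks reproduces the kind of access to stale contents that MemorySanitizer reported.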

      All in all, these race conditions mostly affect encryption; elsewhere, the impact is unclear but probably minimal. If encryption writes a bogus page, crash recovery could fail with symptoms similar to MDEV-24695. That would be 'caused' by MDEV-15528 and would not affect major versions earlier than 10.5.

            People

              marko Marko Mäkelä