  1. MariaDB Server
  2. MDEV-27868

buf_pool.flush_list is in the wrong order


Details

    Description

      In MDEV-27774, log_sys.mutex and log_sys.flush_order_mutex were replaced with a shared log_sys.latch. As a result, concurrently executing mtr_t::commit() calls may insert blocks into buf_pool.flush_list at roughly the same time. Each insertion is still protected by buf_pool.flush_list_mutex, but the order in which the insertions happen is no longer guaranteed to match the order of the log records.

      In buf_pool_t::insert_into_flush_list() we attempted to compensate for this: instead of unconditionally inserting blocks at the head of buf_pool.flush_list, we search for an appropriate insert position. This compensation does not always work. With the following command, I was able to reproduce a crash on my system:

      ./mtr --parallel=auto --repeat=100 encryption.innodb_encryption_filekeys{,,,,}{,,,}
      

      10.8 8251a9fb93075a72074bd7fd10faee5165014b7f

      encryption.innodb_encryption_filekeys 'cbc,innodb' w1 [ 26 fail ]
              Test ended at 2022-02-17 10:28:31
       
      CURRENT_TEST: encryption.innodb_encryption_filekeys
       
       
      Server [mysqld.1 - pid: 488142, winpid: 488142, exit: 256] failed during test run
      Server log from this test:
      ----------SERVER LOG START-----------
      2022-02-17 10:28:30 104 [Note] InnoDB: Creating #1 encryption thread id 140526397404736 total threads 4.
      2022-02-17 10:28:30 104 [Note] InnoDB: Creating #2 encryption thread id 140526389012032 total threads 4.
      2022-02-17 10:28:30 104 [Note] InnoDB: Creating #3 encryption thread id 140526405797440 total threads 4.
      2022-02-17 10:28:30 104 [Note] InnoDB: Creating #4 encryption thread id 140526414190144 total threads 4.
      mariadbd: /mariadb/10.8/storage/innobase/buf/buf0flu.cc:2538: void buf_flush_validate_low(): Assertion `om == 1 || !bpage || __builtin_expect(recv_sys.recovery_on, (0)) || om >= bpage->oldest_modification()' failed.
      

      The assertion reports that buf_pool.flush_list is not ordered by buf_page_t::oldest_modification(), as it must be.
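
      The ordering invariant can be illustrated with a minimal, self-contained sketch. This is not the InnoDB code: Page, FlushList, insert_ordered() and is_ordered() are hypothetical names, the list is a plain std::list with no locking, and a value of 0 is assumed to mean "not dirty". The check only mirrors the shape of the failing assertion, where an entry is acceptable if its value is 1, if it is the last entry, or if its value is not smaller than that of the next entry.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <list>

// Hypothetical stand-in for buf_page_t; only the LSN matters here.
struct Page {
  std::uint64_t oldest_modification;
};

// Hypothetical stand-in for buf_pool.flush_list (head = front()).
using FlushList = std::list<Page>;

// Insert a page so that oldest_modification is non-increasing from
// head to tail, by searching for a position instead of always
// inserting at the head.
void insert_ordered(FlushList &list, Page page)
{
  // Assumption of this sketch: 0 means the page is not dirty.
  assert(page.oldest_modification != 0);
  auto pos = list.begin();
  while (pos != list.end() &&
         pos->oldest_modification > page.oldest_modification)
    ++pos;
  list.insert(pos, page);
}

// Same shape as the condition in the failing assertion, without the
// recovery exemption: om == 1, no next entry, or om >= next's value.
bool is_ordered(const FlushList &list)
{
  for (auto it = list.begin(); it != list.end(); ++it) {
    const auto next = std::next(it);
    const std::uint64_t om = it->oldest_modification;
    if (om == 1 || next == list.end())
      continue;
    if (om < next->oldest_modification)
      return false;
  }
  return true;
}
```

      In this model, two commits that arrive out of LSN order and are both inserted at the head leave a smaller value in front of a larger one, which is the state the assertion rejects. Inserting through insert_ordered() keeps the list valid regardless of arrival order.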

      The impact of this bug is that log checkpoints, and thus crash recovery and backup, may work incorrectly.

            People

              marko Marko Mäkelä