MDEV-27868 (MariaDB Server)

buf_pool.flush_list is in the wrong order

      Description

      In MDEV-27774, log_sys.mutex and log_sys.flush_order_mutex were replaced with a shared log_sys.latch. This means that concurrently executing mtr_t::commit() calls may insert entries into buf_pool.flush_list at roughly the same time. Each insertion is still protected by buf_pool.flush_list_mutex.

      In buf_pool_t::insert_into_flush_list() we attempted to compensate for this by searching for an appropriate insert position instead of unconditionally inserting blocks at the start of buf_pool.flush_list. This compensation does not always appear to work. With the following command, I was able to reproduce a crash on my system:

      ./mtr --parallel=auto --repeat=100 encryption.innodb_encryption_filekeys{,,,,}{,,,}
      

      10.8 8251a9fb93075a72074bd7fd10faee5165014b7f

      encryption.innodb_encryption_filekeys 'cbc,innodb' w1 [ 26 fail ]
              Test ended at 2022-02-17 10:28:31
       
      CURRENT_TEST: encryption.innodb_encryption_filekeys
       
       
      Server [mysqld.1 - pid: 488142, winpid: 488142, exit: 256] failed during test run
      Server log from this test:
      ----------SERVER LOG START-----------
      2022-02-17 10:28:30 104 [Note] InnoDB: Creating #1 encryption thread id 140526397404736 total threads 4.
      2022-02-17 10:28:30 104 [Note] InnoDB: Creating #2 encryption thread id 140526389012032 total threads 4.
      2022-02-17 10:28:30 104 [Note] InnoDB: Creating #3 encryption thread id 140526405797440 total threads 4.
      2022-02-17 10:28:30 104 [Note] InnoDB: Creating #4 encryption thread id 140526414190144 total threads 4.
      mariadbd: /mariadb/10.8/storage/innobase/buf/buf0flu.cc:2538: void buf_flush_validate_low(): Assertion `om == 1 || !bpage || __builtin_expect(recv_sys.recovery_on, (0)) || om >= bpage->oldest_modification()' failed.
      

      The assertion reports that buf_pool.flush_list is not ordered by buf_page_t::oldest_modification(), as it must be.
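To make the violated invariant concrete, here is a minimal sketch of the ordering check and of a position-searching insert. It is a simplified model, not the InnoDB code: `Page`, `FlushList`, `flush_list_is_ordered()` and `insert_ordered()` are hypothetical names, the list is a plain `std::list`, and no locking is modeled. The direction of the ordering (newest `oldest_modification` first) and the exemption of the value 1 are read off the failed assertion.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <list>

// Hypothetical stand-in for buf_page_t; only the field relevant here.
struct Page {
  uint64_t oldest_modification;  // LSN of the oldest unflushed change
};

// Simplified model of the flush list: the head holds the newest
// oldest_modification, the tail holds the oldest.
using FlushList = std::list<Page>;

// Same shape as the failed assertion: walking from the head, each
// oldest_modification must be >= that of the next element. The value 1
// is exempt, as in the `om == 1` clause of the assertion.
bool flush_list_is_ordered(const FlushList& list) {
  for (auto it = list.begin(); it != list.end(); ++it) {
    auto next = std::next(it);
    if (next == list.end()) break;
    const uint64_t om = it->oldest_modification;
    if (om != 1 && om < next->oldest_modification) return false;
  }
  return true;
}

// Position-searching insert: skip every element that is newer than the
// new page, then insert in front of the first one that is not.
void insert_ordered(FlushList& list, Page page) {
  auto pos = list.begin();
  while (pos != list.end() &&
         pos->oldest_modification > page.oldest_modification) {
    ++pos;
  }
  list.insert(pos, page);
}
```

In this model, inserting pages with LSNs 30, 50 and 40 in that arrival order yields the list 50, 40, 30, which passes the check, whereas always inserting at the head would yield 40, 50, 30 and fail it.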

      The impact of this bug is that log checkpoints and thus crash recovery and backup may work incorrectly.
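The following sketch illustrates one way a misordered list could affect checkpoints. It assumes, and this ticket does not state it, that the checkpoint LSN is derived from the last element of the flush list on the expectation that the tail holds the smallest oldest_modification. The names `checkpoint_from_tail()` and `true_oldest()` are hypothetical, and a `std::vector` of LSNs stands in for the list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using lsn_t = uint64_t;

// Assumption: the checkpoint candidate is read from the tail in O(1),
// relying on the list being ordered. Precondition: list is not empty.
lsn_t checkpoint_from_tail(const std::vector<lsn_t>& flush_list) {
  return flush_list.back();
}

// The LSN from which recovery actually needs the log: the smallest
// oldest_modification of any dirty page. Precondition: list is not empty.
lsn_t true_oldest(const std::vector<lsn_t>& flush_list) {
  return *std::min_element(flush_list.begin(), flush_list.end());
}
```

For the ordered list {50, 40, 30} both functions return 30. For the misordered list {50, 30, 40} the tail gives 40 while a page still carries unflushed changes from LSN 30, so under this assumption a checkpoint at 40 would let recovery skip log records that are still needed.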

              People

              Assignee:
              marko Marko Mäkelä
              Reporter:
              marko Marko Mäkelä