MariaDB Server: MDEV-34216

Possible corruption when shrinking the system tablespace on innodb_fast_shutdown=0


Details

    Description

      This bug was found in connection with MDEV-34212.

      Some InnoDB tests, most notably innodb.table_flags,64k, occasionally fail. I can sometimes reproduce the failure locally on a MemorySanitizer build. The attached file data.tar.xz is a copy of the data directory of a failed test in a 10.4-based branch that contains a fix of MDEV-34212.

      I was able to reproduce the problem with an even simpler test case:

      tf.test

      --source include/have_innodb.inc
      SELECT @@innodb_page_size;
      

      and the following options:

      tf.opt

      --innodb-undo-tablespaces=0 --innodb-page-size=64k --innodb-buffer-pool-size=20m
      

      I was able to produce an rr replay by repeatedly running the following on a MemorySanitizer build (the brace expansion repeats the test name so that many copies run in parallel):

      while ./mtr --boot-rr --parallel=60 innodb.tf{,,,,,,,,,}{,}{,,}; do :; done
      

      From the rr replay trace, I concluded that the last write of undo page 50 is being discarded because of the following condition in buf_page_t::flush():

        if (UNIV_UNLIKELY(lsn < space->get_create_lsn()))
        {
          ut_ad(space->purpose == FIL_TYPE_TABLESPACE);
          goto freed;
        }
      

      The bug seems to be a conflict between the supposedly final buffer pool flushing and fsp_system_tablespace_truncate(). Here are some stack traces, from the same thread:

      buf_page_t::flush()
      buf_do_flush_list_batch (max_n=2000, lsn=18446744073709551615)
      buf_flush_list_holding_mutex (max_n=max_n@entry=2000, lsn=lsn@entry=18446744073709551615)
      buf_flush_list (max_n=2000, lsn=lsn@entry=18446744073709551615)
      buf_flush_buffer_pool ()
      logs_empty_and_mark_files_at_shutdown ()
      innodb_shutdown ()
      innobase_end ()
      

      The system tablespace truncation had been invoked a little earlier in the same thread:

      fil_space_t::set_create_lsn (this=0x71100001b080, lsn=58443)
      mtr_t::commit_shrink (...)
      fsp_system_tablespace_truncate ()
      innobase_end () at /mariadb/11.4/storage/innobase/handler/ha_innodb.cc:4269
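
      The interaction described above can be modelled in isolation. The following is a minimal sketch, not InnoDB code: Space and Page are hypothetical stand-ins for fil_space_t and buf_page_t, page_is_written() mirrors only the quoted lsn < space->get_create_lsn() check, and it assumes that lsn there is the LSN of the page's latest modification. The page LSN used in the usage note is invented for illustration; only the value 58443 comes from the trace.

```cpp
#include <cassert>
#include <cstdint>

using lsn_t = std::uint64_t;

// Hypothetical stand-in for fil_space_t; only the creation LSN is modelled.
struct Space {
  lsn_t create_lsn = 0;
  void set_create_lsn(lsn_t lsn) { create_lsn = lsn; }
  lsn_t get_create_lsn() const { return create_lsn; }
};

// Hypothetical stand-in for a dirty page waiting in the flush list.
struct Page {
  std::uint32_t page_no;
  lsn_t lsn;  // assumed: LSN of the latest modification of the page
};

// Mirrors only the quoted condition: a page whose LSN is older than the
// creation LSN of its tablespace is treated as freed, and its write is
// skipped. Returns true if the page would be written, false if discarded.
bool page_is_written(const Page &page, const Space &space) {
  if (page.lsn < space.get_create_lsn())
    return false;  // corresponds to "goto freed"
  return true;
}

// Models the order of events seen in the trace: the shrink raises the
// creation LSN first, and the final flush checks the page afterwards.
bool write_survives_shrink(lsn_t page_lsn, lsn_t shrink_lsn) {
  Space system_tablespace;
  Page undo_page{50, page_lsn};
  system_tablespace.set_create_lsn(shrink_lsn);  // as in mtr_t::commit_shrink
  return page_is_written(undo_page, system_tablespace);
}
```

      In this model, write_survives_shrink(50000, 58443) returns false: a dirty page last modified before the shrink is dropped by the final flush, which matches the lost write of undo page 50. A page modified at or after LSN 58443 would still be written.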
      

            People

              marko Marko Mäkelä
              marko Marko Mäkelä