MariaDB Server / MDEV-34216

Possible corruption when shrinking the system tablespace on innodb_fast_shutdown=0

Details

    Description

      This bug was found while working on MDEV-34212.

      Some InnoDB tests, most notably innodb.table_flags,64k, would occasionally fail. I can reproduce this locally on a MemorySanitizer build, but only some of the time. The attached file data.tar.xz is a copy of the data directory of a failed test in a 10.4-based branch that contains a fix of MDEV-34212.

      I was able to reproduce the problem with an even simpler test case:

      tf.test

      --source include/have_innodb.inc
      SELECT @@innodb_page_size;
      

      and the following options:

      tf.opt

      --innodb-undo-tablespaces=0 --innodb-page-size=64k --innodb-buffer-pool-size=20m
      

      I was able to produce an rr replay by repeatedly running the following on a MemorySanitizer build:

      while ./mtr --boot-rr --parallel=60 innodb.tf{,,,,,,,,,}{,}{,,}; do :; done
      

      From the rr replay trace, I concluded that the last write of undo page 50 is being discarded because of a condition in buf_page_t::flush():

        if (UNIV_UNLIKELY(lsn < space->get_create_lsn()))
        {
          ut_ad(space->purpose == FIL_TYPE_TABLESPACE);
          goto freed;
        }
      

      The bug seems to be a conflict between the supposedly final buffer pool flushing and fsp_system_tablespace_truncate(). Here are some stack traces, from the same thread:

      buf_page_t::flush()
      buf_do_flush_list_batch (max_n=2000, lsn=18446744073709551615)
      buf_flush_list_holding_mutex (max_n=max_n@entry=2000, lsn=lsn@entry=18446744073709551615)
      buf_flush_list (max_n=2000, lsn=lsn@entry=18446744073709551615)
      buf_flush_buffer_pool ()
      logs_empty_and_mark_files_at_shutdown ()
      innodb_shutdown ()
      innobase_end ()
      

      The system tablespace truncation had been invoked a little earlier in the same thread:

      fil_space_t::set_create_lsn (this=0x71100001b080, lsn=58443)
      mtr_t::commit_shrink (...)
      fsp_system_tablespace_truncate ()
      innobase_end () at /mariadb/11.4/storage/innobase/handler/ha_innodb.cc:4269
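
      The interaction between the two traces can be illustrated with a toy model. This is only a sketch: ToySpace, write_is_discarded() and after_shrink() are hypothetical stand-ins rather than InnoDB code, the page LSN of 50000 is a made-up value, and it is assumed that the lsn in the quoted check is the LSN of the page being flushed. Only the value 58443 is taken from the trace above.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins for illustration; these are NOT the real InnoDB types.
using lsn_t = std::uint64_t;

struct ToySpace {
  lsn_t create_lsn;  // logical time when the tablespace was created or rewritten
};

// Same shape as the check quoted from buf_page_t::flush(): a page whose LSN
// predates create_lsn is treated as freed, so its write is skipped.
constexpr bool write_is_discarded(lsn_t page_lsn, const ToySpace &space) {
  return page_lsn < space.create_lsn;
}

// Models what the second trace shows: shrinking the tablespace assigns the
// commit LSN of the shrinking mini-transaction to create_lsn.
constexpr ToySpace after_shrink(ToySpace space, lsn_t commit_lsn) {
  space.create_lsn = commit_lsn;
  return space;
}

// Hypothetical dirty undo page, last modified at LSN 50000 (made-up value).
constexpr lsn_t page_lsn = 50000;
// System tablespace before the shrink; create_lsn has never been set.
constexpr ToySpace sys_space{0};

static_assert(!write_is_discarded(page_lsn, sys_space),
              "before the shrink, the page would be written");
static_assert(write_is_discarded(page_lsn, after_shrink(sys_space, 58443)),
              "after the shrink, the same page is skipped");
```

      In this model, any page that is still dirty when the shrink commits and whose LSN is below the commit LSN is silently skipped by the final flush, which matches the lost write of page 50.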
      

          Activity

            thiru suggested the following fix:

            diff --git a/storage/innobase/mtr/mtr0mtr.cc b/storage/innobase/mtr/mtr0mtr.cc
            index 90a2007a48d..8db52ac1f47 100644
            --- a/storage/innobase/mtr/mtr0mtr.cc
            +++ b/storage/innobase/mtr/mtr0mtr.cc
            @@ -583,8 +583,9 @@ void mtr_t::commit_shrink(fil_space_t &space, uint32_t size)
             
               if (space.id == TRX_SYS_SPACE)
                 srv_sys_space.set_last_file_size(file->size);
            +  else
            +    space.set_create_lsn(m_commit_lsn);
             
            -  space.set_create_lsn(m_commit_lsn);
               mysql_mutex_unlock(&fil_system.mutex);
             
               space.clear_freed_ranges();
            

            This is obvious and should have been caught in the MDEV-14795 review already.

            The purpose of fil_space_t::create_lsn is to denote the logical time when the entire tablespace was rewritten or created. It is supposed to be used on the undo tablespaces only. The system tablespace is never re-created, but its size may be reduced.
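
            The effect of the patch can be sketched with a small model. The names below (ToySpace, TOY_SYS_SPACE, shrink_unpatched(), shrink_patched()) are hypothetical and only mimic the branch in the diff; tablespace ID 1 stands for some undo tablespace and is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins for illustration; these are NOT the real InnoDB types.
using lsn_t = std::uint64_t;

// Stand-in for TRX_SYS_SPACE, the ID of the system tablespace.
constexpr std::uint32_t TOY_SYS_SPACE = 0;

struct ToySpace {
  std::uint32_t id;
  lsn_t create_lsn;  // logical time when the tablespace was created or rewritten
};

// Behaviour before the patch: every shrink advances create_lsn.
constexpr ToySpace shrink_unpatched(ToySpace space, lsn_t commit_lsn) {
  space.create_lsn = commit_lsn;
  return space;
}

// Behaviour after the patch: the system tablespace keeps its create_lsn,
// because it is only reduced in size and not rewritten.
constexpr ToySpace shrink_patched(ToySpace space, lsn_t commit_lsn) {
  if (space.id != TOY_SYS_SPACE)
    space.create_lsn = commit_lsn;
  return space;
}

constexpr ToySpace sys_space{TOY_SYS_SPACE, 0};
constexpr ToySpace undo_space{1, 0};  // hypothetical undo tablespace

static_assert(shrink_unpatched(sys_space, 58443).create_lsn == 58443,
              "unpatched: the system tablespace gets a new create_lsn");
static_assert(shrink_patched(sys_space, 58443).create_lsn == 0,
              "patched: the system tablespace is left alone");
static_assert(shrink_patched(undo_space, 58443).create_lsn == 58443,
              "patched: other tablespaces still get the commit LSN");
```

            With create_lsn left unchanged for the system tablespace, the comparison lsn < space->get_create_lsn() in buf_page_t::flush() can no longer become true for its pages as a side effect of shrinking.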

            With this patch, my MSAN-based test does not fail. I will let it run for a few more minutes, because the failure was sporadic.

            Comment by Marko Mäkelä (marko).

            A test for the above patch has been running on my 11.4-based branch since I wrote the previous message: 60 tests in parallel, 16 seconds per test batch because each batch includes creating the test harness from scratch. It used to fail within the first 10 attempts. By now there must have been more than 70 test rounds.

            Comment by Marko Mäkelä (marko).

            People

              marko Marko Mäkelä
              marko Marko Mäkelä