Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-28779

ALTER TABLE IMPORT TABLESPACE fails with Data structure corruption

Details

    Description

      Attached is the MTR test that demonstrates the bug. Also a patch with a proposed fix.

      The problem is that during ALTER TABLE IMPORT TABLESPACE two things happen:

      • Regardless of whether the tablespace is encrypted, the current LSN (8 bytes) gets written at FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION
      • Then if the page is encrypted the first 4 bytes at that offset are being interpreted as key_version while the remainder as the CRC32 checksum. If the key_version is non-zero, CRC32 check of the page is performed (fil0import.cc:3599) . So if the LSN >= 1 << 32 we perform the check, which will most likely fail as we are comparing the lowest 4 bytes of the LSN against the computed CRC-32 checksum of the page thus resulting in the error message to the client.

      I believe the right thing to do is to not write the LSN at FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION for encrypted pages. At least it fixes the problem at hand.

      Attachments

        1. t2.opt
          0.1 kB
        2. t2.patch
          1 kB
        3. t2.test
          2 kB

        Issue Links

          Activity

            Thank you, spachev! I think that it is even simpler than that: the field should not be written at all. The history is:

            • On every page, InnoDB reserved 8 bytes at FIL_PAGE_FILE_FLUSH_LSN and excluded it from page checksums.
            • On shutdown, InnoDB used to update those 8 bytes in the first page of each file of the system tablespace, without using the doublewrite buffer or redo log. This risky write was removed (and this field lost all of its original meaning) in MDEV-27199.
            • Later, those 8 bytes were repurposed for various things, such as the SPATIAL INDEX R-tree "split sequence number" and the MariaDB encryption key ID. At that time, the logic was revised so that only the first page of the system tablespace would be updated on shutdown.
            • A conflict in the repurposing (inability to encrypt SPATIAL INDEX) motivated a file format change (innodb_checksum_algorithm=full_crc32, MDEV-12026).
            • In .ibd files, the field never had any special meaning.
            marko Marko Mäkelä added a comment - Thank you, spachev ! I think that it is even simpler than that: the field should not be written at all. The history is: On every page, InnoDB reserved 8 bytes at FIL_PAGE_FILE_FLUSH_LSN and excluded it from page checksums. On shutdown, InnoDB used to update those 8 bytes in the first page of each file of the system tablespace, without using the doublewrite buffer or redo log. This risky write was removed (and this field lost all of its original meaning) in MDEV-27199 . Later, those 8 bytes were repurposed for various things, such as the SPATIAL INDEX R-tree "split sequence number" and the MariaDB encryption key ID. At that time, the logic was revised so that only the first page of the system tablespace would be updated on shutdown. A conflict in the repurposing (inability to encrypt SPATIAL INDEX ) motivated a file format change ( innodb_checksum_algorithm=full_crc32 , MDEV-12026 ). In .ibd files, the field never had any special meaning.

            If I got it right, the time-consuming operations on the table db1.t1 in t2.test are necessary for advancing the LSN. I tried a different way of reproducing this, by extending an existing test that uses a special trick to advance the LSN:

            diff --git a/mysql-test/suite/mariabackup/huge_lsn.test b/mysql-test/suite/mariabackup/huge_lsn.test
            index 3349ef40df8..98aa73cec4f 100644
            --- a/mysql-test/suite/mariabackup/huge_lsn.test
            +++ b/mysql-test/suite/mariabackup/huge_lsn.test
            @@ -56,6 +56,12 @@ exec $XTRABACKUP  --prepare --target-dir=$targetdir;
             --source include/restart_and_restore.inc
             --enable_result_log
             SELECT * FROM t;
            +FLUSH TABLE t FOR EXPORT;
            +copy_file $_datadir/test/t.ibd $_datadir/test/t_copy.ibd;
            +UNLOCK TABLES;
            +ALTER TABLE t DISCARD TABLESPACE;
            +move_file $_datadir/test/t_copy.ibd $_datadir/test/t.ibd;
            +ALTER TABLE t IMPORT TABLESPACE;
             DROP TABLE t;
             rmdir $targetdir;
             let $targetdir= $targetdir_old;
            

            With or without the patch, the IMPORT TABLESPACE would crash:

            10.3 44ab6cba762e23a992e7c3bfd8f4319e56e08fe8

            mysqltest: At line 64: query 'ALTER TABLE t IMPORT TABLESPACE' failed: 2013: Lost connection to MySQL server during query
            mysqld: /mariadb/10.3/storage/innobase/fil/fil0crypt.cc:2696: bool fil_space_verify_crypt_checksum(const byte*, const page_size_t&): Assertion `mach_read_from_4(page + 26U)' failed.
            

            In the current 10.6, the same modification to the test would cause a different type of failure:

            10.6 892c426371b4be558d32fdeba7d1d56f46b40f2b

            mysqltest: At line 66: query 'ALTER TABLE t IMPORT TABLESPACE' failed: ER_NOT_KEYFILE (1034): Index for table 't' is corrupt; try to repair it
            

            I will try to fix that as well.

            marko Marko Mäkelä added a comment - If I got it right, the time-consuming operations on the table db1.t1 in t2.test are necessary for advancing the LSN. I tried a different way of reproducing this, by extending an existing test that uses a special trick to advance the LSN: diff --git a/mysql-test/suite/mariabackup/huge_lsn.test b/mysql-test/suite/mariabackup/huge_lsn.test index 3349ef40df8..98aa73cec4f 100644 --- a/mysql-test/suite/mariabackup/huge_lsn.test +++ b/mysql-test/suite/mariabackup/huge_lsn.test @@ -56,6 +56,12 @@ exec $XTRABACKUP --prepare --target-dir=$targetdir; --source include/restart_and_restore.inc --enable_result_log SELECT * FROM t; +FLUSH TABLE t FOR EXPORT; +copy_file $_datadir/test/t.ibd $_datadir/test/t_copy.ibd; +UNLOCK TABLES; +ALTER TABLE t DISCARD TABLESPACE; +move_file $_datadir/test/t_copy.ibd $_datadir/test/t.ibd; +ALTER TABLE t IMPORT TABLESPACE; DROP TABLE t; rmdir $targetdir; let $targetdir= $targetdir_old; With or without the patch, the IMPORT TABLESPACE would crash: 10.3 44ab6cba762e23a992e7c3bfd8f4319e56e08fe8 mysqltest: At line 64: query 'ALTER TABLE t IMPORT TABLESPACE' failed: 2013: Lost connection to MySQL server during query … mysqld: /mariadb/10.3/storage/innobase/fil/fil0crypt.cc:2696: bool fil_space_verify_crypt_checksum(const byte*, const page_size_t&): Assertion `mach_read_from_4(page + 26U)' failed. In the current 10.6, the same modification to the test would cause a different type of failure: 10.6 892c426371b4be558d32fdeba7d1d56f46b40f2b mysqltest: At line 66: query 'ALTER TABLE t IMPORT TABLESPACE' failed: ER_NOT_KEYFILE (1034): Index for table 't' is corrupt; try to repair it I will try to fix that as well.

            The problem with the above test is that DISCARD TABLESPACE forgets that the tablespace was default-encrypted. If I add explicit ENCRYPTED=YES to the CREATE TABLE statement, the file can be imported. Preserving the .cfg file did not help. This finally reproduced the problem as reported.

            marko Marko Mäkelä added a comment - The problem with the above test is that DISCARD TABLESPACE forgets that the tablespace was default-encrypted. If I add explicit ENCRYPTED=YES to the CREATE TABLE statement, the file can be imported. Preserving the .cfg file did not help. This finally reproduced the problem as reported.

            It turns out that as part of MDEV-21132 in 10.5.0 the function PageConverter::update_header() had already been rewritten to zero-initialize the field FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION in the first page.

            marko Marko Mäkelä added a comment - It turns out that as part of MDEV-21132 in 10.5.0 the function PageConverter::update_header() had already been rewritten to zero-initialize the field FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION in the first page.

            People

              marko Marko Mäkelä
              spachev Sasha Pachev
              Votes:
              1 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.