Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-19348

MariaBackup prepare fails with InnoDB: Database page corruption on disk or a failed file read

Details

    Description

      Note: Run the test with --repeat=N in addition to options mentioned below. N=5 is usually enough for me, but it can vary on different machines and builds. IMPORTANT: I didn't get the failure on an ASAN build within a limited amount of attempts, while non-ASAN debug and non-debug builds were failing rather readily. Possibly dynamics in ASAN builds are different.

      10.2 4e01bc8c

      ...
      2019-04-26 22:27:27 139757398472128 [Note] InnoDB: Ignoring data file './test/#sql-7c3a_9.ibd' with space ID 14. Another data file called test/t6.ibd exists with the same space ID.
      2019-04-26 22:27:27 139757398472128 [Note] InnoDB: Starting final batch to recover 361 pages from redo log.
      2019-04-26 22:27:27 139757197797120 [ERROR] InnoDB: Database page corruption on disk or a failed file read of tablespace test/t6 page [page id: space=14, page number=3]. You may have to recover from a backup.
      2019-04-26 22:27:27 139757197797120 [Note] InnoDB: Page dump in ascii and hex (4096 bytes):
      ...
      InnoDB: End of page dump
      2019-04-26 22:27:27 139757197797120 [Note] InnoDB: Uncompressed page, stored checksum in field1 3070644445, calculated checksums for field1: crc32 75358379, innodb 887337880,  page type 17855 == INDEX.none 3735928559, stored checksum in field2 3070644445, calculated checksums for field2: crc32 75358379, innodb 2072860611, none 3735928559,  page LSN 0 1102123, low 4 bytes of LSN at page end 1102123, page number (if stored to page already) 3, space id (if created with >= MySQL-4.1.1 and stored already) 14
      2019-04-26 22:27:27 139757197797120 [Note] InnoDB: Page may be an index page where index id is 9896656056595753379
      2019-04-26 22:27:27 139757197797120 [Note] InnoDB: It is also possible that your operating system has corrupted its own file cache and rebooting your computer removes the error. If the corrupt page is an index page. You can also try to fix the corruption by dumping, dropping, and reimporting the corrupt table. You can use CHECK TABLE to scan your table for corruption. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
      2019-04-26 22:27:27 139757197797120 [ERROR] InnoDB: Failed to read file 'test/t6.ibd' at offset 3: Page read from tablespace is corrupted.
      2019-04-26 22:27:27 139757398472128 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
      2019-04-26 22:27:27 139757398472128 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[2228] with error Data structure corruption
      [00] FATAL ERROR: 2019-04-26 22:27:28 mariabackup: innodb_init() returned 39 (Data structure corruption).
      

      Put the test case below under suite/mariabackup (so that it could find mariabackup) and run with extra options:

      --mysqld=--innodb-encrypt-tables --mysqld=--innodb-encryption-threads=1  --mysqld=--file-key-management --mysqld=--file-key-management-filename=`pwd`/std_data/keys.txt --mysqld=--plugin-load-add=file_key_management --mysqld=--innodb-page-size=4K  --mysqld=--innodb_change_buffering=none --force-restart --repeat=10
      

      innodb_page_size=4K is not strictly necessary, but it seems to make failure more frequent (could be a coincidence);

      --source include/have_innodb.inc
      SET @stats.persistent= @@innodb_stats_persistent;
      SET GLOBAL innodb_stats_persistent= ON;
       
      CREATE TABLE t1 ( 
        pk SERIAL,
        PRIMARY KEY(pk)
      ) ENGINE=InnoDB;
       
      CREATE TABLE t2 ( 
        pk INT,
        PRIMARY KEY(pk)
      ) ENGINE=InnoDB;
       
      CREATE TABLE t3 ( 
        col_year YEAR NOT NULL DEFAULT '1970',
        pk INT,
        PRIMARY KEY(pk)
      ) ENGINE=InnoDB;
      ALTER TABLE t3 ADD INDEX(col_year);
       
      CREATE TABLE t4 ( 
        col_timestamp TIMESTAMP(5) NULL,
        pk INT,
        col_varchar VARCHAR(27) NULL,
        PRIMARY KEY(pk)
      ) ENGINE=InnoDB;
      ALTER TABLE t4 ADD INDEX(col_varchar(27));
       
      CREATE TABLE t5 ( 
        col_year YEAR NULL,
        col_int SMALLINT(35) NULL,
        col_varchar VARCHAR(2817) NULL,
        col_blob TEXT NULL,
        col_time TIME NOT NULL DEFAULT '00:00:00',
        col_timestamp TIMESTAMP(6) NOT NULL DEFAULT '1971-01-01 00:00:00',
        col_date DATE NULL,
        col_char CHAR(170) NOT NULL DEFAULT '',
        col_dec DECIMAL(16,0) UNSIGNED NOT NULL DEFAULT 0,
        pk INT,
        col_datetime DATETIME(5) NULL,
        col_enum ENUM('','a','b','c','d','e','f','foo','bar') NULL,
        PRIMARY KEY(pk)
      ) ENGINE=InnoDB;
      ALTER TABLE t5 ADD INDEX(col_dec);
      ALTER TABLE t5 ADD UNIQUE(col_date);
      ALTER TABLE t5 ADD INDEX(pk);
      ALTER TABLE t5 ADD INDEX(col_time);
       
      CREATE TABLE t9 ( 
        col_year YEAR NOT NULL DEFAULT '1970',
        col_timestamp TIMESTAMP(1) NULL,
        col_date DATE NULL,
        col_enum SET('','a','b','c','d','e','f','foo','bar') NULL,
        col_int SMALLINT(13) NULL,
        col_varchar VARCHAR(3450) NOT NULL DEFAULT '',
        col_time TIME(6) NULL,
        pk INT,
        col_datetime DATETIME(4) NULL,
        col_char BINARY(141) NOT NULL DEFAULT '',
        col_bit BIT(17) NOT NULL DEFAULT 0,
        col_dec DECIMAL(31,0) UNSIGNED ZEROFILL NULL,
        col_blob BLOB NOT NULL DEFAULT '',
        PRIMARY KEY(pk)
      ) ENGINE=InnoDB;
      ALTER TABLE t9 ADD INDEX(col_int);
      ALTER TABLE t9 ADD INDEX(col_char(64));
      ALTER TABLE t9 ADD INDEX(pk);
      ALTER TABLE t9 ADD UNIQUE(col_bit);
       
      CREATE OR REPLACE TABLE t4 (a INT) ENGINE=InnoDB;
      CREATE OR REPLACE TABLE t6 (b INT) ENGINE=InnoDB;
      INSERT INTO t4 () VALUES ();
      INSERT INTO t6 () VALUES (),(),(),();
       
      ALTER TABLE t5 ORDER BY tscol7;
      ALTER TABLE t9 FORCE;
      OPTIMIZE TABLE t6;
       
      --echo # Running backup...
      --exec $XTRABACKUP --backup --target-dir=$MYSQL_TMP_DIR/backup --protocol=tcp --port=$MASTER_MYPORT --user=root > $MYSQL_TMP_DIR/backup.log 2>&1
      --exec cp -r $MYSQL_TMP_DIR/backup $MYSQL_TMP_DIR/backup_before_prepare
      --echo # Running prepare...
      --exec sh -c " if ! $XTRABACKUP --prepare --target-dir=$MYSQL_TMP_DIR/backup --user=root > $MYSQL_TMP_DIR/prepare.log 2>&1 || grep 'Data structure corruption' $MYSQL_TMP_DIR/prepare.log ; then cat $MYSQL_TMP_DIR/prepare.log ; exit 1 ; fi"
      --echo # Prepare succeeded
       
      DROP TABLE t1, t2, t3, t4, t5, t6, t9;
      --rmdir $MYSQL_TMP_DIR/backup
      --rmdir $MYSQL_TMP_DIR/backup_before_prepare
       
      SET GLOBAL innodb_stats_persistent= @stats.persistent;
      

      Once a "problematic" backup is produced, the failure on prepare seems to be reliably reproducible (thus the test stores the backup directory before running prepare).

      If you can't reproduce the failure with the test case, I can provide the backup data.

      Attachments

        Issue Links

          Activity

            It looks pretty much like MDEV-18193, I don't remember why it was filed separately.
            See also an alternative RQG-based test in MDEV-18193, if the provided MTR is not enough.

            elenst Elena Stepanova added a comment - It looks pretty much like MDEV-18193 , I don't remember why it was filed separately. See also an alternative RQG-based test in MDEV-18193 , if the provided MTR is not enough.

            I can able to repeat the issue without mariabackup. (kill and start the server) MLOG_INDEX_LOAD ensures that only index pages are flushed to disk. It doesn't ensure about page 0, page 1, page 2. After the restart, page 0 doesn't have crypt info. It leads to failure of decryption of page 3. Workaround can be innodb_log_optimize_ddl = 0 as of now.

            thiru Thirunarayanan Balathandayuthapani added a comment - I can able to repeat the issue without mariabackup. (kill and start the server) MLOG_INDEX_LOAD ensures that only index pages are flushed to disk. It doesn't ensure about page 0, page 1, page 2. After the restart, page 0 doesn't have crypt info. It leads to failure of decryption of page 3. Workaround can be innodb_log_optimize_ddl = 0 as of now.

            Another variation (to make it searchable):

            2019-08-01  8:04:33 0 [Note] InnoDB: Starting final batch to recover 240 pages from redo log.
            2019-08-01  8:04:33 0 [ERROR] InnoDB: Encrypted page [page id: space=502, page number=3] in file test/t6.ibd looks corrupted; key_version=4114489502
            2019-08-01  8:04:33 0 [ERROR] InnoDB: Failed to read file 'test/t6.ibd' at offset 3: Table is encrypted but decrypt failed.
            2019-08-01  8:04:33 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
            2019-08-01  8:04:33 0 [ERROR] InnoDB: Plugin initialization aborted with error Data structure corruption
            [00] FATAL ERROR: 2019-08-01 08:04:33 mariabackup: innodb_init() returned 39 (Data structure corruption).
            

            elenst Elena Stepanova added a comment - Another variation (to make it searchable): 2019-08-01 8:04:33 0 [Note] InnoDB: Starting final batch to recover 240 pages from redo log. 2019-08-01 8:04:33 0 [ERROR] InnoDB: Encrypted page [page id: space=502, page number=3] in file test/t6.ibd looks corrupted; key_version=4114489502 2019-08-01 8:04:33 0 [ERROR] InnoDB: Failed to read file 'test/t6.ibd' at offset 3: Table is encrypted but decrypt failed. 2019-08-01 8:04:33 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages. 2019-08-01 8:04:33 0 [ERROR] InnoDB: Plugin initialization aborted with error Data structure corruption [00] FATAL ERROR: 2019-08-01 08:04:33 mariabackup: innodb_init() returned 39 (Data structure corruption).

            I think that this needs a little more work.
            Could you also please try to spend the effort to create a test case?

            As far as I understand, innodb_log_optimize_ddl=OFF is a viable workaround.

            marko Marko Mäkelä added a comment - I think that this needs a little more work. Could you also please try to spend the effort to create a test case? As far as I understand, innodb_log_optimize_ddl=OFF is a viable workaround.
            kavehazizi Kaveh Azizi added a comment -

            Hi all,
            Could you please let me know if there is any remedy for this. I am planning on a upgrade and it is important if i can roll back my db or not?
            Thanks.

            kavehazizi Kaveh Azizi added a comment - Hi all, Could you please let me know if there is any remedy for this. I am planning on a upgrade and it is important if i can roll back my db or not? Thanks.
            danblack Daniel Black added a comment -

            > Could you please let me know if there is any remedy for this.

            Note the fixed versions.

            > I am planning on a upgrade and it is important if i can roll back my db or not?

            Test your backups before you need them.

            kavehazizi this JIRA is bug reporting area for server bugs that need to be fixed. It can't work as a general support forum.

            danblack Daniel Black added a comment - > Could you please let me know if there is any remedy for this. Note the fixed versions. > I am planning on a upgrade and it is important if i can roll back my db or not? Test your backups before you need them. kavehazizi this JIRA is bug reporting area for server bugs that need to be fixed. It can't work as a general support forum.

            People

              thiru Thirunarayanan Balathandayuthapani
              elenst Elena Stepanova
              Votes:
              1 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.