Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33553

mariabackup restore corrupts InnoDB tablespace IDs

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.11.7
    • 10.11
    • AlmaLinux 8.9

      Linux dbase05-west.meetmax.com 4.18.0-513.11.1.el8_9.x86_64 #1 SMP Wed Jan 17 02:00:40 EST 2024 x86_64 x86_64 x86_64 GNU/Linux

    Description

      Restoring mariabackup backups into a different database on the same server where the backup was taken results in InnoDB tablespace ID corruption within the restored database.

      All database and table names being used have been anonymized.

      mariabackup backup command:

      /usr/bin/stdbuf -o 10MB /bin/mariabackup --backup --slave-info --parallel=8 --user=root --target-dir=<backup dir>
      

      The files are rsynced to a new directory so the backup can be reused.

      mariabackup prepare command:

      /usr/bin/mariabackup --prepare --export --target-dir=<new dir>
      

      To do the restore, we perform the following actions:
      1. The target database is dropped and recreated.

      /usr/bin/mariadb-admin -f drop -h localhost our_database_staging --ssl
      /usr/bin/mariadb-admin create -h localhost our_database_staging --ssl
      

      2. The schema is recreated.

      /usr/bin/mariadb-dump --routines --skip-triggers --no-create-info --no-data --no-create-db --skip-opt -h localhost our_database_production --ssl | /usr/bin/mariadb --force -h localhost our_database_staging --ssl
      /usr/bin/mariadb-dump -C -d --skip-lock-tables -h localhost our_database_production --ssl | /usr/bin/mariadb --force -h localhost our_database_staging --ssl
      

      3. The tablespaces are discarded using a SQL file with an `ALTER TABLE ... DISCARD TABLESPACE;` line for each table.

      /usr/bin/mariadb -f -h localhost our_database_staging --ssl < /tmp/mariabackup-restore/discard_tablespace.sql
      

      4. All of the prepared backup files (*.idb, *.cfg, *.frm) are put into place with rsync.

      rsync -ogrvt /source/restore-prep/base/our_database_production/ /var/lib/mysql/our_database_staging/
      

      5. The tablespaces are then imported using one file per table each containing the relevant `ALTER TABLE ... IMPORT TABLESPACE;` line. https://mariadb.com/kb/en/innodb-file-per-table-tablespaces/#importing-transportable-tablespaces-for-non-partitioned-tables

      /usr/bin/mariadb -f -h localhost our_database_staging --ssl < /tmp/mariabackup-restore/import_tablespace-table_one.sql
      /usr/bin/mariadb -f -h localhost our_database_staging --ssl < /tmp/mariabackup-restore/import_tablespace-table_two.sql
      /usr/bin/mariadb -f -h localhost our_database_staging --ssl < /tmp/mariabackup-restore/import_tablespace-table_three.sql
      ...
      /usr/bin/mariadb -f -h localhost our_database_staging --ssl < /tmp/mariabackup-restore/import_tablespace-final_table.sql
      

      No errors are reported during the backup, prepare or restore.

      No errors are reported by `mariadb-check our_database_staging`; however, errors do show up in the journal (see mariadb-check.txt and mariadb_2024-02-27.txt, trimmed for size and modified to hide database/table names). Example errror: `[ERROR] InnoDB: Space id and page no stored in the page, read in are [page id: space=3418816455, page number=34359], should be [page id: space=185601, page number=34359]`

      Querying `SELECT * FROM information_schema.INNODB_SYS_TABLES where name like 'our_database_staging/%';` shows the same tablespace IDs as the "should be" information reported in the journal.

      Attachments

        1. mariadb_2024-02-27.txt
          114 kB
        2. mariadb-check.txt
          0.6 kB
        3. my.cnf
          2 kB

        Activity

          mmusgrove Matthew Musgrove added a comment - - edited

          I have excluded the .frm files from the rsync and retested (--exclude '*.frm'). It made no difference; the "InnoDB: Space id and page no stored in the page..." errors are still logged when running mariadb-check on the newly restored database.

          mmusgrove Matthew Musgrove added a comment - - edited I have excluded the .frm files from the rsync and retested (--exclude '*.frm'). It made no difference; the "InnoDB: Space id and page no stored in the page..." errors are still logged when running mariadb-check on the newly restored database.

          Sorry, I had forgotten about this bug. Is this issue still repeatable with a newer release? MDEV-33512 was fixed in MariaDB Server 10.11.8.

          marko Marko Mäkelä added a comment - Sorry, I had forgotten about this bug. Is this issue still repeatable with a newer release? MDEV-33512 was fixed in MariaDB Server 10.11.8.

          I started testing this morning in some VMs. It's looking promising. I want to retest on the server that originally exhibited the issue tomorrow. I'll keep you posted on my testing efforts.

          mmusgrove Matthew Musgrove added a comment - I started testing this morning in some VMs. It's looking promising. I want to retest on the server that originally exhibited the issue tomorrow. I'll keep you posted on my testing efforts.

          When I run it in production, It doesn't look correct.

          Server version: 10.11.10-MariaDB-log MariaDB Server

          If I tail the journal while doing the imports, I see one line per table that looks something like this:
          Dec 19 10:16:29 example.example.com mariadbd[2478]: 2024-12-19 10:16:29 74489 [Note] InnoDB: trying to read page [page id: space=187144, page number=106149] in nonexisting or being-dropped tablespace

          After the import, if I tail the journal while running mariadb-check on the newly restored database, I see one line per table that looks something like this:
          Dec 19 10:36:39 example.example.com mariadbd[2478]: 2024-12-19 10:36:39 0 [ERROR] InnoDB: Space id and page no stored in the page, read in from ./db_name/table_name.ibd are [page id: space=3972460651, page number=1343], should be [page id: space=187313, page number=1343]

          mmusgrove Matthew Musgrove added a comment - When I run it in production, It doesn't look correct. Server version: 10.11.10-MariaDB-log MariaDB Server If I tail the journal while doing the imports, I see one line per table that looks something like this: Dec 19 10:16:29 example.example.com mariadbd [2478] : 2024-12-19 10:16:29 74489 [Note] InnoDB: trying to read page [page id: space=187144, page number=106149] in nonexisting or being-dropped tablespace After the import, if I tail the journal while running mariadb-check on the newly restored database, I see one line per table that looks something like this: Dec 19 10:36:39 example.example.com mariadbd [2478] : 2024-12-19 10:36:39 0 [ERROR] InnoDB: Space id and page no stored in the page, read in from ./db_name/table_name.ibd are [page id: space=3972460651, page number=1343] , should be [page id: space=187313, page number=1343]

          Thank you. The page number is not encrypted, but the tablespace ID is. It looks like something in the I/O layer is unaware that the tablespace should actually be encrypted and needs to be decrypted before validating the tablespace ID.

          marko Marko Mäkelä added a comment - Thank you. The page number is not encrypted, but the tablespace ID is. It looks like something in the I/O layer is unaware that the tablespace should actually be encrypted and needs to be decrypted before validating the tablespace ID.

          People

            thiru Thirunarayanan Balathandayuthapani
            mmusgrove Matthew Musgrove
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.