Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25854

Restoring a backup may result in garbage intermediate tables from DDL

Details

    Description

      MDEV-17567 and MDEV-25180 changed something in InnoDB startup. We forgot to adjust Mariabackup accordingly.

      Specifically, the command mariabackup --prepare --export is supposed to purge the history of old transactions, and also apply all logs.

      Starting with MDEV-25180, InnoDB no longer treats specially any tables whose names start with #sql. The only exception are table names that start with #sql-ib. It is the responsibility of the SQL layer's DDL log recovery to drop any such tables, in case the server was killed (or a backup was made) while a table-rebuilding DDL operation was in progress.

      The following patch proves that a regression was introduced. If the function (which would start the InnoDB purge of transaction history) would be invoked during mariabackup --prepare --export, the execution would be aborted due to assertion failure.

      diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
      index db5c8524776..73f7a1aa248 100644
      --- a/storage/innobase/handler/ha_innodb.cc
      +++ b/storage/innobase/handler/ha_innodb.cc
      @@ -1954,6 +1954,7 @@ static int innodb_check_version(handlerton *hton, const char *path,
       static void innodb_ddl_recovery_done(handlerton*)
       {
         ut_ad(!ddl_recovery_done);
      +  ut_a(srv_operation != SRV_OPERATION_RESTORE_EXPORT);
         ut_d(ddl_recovery_done= true);
         if (!srv_read_only_mode && srv_operation == SRV_OPERATION_NORMAL &&
             srv_force_recovery < SRV_FORCE_NO_BACKGROUND)
      

      We must ensure that ha_signal_ddl_recovery_done() will be invoked. Should it for some reason be unacceptable to invoke ddl_log_execute_recovery() in mariabackup --prepare --export, then we would have to invoke special code that would drop all #sql- tables from InnoDB, before executing ha_signal_ddl_recovery_done(). In any case, we must somehow compensate for the change that was made in MDEV-25666.

      Note: Nothing must change in mariabackup outside --prepare --export. We do not want to ruin incremental backups by executing actions that would generate InnoDB redo log records.

      Attachments

        Issue Links

          Activity

            MDEV-24184 was filed previously for unifying the handling of file creation, deletion, and rename operations during backup, with the handling that takes place during crash recovery.

            I think that a long term goal should be to remove the need to run mariabackup --prepare, that is, to allow the normal server recovery to work on files generated by mariabackup --backup. There could be some special server options to automatically shut down after running steps that were part of mariabackup --prepare.

            marko Marko Mäkelä added a comment - MDEV-24184 was filed previously for unifying the handling of file creation, deletion, and rename operations during backup, with the handling that takes place during crash recovery. I think that a long term goal should be to remove the need to run mariabackup --prepare , that is, to allow the normal server recovery to work on files generated by mariabackup --backup . There could be some special server options to automatically shut down after running steps that were part of mariabackup --prepare .
            monty Michael Widenius added a comment - - edited

            In 10.6 we have a log of all ddl's (ddl.log). It would be trivia to add all #sql- tables created to it in ALTER TABLE.
            When we have this, the change to mariabackup would be:

            • Don't copy any files beginning with #sql-
            • Before doing BACKUP STAGE end, copy also the ddl_recovery.log file
            • When doing finalize of the backup (--prepare), drop all #sql- tables found in ddl.log

            I have verified with Marko that it should work (or should be trivial to achieve) that we can drop InnoDB tables even if the .idb file does not exists (which means we don't have to copy the #sql- files)

            monty Michael Widenius added a comment - - edited In 10.6 we have a log of all ddl's (ddl.log). It would be trivia to add all #sql- tables created to it in ALTER TABLE. When we have this, the change to mariabackup would be: Don't copy any files beginning with #sql- Before doing BACKUP STAGE end, copy also the ddl_recovery.log file When doing finalize of the backup (--prepare), drop all #sql- tables found in ddl.log I have verified with Marko that it should work (or should be trivial to achieve) that we can drop InnoDB tables even if the .idb file does not exists (which means we don't have to copy the #sql- files)

            I think that files with names #sql-ib*.ibd must always be copied. After MDEV-25506 part 3, those files always are associated with uncommitted or unfinished transactions.

            Furthermore, if we choose not to copy other #sql-*.ibd files, I do not think that incremental backups can work. If my understanding is correct, incremental backup could work if we always backed up the latest ddl_recovery.log, and on mariabackup --prepare never applied that log. The only exceptions should be partial backups (which may choose to omit any files) and mariabackup --prepare --export (which I think needs to apply the ddl_recovery.log). Normally, after restoring a backup, the ddl_recovery.log should be applied by server startup.

            Note: I do not know how exactly backup locks work, and I now realize that if the effects of invoking purge_sys.resume_SYS() become visible in backup, then the MDEV-25180 Atomic ALTER TABLE recovery may work incorrectly on the backup. If any calls to purge_sys.resume_SYS() are blocked at the final phase of the backup (which finishes copying the redo log and determines the final LSN), then there should be no problem related to that.

            marko Marko Mäkelä added a comment - I think that files with names #sql-ib*.ibd must always be copied. After MDEV-25506 part 3, those files always are associated with uncommitted or unfinished transactions. Furthermore, if we choose not to copy other #sql-*.ibd files, I do not think that incremental backups can work. If my understanding is correct, incremental backup could work if we always backed up the latest ddl_recovery.log , and on mariabackup --prepare never applied that log. The only exceptions should be partial backups (which may choose to omit any files) and mariabackup --prepare --export (which I think needs to apply the ddl_recovery.log ). Normally, after restoring a backup, the ddl_recovery.log should be applied by server startup. Note: I do not know how exactly backup locks work, and I now realize that if the effects of invoking purge_sys.resume_SYS() become visible in backup, then the MDEV-25180 Atomic ALTER TABLE recovery may work incorrectly on the backup. If any calls to purge_sys.resume_SYS() are blocked at the final phase of the backup (which finishes copying the redo log and determines the final LSN), then there should be no problem related to that.

            wlad, I am assigning this to you because you are working on a solution.

            The current solution would be roughly as follows:

            1. We will rely on backup locks, so that any commit of almost-finished DDL will be blocked.
            2. Mariabackup would not copy any #sql* files, because they are not protected by MDL.
            3. On InnoDB startup, if we determine that we started from a restored backup (the size of ib_logfile0 was 0 bytes), in innodb_ddl_recovery_done() (or earlier, before starting purge and the FTS subsystem) we would invoke code that would drop any #sql tables from the InnoDB internal data dictionary.
            marko Marko Mäkelä added a comment - wlad , I am assigning this to you because you are working on a solution. The current solution would be roughly as follows: We will rely on backup locks, so that any commit of almost-finished DDL will be blocked. Mariabackup would not copy any #sql* files, because they are not protected by MDL. On InnoDB startup, if we determine that we started from a restored backup (the size of ib_logfile0 was 0 bytes), in innodb_ddl_recovery_done() (or earlier, before starting purge and the FTS subsystem) we would invoke code that would drop any #sql tables from the InnoDB internal data dictionary.

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.