  1. MariaDB Server
  2. MDEV-39089

Incremental BACKUP SERVER based on copying changed pages

Details

    Description

      As noted in MDEV-39054, an incremental backup based on permanently enabling innodb_log_archive=ON (MDEV-37949) will be less efficient than mariadb-backup when there are lots of writes.

      Therefore, we do need an incremental InnoDB backup that is similar to what mariadb-backup currently supports. This involves extending the BACKUP SERVER statement as well as InnoDB startup so that changed page images will be copied and applied.

      Instead of introducing a whole new format like the .delta files that mariadb-backup is creating, I think that we should rely on sparse files. They work in any modern file system, thanks to their Unix and POSIX heritage.

      Not for the system tablespace

      To keep things simple, it might not make any sense to apply incremental backup to the system tablespace, which can be configured to be in multiple files of arbitrary names. Ever since MDEV-29983 and MDEV-29986 were implemented, the system tablespace should be very small. This limitation should therefore be acceptable.

      Changed pages in sparse files

      For undo tablespace files (undo001 to undo127) as well as for the schemaname/tablename.ibd files, we could use the .ibi file name suffix (InnoDB incremental).

      The .ibi file would always start with a copy of the first page of the corresponding undo* or .ibd file, so that the file can be identified reliably.

      If the entire file has been created or rewritten since the end LSN of the previous backup, then the .ibi file will be identical to the corresponding undo* or .ibd file.

      For any data page whose FIL_PAGE_LSN is less than the end LSN of the previous backup, the .ibi file will contain a hole. For example, if only page 3 of t1.ibd has been modified since the previous backup, then the file t1.ibi will contain recent copies of pages 0 and 3, and everything else will be a hole. In this way, the file t1.ibi will consume 2 physical pages of storage, plus some metadata to identify it as a sparse file.
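
      The page selection described above can be sketched as follows. This is only an illustration, not InnoDB code: the function name write_ibi and the fixed 16 KiB page size are assumptions, and whether the skipped ranges become real holes depends on the file system. The offset 16 is where InnoDB stores the 8-byte big-endian FIL_PAGE_LSN in the page header.

```python
import os
import struct

PAGE_SIZE = 16384   # assumed: the default innodb_page_size
FIL_PAGE_LSN = 16   # byte offset of the 8-byte big-endian page LSN


def write_ibi(src_path, ibi_path, prev_end_lsn, page_size=PAGE_SIZE):
    """Copy page 0 and every page changed since prev_end_lsn; skip the rest.

    Returns the list of page numbers that were physically written.
    (Hypothetical helper, not part of any MariaDB API.)
    """
    copied = []
    with open(src_path, "rb") as src, open(ibi_path, "wb") as dst:
        page_no = 0
        while True:
            page = src.read(page_size)
            if len(page) < page_size:
                break
            (lsn,) = struct.unpack_from(">Q", page, FIL_PAGE_LSN)
            # Page 0 is always copied so that the file can be identified.
            if page_no == 0 or lsn >= prev_end_lsn:
                dst.seek(page_no * page_size)
                dst.write(page)
                copied.append(page_no)
            page_no += 1
        # Extend to the full length so that trailing unchanged pages
        # are represented as a hole as well.
        dst.truncate(page_no * page_size)
    return copied
```

      Seeking past unwritten ranges instead of writing zeros is what lets the file system leave those ranges unallocated.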

      Applying incremental backup

      On startup, InnoDB would notice a special marker file (say, the empty file ib_logfile0.ibi) that signals that .ibi file recovery needs to run before any log-based recovery.

      The incremental backup will include an .ibi file for every data file that existed when the backup was made.

      Consider the following scenario (with provisional BACKUP SERVER syntax):

      CREATE TABLE a(a INT) ENGINE=InnoDB;
      CREATE TABLE b(a INT) ENGINE=InnoDB;
      CREATE TABLE c(a INT) ENGINE=InnoDB;
      BACKUP SERVER TO '/backup/full';
      DROP TABLE a;
      INSERT INTO b VALUES(1);
      INSERT INTO c VALUES(2);
      RENAME TABLE b TO a;
      BACKUP SERVER INCREMENT ... TO '/backup/incremental';
      

      When applying the incremental backup to the base (the exact steps are to be defined), we will find the files a.ibd, b.ibd and c.ibd in the base backup and the files a.ibi and c.ibi in the incremental backup.

      To correctly apply the incremental backup, we must read the first page of each .ibd and .ibi file, to construct a mapping between the 32-bit FIL_PAGE_SPACE_ID and the corresponding file names. This mapping, along with a checksum, will be durably written to the marker file ib_logfile0.ibi, which acts as a write-ahead log for the .ibi recovery, to keep things correct and efficient if the server is killed during this recovery.
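
      A minimal sketch of building that mapping, under stated assumptions: the function names, the tab-separated text layout and the use of CRC-32 are all hypothetical, since the ticket does not define the format of ib_logfile0.ibi. The offset 34 is where the page header stores the 4-byte big-endian FIL_PAGE_SPACE_ID.

```python
import struct
import zlib

FIL_PAGE_SPACE_ID = 34   # byte offset of the 4-byte big-endian tablespace ID


def read_space_id(path):
    """Read the tablespace ID from the header of the first page of a file."""
    with open(path, "rb") as f:
        header = f.read(FIL_PAGE_SPACE_ID + 4)
    if len(header) < FIL_PAGE_SPACE_ID + 4:
        raise ValueError(f"{path}: too short to contain a page header")
    return struct.unpack_from(">I", header, FIL_PAGE_SPACE_ID)[0]


def build_mapping(ibd_paths, ibi_paths):
    """Map each tablespace ID to [ibd file name or None, ibi file name or None]."""
    mapping = {}
    for path in ibd_paths:
        mapping.setdefault(read_space_id(path), [None, None])[0] = path
    for path in ibi_paths:
        mapping.setdefault(read_space_id(path), [None, None])[1] = path
    return mapping


def encode_mapping(mapping):
    """Serialize the mapping as text lines followed by a CRC-32 trailer.

    The layout is an assumption made for this sketch only.
    """
    lines = [
        f"{space_id}\t{ibd or ''}\t{ibi or ''}"
        for space_id, (ibd, ibi) in sorted(mapping.items())
    ]
    body = "\n".join(lines).encode() + b"\n"
    return body + b"crc32 %08x\n" % zlib.crc32(body)
```

      An ID that appears only in an .ibd file identifies a tablespace that was dropped after the base backup; an ID that appears in both identifies a surviving tablespace, possibly under a new name.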

      For the above example, based on our mapping, we will determine that the file a.ibd must be deleted. We will perform the deletions first in order to save space and to keep things crash-safe.

      Next, we will fill any holes in a.ibi with the contents of the matching pages of b.ibd (both files carry the same tablespace ID), and rename a.ibi to a.ibd.

      Similarly, we will fill the holes in c.ibi with the unchanged pages from c.ibd and rename c.ibi to c.ibd.

      Finally, we delete the incremental recovery log file ib_logfile0.ibi.
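
      The fill-and-rename step could look roughly like the sketch below. It makes a simplifying assumption that is not in the proposal: a page of the .ibi file that reads as all zero bytes is treated as a hole. A real implementation would more likely query the file system for holes (for example with lseek and SEEK_HOLE where available). The function names and the 16 KiB page size are likewise assumptions.

```python
import os

PAGE_SIZE = 16384   # assumed: the default innodb_page_size


def fill_holes(ibi_path, ibd_path, page_size=PAGE_SIZE):
    """Copy base pages from ibd_path into the all-zero pages of ibi_path.

    Returns the page numbers that were filled. Treating an all-zero page
    as a hole is a portable stand-in for real hole detection.
    """
    zero = bytes(page_size)
    filled = []
    with open(ibi_path, "r+b") as ibi, open(ibd_path, "rb") as ibd:
        page_no = 0
        while True:
            ibi.seek(page_no * page_size)
            page = ibi.read(page_size)
            if len(page) < page_size:
                break
            if page == zero:
                ibd.seek(page_no * page_size)
                base = ibd.read(page_size)
                # The base file may be shorter if the tablespace has grown.
                if len(base) == page_size and base != zero:
                    ibi.seek(page_no * page_size)
                    ibi.write(base)
                    filled.append(page_no)
            page_no += 1
        ibi.flush()
        os.fsync(ibi.fileno())
    return filled


def apply_increment(ibi_path, base_ibd_path, final_ibd_path):
    """Fill the holes of the .ibi file, then rename it over the final name."""
    filled = fill_holes(ibi_path, base_ibd_path)
    os.replace(ibi_path, final_ibd_path)
    return filled
```

      For the scenario above, the two calls would be apply_increment("a.ibi", "b.ibd", "a.ibd") and apply_increment("c.ibi", "c.ibd", "c.ibd").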

      What if the server is killed and restarted before the last step is completed? We would find something like the following in the ib_logfile0.ibi file:

      FIL_PAGE_SPACE_ID   ibd file name   ibi file name
      42                  a.ibd
      43                  b.ibd           a.ibi
      44                  c.ibd           c.ibi

      Recovery will attempt to read the first pages of all the mentioned files. Suppose that we find 43 in the file a.ibd. Because the ib_logfile0.ibi file records that a.ibi used to contain that tablespace ID, we can infer that the recovery for all these files had been completed. If either of the "surviving" tablespace IDs 43 and 44 is not found in any file, then we must report an inconsistency and abort the process. If c.ibd contains 44, we can safely delete the file ib_logfile0.ibi and carry on with the log-based recovery.
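
      One possible reading of this restart check is sketched below. The function name pending_steps, the row layout and the rule "an .ibi file that still exists means the step is pending" are assumptions made for illustration; the ticket leaves the exact rules open. The caller is assumed to have already read the tablespace ID from the first page of every file that exists.

```python
def pending_steps(log_rows, ids_on_disk):
    """Decide which recovery steps still need to run after a restart.

    log_rows:    (space_id, ibd_name, ibi_name or None) tuples as recorded
                 in the hypothetical ib_logfile0.ibi.
    ids_on_disk: maps each existing file name to the tablespace ID found
                 in its first page.
    """
    pending = []
    for space_id, ibd_name, ibi_name in log_rows:
        if ibi_name is None:
            # Dropped tablespace: the base file has to be deleted,
            # unless the name is already occupied by another tablespace.
            if ids_on_disk.get(ibd_name) == space_id:
                pending.append(("delete", ibd_name))
            continue
        final_name = ibi_name[: -len(".ibi")] + ".ibd"
        if ids_on_disk.get(ibi_name) == space_id:
            # The .ibi file has not been renamed yet.
            pending.append(("apply", ibi_name))
        elif ids_on_disk.get(final_name) != space_id:
            raise RuntimeError(
                f"tablespace {space_id} found neither in {ibi_name} "
                f"nor in {final_name}"
            )
    return pending
```

      An empty result would mean that ib_logfile0.ibi can be deleted and the log-based recovery can start.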

      It may turn out that we must write detailed write-ahead log records to this file, such as "apply b.ibd to a.ibi" and "move a.ibi to a.ibd". In that case, the ib_logfile0.ibi recovery would redo the last found record. To safely modify the ib_logfile0.ibi file, we might first write a file with a temporary name, say, ib_logfile101 (which is already being used by MDEV-27812), and atomically replace the ib_logfile0.ibi. In that way, we should be able to rely on the crash safety of the file system, without requiring any logic to deal with incorrectly or partially written records.
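
      The write-then-rename idea can be illustrated with a short sketch. The function name atomic_write is hypothetical, and the directory sync at the end is a POSIX-specific addition that the ticket does not mention.

```python
import os


def atomic_write(path, data, tmp_path):
    """Replace path with data by writing tmp_path first and renaming it."""
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())      # make the new contents durable first
    os.replace(tmp_path, path)    # atomic rename on POSIX file systems
    # Persist the rename itself by syncing the containing directory.
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(
            os.path.dirname(os.path.abspath(path)),
            os.O_RDONLY | os.O_DIRECTORY,
        )
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

      After a crash, a reader then sees either the complete old file or the complete new one, never a partially written record; a leftover temporary file can simply be discarded.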

      Streaming backup

      The suggested sparse .ibi files are supported in the ustar format of the tar utility, which has been suggested to be used in MDEV-38362.

      Apparently, the tar utility on Windows may have problems when extracting archives that contain sparse files: https://github.com/libarchive/libarchive/issues/2138

            People

              Unassigned Unassigned
              marko Marko Mäkelä