Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.1.28
    • Fix Version/s: 10.1.30, 10.2.12
    • Component/s: Backup
    • Environment: OS: ArchLinux
      mariabackup: Patched version with extra debug, based on 10.1 (98cd0ec536915b25a841ffc227285b15f35acef7)
      MariaDB: 10.1.28
    • Sprint: 10.1.30

    Description

      Hello

      Taking an incremental backup with `mariabackup --backup --incremental-lsn=xx` is much slower than taking a full backup, roughly 20x or more.

      I did a bit of debugging and noticed that it spends far longer in xtrabackup_copy_datafile during an incremental backup (~0.02 vs ~0.001 for a full backup).

      After a bit more trial and error, I found that it spends most of the time in a `memset` line. This line: https://github.com/MariaDB/server/blob/da05d0276a0569341c8bb41365dc7b05f9c4ddb7/extra/mariabackup/write_filt.cc#L80

      I found that by adding the following print statement to the file: https://github.com/klausenbusk/server/commit/afcdb9128927e714ee0e3bcf5c14fea2f56855e0
      Which gave me:
      inc: 0.000001
      inc: 0.000004
      inc: 0.027387
      inc: 0.027435

      Ideally the code should only write a delta file when something has changed. At the moment it writes a lot of "dummy" delta files, which contain only the header.

      Regards Kristian Klausen
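
To illustrate the cost described above, here is a minimal Python sketch (not mariabackup's C++ code) of a per-file delta writer that defers zero-filling its buffer until the first changed page is seen. `DeltaWriter`, `scan_file`, and the 1 MiB buffer size are hypothetical names and values chosen for illustration. The sketch assumes a page counts as changed when its LSN is newer than the incremental LSN. It is not a description of the fix that was eventually committed.

```python
class DeltaWriter:
    """Hypothetical per-file delta writer; not mariabackup's real API."""

    def __init__(self, incremental_lsn, buffer_size):
        self.incremental_lsn = incremental_lsn
        self.buffer_size = buffer_size
        self.buffer = None   # allocated and zero-filled lazily
        self.changed = []    # page numbers newer than incremental_lsn

    def process_page(self, page_no, page_lsn):
        # Pages not modified after the incremental LSN are skipped.
        if page_lsn <= self.incremental_lsn:
            return False
        if self.buffer is None:
            # The zero-fill happens at most once per file, and only
            # for files that actually contain a changed page.
            self.buffer = bytearray(self.buffer_size)
        self.changed.append(page_no)
        return True


def scan_file(page_lsns, incremental_lsn, buffer_size=1 << 20):
    """Feed every page LSN of one data file through a DeltaWriter."""
    writer = DeltaWriter(incremental_lsn, buffer_size)
    for page_no, lsn in enumerate(page_lsns):
        writer.process_page(page_no, lsn)
    return writer
```

With this shape, a file whose pages are all older than the incremental LSN never pays for the buffer initialisation, while a header-only delta file could still be emitted for it.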

      Attachments

        1. foo.patch
          5 kB
        2. xtrabackup.cc.patch
          0.9 kB
        3. xtrabackup.cc.patch
          0.7 kB
        4. xtrabackup.cc.patch
          0.6 kB

        Activity

          klausenbusk Kristian Klausen created issue -
          marko Marko Mäkelä made changes -
          Field Original Value New Value
          Fix Version/s 10.1 [ 16100 ]
          Fix Version/s 10.2 [ 14601 ]
          marko Marko Mäkelä made changes -
          Assignee Vladislav Vaintroub [ wlad ]
          marko Marko Mäkelä added a comment - This affects also xtrabackup.
          klausenbusk Kristian Klausen made changes -
          Attachment xtrabackup.cc.patch [ 44345 ]
          klausenbusk Kristian Klausen added a comment - edited

          I have attached a patch that skips all unchanged tables instead of creating dummy files.

          Now most of the time is spent copying *.frm files. Maybe we can improve that?
          klausenbusk Kristian Klausen made changes -
          Attachment xtrabackup.cc.patch [ 44349 ]

          klausenbusk Kristian Klausen added a comment -

          I forgot an `if` to ensure the logic only runs when doing an incremental backup. See the updated patch.
          klausenbusk Kristian Klausen made changes -
          Attachment foo.patch [ 44351 ]

          klausenbusk Kristian Klausen added a comment -

          So I did an experiment.
          The `foo.patch` adds a new flag (--incremental-foo) and a new field in xtrabackup_checkpoints (foo), which contains a Unix timestamp.
          It then skips all "non-InnoDB tables and files" that haven't changed since `incremental-foo`.

          I'm not sure it is worth the effort, and there is also the clock skew concern.
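
The timestamp-based skip described in this comment can be sketched roughly as follows. This is a hedged Python illustration, not the contents of `foo.patch`: `files_changed_since` and its `safety_margin` parameter are hypothetical, with the margin included only as one possible way to soften the clock skew concern.

```python
import os


def files_changed_since(paths, since_ts, safety_margin=0.0):
    """Return the paths whose modification time is newer than since_ts.

    since_ts is a Unix timestamp in seconds, such as the one stored by
    the previous backup. safety_margin (seconds, hypothetical) moves the
    cut-off back in time so that small clock differences are less likely
    to cause a changed file to be skipped.
    """
    cutoff = since_ts - safety_margin
    return [p for p in paths if os.stat(p).st_mtime > cutoff]
```

A larger margin copies more files than strictly necessary but errs on the side of including them, which is the safer failure mode for a backup.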
          klausenbusk Kristian Klausen made changes -
          Attachment xtrabackup.cc.patch [ 44352 ]

          klausenbusk Kristian Klausen added a comment -

          New patch again. The skipping logic should now work both with and without page tracking support (bitmap).

          klausenbusk Kristian Klausen added a comment -

          Hmm, the "dummy" delta files are needed for the prepare phase:

          > "Dummy" files are used by xtrabackup to identify that table is still there and was not removed between full and incremental backups. Tables without deltas will be removed during incremental prepare.

          https://www.percona.com/forums/questions-discussions/percona-xtrabackup/49859-incremental-backup-extremly-slow-mariadb-10-1-28-galera-cluster?p=49874#post49874

          Maybe we could write a list of tables at the end? That seems more efficient than creating a lot of small files.
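
The "list of tables" idea from this comment could look roughly like the sketch below. It is a hypothetical Python illustration only: the one-name-per-line file format and the functions `write_table_list`, `read_table_list`, and `tables_to_drop` are assumptions, not anything mariabackup or xtrabackup implements.

```python
def write_table_list(path, tables):
    """Write one table name per line (hypothetical manifest format)."""
    with open(path, "w", encoding="utf-8") as f:
        for name in sorted(tables):
            f.write(name + "\n")


def read_table_list(path):
    """Read the manifest back into a set of table names."""
    with open(path, encoding="utf-8") as f:
        return {line.rstrip("\n") for line in f if line.strip()}


def tables_to_drop(base_tables, listed_tables):
    """Tables present in the base backup but absent from the manifest.

    During an incremental prepare these would be treated as removed,
    which is the role the "dummy" delta files play today.
    """
    return sorted(set(base_tables) - set(listed_tables))
```

A single manifest would replace many header-only files with one small file, at the price of a new on-disk format that the prepare phase would have to understand.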

          wlad Vladislav Vaintroub added a comment -

          If making a full backup is not a big deal for you, why not use a binary diff tool, e.g. rdiff, instead of this incremental backup?

          klausenbusk Kristian Klausen added a comment -

          > If making a full backup is not a big deal for you, why not use a binary diff tool, e.g rdiff, instead of this incremental backup

          It adds unneeded complexity. I store the backups on object storage services (Backblaze B2 and Amazon S3), so I can't use a complex tool without copying a lot of files back and forth.

          With mariabackup I only need to push files; with rdiff I would need to pull the files from the last backup <somehow>. It would be much easier if mariabackup could just be fixed.
          serg Sergei Golubchik made changes -
          Priority Critical [ 2 ] Major [ 3 ]
          serg Sergei Golubchik made changes -
          Labels upstream
          klausenbusk Kristian Klausen added a comment - Reported upstream: https://bugs.launchpad.net/percona-xtrabackup/+bug/1731249
          serg Sergei Golubchik made changes -
          Sprint 10.1.30 [ 215 ]

          wlad Vladislav Vaintroub added a comment -

          klausenbusk, did you have a chance to look at how it performs for you in 10.1.29? It already contains a patch that I think would speed up your backup considerably.
          wlad Vladislav Vaintroub made changes -
          Labels upstream need_feedback upstream

          wlad Vladislav Vaintroub added a comment -

          Closing due to no feedback. Please feel free to comment if it did not solve it for you.
          wlad Vladislav Vaintroub made changes -
          Fix Version/s 10.1.30 [ 22637 ]
          Fix Version/s 10.2.12 [ 22810 ]
          Fix Version/s 10.2 [ 14601 ]
          Fix Version/s 10.1 [ 16100 ]
          Resolution Fixed [ 1 ]
          Status Open [ 1 ] Closed [ 6 ]

          klausenbusk Kristian Klausen added a comment -

          wlad, sorry for the late response. The fix [1] does indeed solve it: a full backup now takes about 3 min and an incremental backup about 1.5 min.

          Many thanks.
          I did stumble on a packing issue when changing my script, though: https://jira.mariadb.org/browse/MDEV-15869

          [1] https://github.com/MariaDB/server/commit/53c7aaf332d31d0441a533fd9a91b380169ab611
          julien.fritsch Julien Fritsch made changes -
          Labels need_feedback upstream upstream
          serg Sergei Golubchik made changes -
          Workflow MariaDB v3 [ 83054 ] MariaDB v4 [ 152994 ]

          People

            Assignee: wlad Vladislav Vaintroub
            Reporter: klausenbusk Kristian Klausen
            Votes: 0
            Watchers: 8
