[MDEV-14077] Incremental backup extremely slow Created: 2017-10-16  Updated: 2020-12-08  Resolved: 2017-12-19

Status: Closed
Project: MariaDB Server
Component/s: Backup
Affects Version/s: 10.1.28
Fix Version/s: 10.1.30, 10.2.12

Type: Bug
Priority: Major
Reporter: Kristian Klausen
Assignee: Vladislav Vaintroub
Resolution: Fixed
Votes: 0
Labels: upstream
Environment:

OS: ArchLinux
mariabackup: Patched version with extra debug, based on 10.1 (98cd0ec536915b25a841ffc227285b15f35acef7)
MariaDB: 10.1.28


Attachments: foo.patch, xtrabackup.cc.patch, xtrabackup.cc.patch, xtrabackup.cc.patch
Sprint: 10.1.30

 Description   

Hello

Taking an incremental backup with `mariabackup --backup --incremental-lsn=xx` is far slower than taking a full backup, more than 20x slower.

I did a bit of debugging and noticed that it spends much longer in xtrabackup_copy_datafile when doing an incremental backup (~0.02 vs ~0.001 for a full backup).

After a bit more trial and error, I found that it spends most of the time in a `memset` line. This line: https://github.com/MariaDB/server/blob/da05d0276a0569341c8bb41365dc7b05f9c4ddb7/extra/mariabackup/write_filt.cc#L80

I found that by adding the following print statement to the file: https://github.com/klausenbusk/server/commit/afcdb9128927e714ee0e3bcf5c14fea2f56855e0
Which gave me:
inc: 0.000001
inc: 0.000004
inc: 0.027387
inc: 0.027435

Ideally the code should only write a delta file when something has changed. At the moment it writes a lot of "dummy" delta files, which contain only the header.

Regards Kristian Klausen
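
The idea in the description (write a delta only when at least one page changed) can be sketched as follows. This is a simplified model and not mariabackup's code: the `Page` tuple, both function names, and the rule that a page counts as changed when its LSN is greater than the incremental LSN are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical simplified model of a page: (page_number, page_lsn).
Page = Tuple[int, int]


def changed_pages(pages: Iterable[Page], incremental_lsn: int) -> List[int]:
    """Return the numbers of pages whose LSN is newer than incremental_lsn."""
    return [page_no for page_no, lsn in pages if lsn > incremental_lsn]


def needs_delta(pages: Iterable[Page], incremental_lsn: int) -> bool:
    """True if at least one page changed, i.e. a non-empty delta is warranted."""
    return any(lsn > incremental_lsn for _, lsn in pages)
```

With this check, a tablespace whose pages are all older than the incremental LSN would produce no delta at all. A later comment in this thread points out that the prepare phase relies on those empty deltas, so the check alone is not sufficient.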



 Comments   
Comment by Marko Mäkelä [ 2017-10-16 ]

This also affects xtrabackup.

Comment by Kristian Klausen [ 2017-10-16 ]

I have attached a patch that skips all unchanged tables instead of creating dummy files.

Now most of the time is spent copying *.frm files; maybe we can improve that?

Comment by Kristian Klausen [ 2017-10-17 ]

I forgot an `if` to ensure the logic only runs when doing an incremental backup. See the updated patch.

Comment by Kristian Klausen [ 2017-10-17 ]

So I did an experiment.
The `foo.patch` adds a new flag (--incremental-foo) and a new field (foo) to xtrabackup_checkpoints, which contains a Unix timestamp.
It then skips all "non-InnoDB tables and files" which haven't changed since `incremental-foo`.

I'm not sure it is worth the effort, and there is also the clock skew concern.
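
The timestamp-based skipping described above can be sketched as follows. This is not the code from `foo.patch`: the function name `files_changed_since` and the use of file modification times are assumptions chosen to illustrate the idea.

```python
import os
from typing import List


def files_changed_since(root: str, since_ts: float) -> List[str]:
    """Return paths under root whose mtime is newer than since_ts (Unix time)."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Files modified at or before the previous backup are skipped.
            if os.path.getmtime(path) > since_ts:
                changed.append(path)
    return sorted(changed)
```

A comparison like this depends on the system clock, so a clock that jumps backwards between backups could cause a changed file to be skipped. That is the clock skew concern mentioned above.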

Comment by Kristian Klausen [ 2017-10-18 ]

New patch again. The skipping logic should now work both with and without page tracking support (bitmap).

Comment by Kristian Klausen [ 2017-10-18 ]

Hmm, the "dummy" delta files are needed for the prepare phase:

"Dummy" files are used by xtrabackup to identify that table is still there and was not removed between full and incremental backups. Tables without deltas will be removed during incremental prepare.

https://www.percona.com/forums/questions-discussions/percona-xtrabackup/49859-incremental-backup-extremly-slow-mariadb-10-1-28-galera-cluster?p=49874#post49874

Maybe we could write a list of tables at the end? That seems more efficient than creating a lot of small files.
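
A minimal sketch of that suggestion: one manifest listing the tables that still exist, which prepare could compare against the base backup instead of relying on per-table dummy files. The file layout (one table name per line) and all function names are hypothetical. Neither mariabackup nor xtrabackup is known to work this way.

```python
from typing import Iterable, Set


def write_manifest(path: str, tables: Iterable[str]) -> None:
    """Write one table name per line, sorted and de-duplicated."""
    with open(path, "w", encoding="utf-8") as f:
        for name in sorted(set(tables)):
            f.write(name + "\n")


def read_manifest(path: str) -> Set[str]:
    """Read the manifest back into a set, ignoring blank lines."""
    with open(path, encoding="utf-8") as f:
        return {line.rstrip("\n") for line in f if line.strip()}


def tables_to_remove(base_tables: Iterable[str], manifest: Set[str]) -> Set[str]:
    """Tables in the base backup that the incremental backup no longer lists."""
    return set(base_tables) - manifest
```

During prepare, `tables_to_remove` would play the role that missing deltas play today: anything in the base backup but absent from the manifest was dropped between the two backups.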

Comment by Vladislav Vaintroub [ 2017-10-25 ]

If making a full backup is not a big deal for you, why not use a binary diff tool, e.g. rdiff, instead of this incremental backup?

Comment by Kristian Klausen [ 2017-10-27 ]

> If making a full backup is not a big deal for you, why not use a binary diff tool, e.g. rdiff, instead of this incremental backup?
It adds unneeded complexity. I store the backup on object storage services (Backblaze B2 & Amazon S3), so I can't use a complex tool without copying a lot of files back and forth.

With mariabackup I only need to push files; with rdiff I would need to pull the files from the last backup <somehow>. It would be much easier if mariabackup could just be fixed.

Comment by Kristian Klausen [ 2017-11-09 ]

Reported upstream: https://bugs.launchpad.net/percona-xtrabackup/+bug/1731249

Comment by Vladislav Vaintroub [ 2017-12-12 ]

klausenbusk, did you have a chance to look at how it performs for you in 10.1.29? It already contains a patch that I think would speed up your backup considerably.

Comment by Vladislav Vaintroub [ 2017-12-19 ]

Closing due to no feedback. Please feel free to comment if it did not solve it for you.

Comment by Kristian Klausen [ 2018-04-14 ]

wlad, sorry for the late response. The fix [1] does indeed solve it: a full backup now takes ~3 min and an incremental backup ~1.5 min.

Many thanks
I did stumble on a packing issue, when changing my script though: https://jira.mariadb.org/browse/MDEV-15869

[1] https://github.com/MariaDB/server/commit/53c7aaf332d31d0441a533fd9a91b380169ab611
