[MDEV-16165] mariabackup takes ~50% more user CPU time than xtrabackup with --compress Created: 2018-05-14  Updated: 2020-08-25  Resolved: 2018-06-08

Status: Closed
Project: MariaDB Server
Component/s: mariabackup
Affects Version/s: 10.2.14
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Hartmut Holzgraefe Assignee: Vladislav Vaintroub
Resolution: Incomplete Votes: 1
Labels: backup, performance

Attachments: Text File m.txt     Text File x.txt    

 Description   

When creating a compressed backup mariabackup takes about 50% more user CPU time than xtrabackup (2.4.10). This is true when using the mariadb.com Ubuntu repository to install binaries, and also for binaries I compile myself (using the respective CMake default settings).

The bundled "quicklz" library source files are exactly the same, so the difference most likely lies in the compiler optimization settings.

The most obvious difference was that xtrabackup builds with -O3 by default, while MariaDB builds with -O2, but raising the optimization level to -O3 did not significantly reduce CPU usage either.

Whether total backup time is affected by this depends on whether the backup is CPU or IO bound, and thus on the compressibility of the data and the available IO bandwidth.

Real (wall clock) and CPU user time for different --compress and --parallel settings:

+----------+----------+---------+---------+---------+---------+-------------+
| compress | parallel | xt_real | mb_real | xt_user | mb_user | user diff % |
+----------+----------+---------+---------+---------+---------+-------------+
| ---      |        - |     813 |     805 |      77 |      81 |           5 |
| yes      |        1 |     358 |     456 |     230 |     329 |          43 |
| yes      |        2 |     188 |     233 |     217 |     317 |          46 |
| yes      |        3 |     142 |     164 |     200 |     310 |          55 |
| yes      |        4 |     146 |     165 |     200 |     306 |          53 |
+----------+----------+---------+---------+---------+---------+-------------+
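The "user diff %" column is simply the percentage increase of mariabackup user CPU time over xtrabackup user CPU time. A minimal shell check, using the numbers from the --parallel=1 row above:

```shell
# Percentage increase of mariabackup user CPU over xtrabackup user CPU,
# taking the values from the --parallel=1 row of the table.
xt_user=230   # xtrabackup user CPU seconds
mb_user=329   # mariabackup user CPU seconds
echo $(( (mb_user - xt_user) * 100 / xt_user ))   # prints 43
```

The same integer arithmetic reproduces every row of the last column (e.g. (317 - 217) * 100 / 217 = 46 for --parallel=2).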



 Comments   
Comment by Hartmut Holzgraefe [ 2018-05-14 ]

Attached files m.txt and x.txt contain actual CC command lines issued by "make" for mariadb and xtrabackup, respectively.
Options have been split onto separate lines and then sorted alphabetically to make side-by-side comparison easier.

Comment by Vladislav Vaintroub [ 2018-05-14 ]

We do not want to keep supporting the built-in compression. There are better ways to compress backups; the built-in one is only left in for compatibility reasons. This is also highlighted in our documentation.

Comment by Hartmut Holzgraefe [ 2018-05-16 ]

I'm currently running tests with different compression tools capable of multi-threaded compression.

While all of them seem to deliver better compression than the embedded quicklz,
with all of them the backup takes substantially longer to complete. So far pbzip2
and pigz seem to perform best, but even at compression level 1 ("fastest") and
four threads the backup takes about 2.5 times as long as "--compress --parallel=4",
and the total "user" CPU time consumed by mariabackup plus the compression tool
is also substantially higher than with the embedded library.

So IMHO --compress should not be deprecated unless we come up with a solution
using xbstream and an external compression tool that offers a similar
CPU time vs. compression ratio profile to the embedded quicklz.
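For reference, the two approaches being compared are of the following shape (a sketch only; the target paths, thread counts, and the choice of pigz are illustrative, not the exact command lines used in these tests):

```shell
# Built-in quicklz compression, 4 worker threads:
mariabackup --backup --compress --parallel=4 --target-dir=/backups/full

# Streaming through an external multi-threaded compressor instead
# (here pigz at its fastest level with 4 threads):
mariabackup --backup --stream=xbstream --target-dir=/backups/tmp \
  | pigz -1 -p 4 > /backups/full.xb.gz
```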

I'll re-run the tests with more realistic data, as right now I use rows that mostly
consist of a large string of "aaaaa.....aaaaa" only, to ensure that the backup
time is CPU bound rather than IO bound on my test machine at home. Maybe the
CPU usage advantage of quicklz diminishes with less compressible
data, but maybe its disadvantage in compression ratio also diminishes with it.

Once done I'll post exact numbers for all combinations I tested.

Comment by Hartmut Holzgraefe [ 2018-05-23 ]

First tests show that xbstream + qpress takes twice as much time as the internal --compress, even though it uses exactly the same compression code and produces results of equal size.

I assume that the pipe between the processes may be the bottleneck, but I need to run a few further tests with more realistic data to confirm. As this requires non-trivial data sizes, generating test data and re-running the tests will take another day.

(the data set I used so far allowed for very high compression factors, consisting mostly of long "aaaa.....aaa" varchars only, to make the compression part fully CPU bound, and not limited by IO write bandwidth)

Comment by Hartmut Holzgraefe [ 2018-05-28 ]

Similar results with a more realistic data set that can be compressed by a factor of 3 to 8, depending on which tool and compression level is used on it.

When looking only at the total time it takes to create a backup, --compress is the only variant that actually saves both time and space.
External qpress comes out fastest of the external tools. "pigz" comes second, at about twice the time and only slightly better compression than qpress. "pbzip2" and "lbzip2" come out as a good compromise when using the fastest compression level (-1): about 3x the backup time of external qpress, but only about half the backup size.

So for users who do not care that much about size reduction, or who just want to make the backup process faster in wall-clock time (taking the compression merely as a 'collateral' benefit), --compress still looks like a viable option. And that is just mariabackup --compress; this is not even taking the extra speed of xtrabackup --compress into account.

So IMHO --compress should stay, but should either support multiple compression libraries or, second best, be switched from using qpress to libbzip2.

That way the overhead of the xbstream pipe in extra system calls and context switches (especially as the full uncompressed backup data needs to be passed through the pipe) could be avoided.

I also ran some simple "cat < bigfile > tmpfile" vs. "cat < bigfile | cat > tmpfile" tests. System and wall-clock time for the left-hand cat increase by about 20% with the extra pipe (system time accounts for the majority of the processing time here). At the same time the user CPU time increased by a factor of ~4 (negligible here, as the vast majority of time was spent in system mode, but still).
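The pipe comparison described above can be reproduced with something like the following (file names and the 64 MB size are arbitrary choices for illustration):

```shell
# Generate a test file; size just needs to be large enough to measure.
dd if=/dev/zero of=bigfile bs=1M count=64 2>/dev/null

# Direct copy: one process, no pipe.
time cat < bigfile > tmpfile

# Same copy with an extra pipe and an extra process in the middle;
# every byte now crosses a pipe, adding system calls and context switches.
time cat < bigfile | cat > tmpfile

# Both variants produce identical output.
cmp bigfile tmpfile && echo "copies identical"
```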

Comment by Hartmut Holzgraefe [ 2018-06-08 ]

And now for the embarrassing part (at least for me, and for the 2nd time this week already):

The "user" CPU numbers in the very first line are apparently wrong.

When not using compression, the consumed CPU time is in fact not almost the same, contrary to what that row suggests.

I have no idea what went wrong there; I can only assume that I actually measured xtrabackup twice.

The actual numbers are quite different between the two, and even more different when running without --compress than when running with.

So it now looks as if it is actually not the compression part that's behaving differently performance wise, it is the backup process itself. The relative difference is still visible when using --compress, but it is even larger when running without it.

I'll re-run the test series once more, and if I can reproduce yesterday's results I'll create a new bug report from that, with numbers and a "how to reproduce" setup description.

This report can probably be closed after all ...

Comment by Vladislav Vaintroub [ 2018-06-08 ]

Closed as requested by hholzgra. When filing next time, it would be very helpful, if possible, to include profiler output (or even gdb stack traces) of mariabackup compared to xtrabackup.

Generated at Thu Feb 08 08:26:52 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.