MariaDB Server
MDEV-38362

Develop an efficient alternative to mbstream


Details

    • Q1/2026 Server Maintenance

    Description

  Yesterday, I became aware that in some performance tests conducted by rahulraj, the bulk of the time of restoring a backup was spent in mbstream -x, extracting files from a 3,473,676,138,931-byte (3.16 TiB) archive to a file system on a different device.

      time mbstream -x -p 1 …

      real    51m22.395s
      user    1m39.040s
      sys     41m31.011s
      

  This is not only spending a lot of real time, but also a huge amount of CPU time in both user and system address space. Side note: mariadb-backup --prepare with the default 96 MiB buffer pool consumed 31.6 seconds to apply about 200 MiB of log. That could have been maybe twice as fast (15 seconds) if more memory had been configured. (This is nothing compared to the mbstream execution time.)

  The following will not be an apples-to-apples comparison, because we will be timing the copying of the single xbstream file. With dd (invoking read(2) and write(2) system calls, 1 MiB at a time), we were much faster and used a tiny fraction of the CPU time:

      time dd bs=1M …

3312755+1 records in
3312755+1 records out
3473420582912 bytes (3.5 TB, 3.2 TiB) copied, 2505 s, 1.4 GB/s
      real    41m45.226s
      user    0m0.052s
      sys     0m0.179s
      
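What dd bs=1M does above can be sketched as a plain read(2)/write(2) loop: every byte is staged in a 1 MiB userspace buffer and so crosses the user/kernel boundary twice. A minimal illustration (the helper name copy_fd is ours, not from dd):

```c
#include <stdlib.h>
#include <unistd.h>

#define BUF_SIZE (1024 * 1024)  /* 1 MiB, matching dd bs=1M */

/* Copy everything from in_fd to out_fd through a userspace buffer.
   Returns the number of bytes copied, or -1 on error. */
static long long copy_fd(int in_fd, int out_fd)
{
    char *buf = malloc(BUF_SIZE);
    long long total = 0;
    ssize_t n;
    if (!buf)
        return -1;
    while ((n = read(in_fd, buf, BUF_SIZE)) > 0) {
        char *p = buf;
        ssize_t left = n;
        while (left > 0) {           /* write(2) may be partial */
            ssize_t w = write(out_fd, p, (size_t)left);
            if (w < 0) { free(buf); return -1; }
            p += w;
            left -= w;
        }
        total += n;
    }
    free(buf);
    return n < 0 ? -1 : total;
}
```

The data path explains the dd numbers above: almost no user CPU (the buffer is only passed around), but every byte still costs two copies inside the kernel.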

  We also tested https://github.com/opencoff/fastdd, which implements one of the Linux kernel interfaces for optimizing copying between streams. Unlike copy_file_range(2), which can copy up to 2 GiB per system call, splice(2) and sendfile(2) will by default copy at most 64 KiB per call. That probably explains the increased CPU usage below. Still, some real time is saved:

      time fastdd bs=1M …

      3.17 TB (3473676138931 bytes) copied in 2163.016232 secs (1605.94 MB/s)
      real    36m3.030s
      user    0m1.098s
      sys     0m5.616s
      
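The copy_file_range(2) interface mentioned above can be wrapped in a short loop. This is a hedged sketch under our own naming (kernel_copy is invented here, and this is not fastdd's code); it requires Linux >= 4.5 and glibc >= 2.27:

```c
#define _GNU_SOURCE
#include <unistd.h>

/* Copy len bytes from in_fd to out_fd entirely inside the kernel with
   copy_file_range(2), which accepts up to about 2 GiB per call.
   Returns the number of bytes copied, or -1 on error. */
static long long kernel_copy(int in_fd, int out_fd, long long len)
{
    long long done = 0;
    while (done < len) {
        ssize_t n = copy_file_range(in_fd, NULL, out_fd, NULL,
                                    (size_t)(len - done), 0);
        if (n == 0)      /* EOF reached before len bytes */
            break;
        if (n < 0)       /* e.g. EXDEV when the file systems differ */
            return -1;
        done += n;
    }
    return done;
}
```

Because no data is staged in userspace and each call can move gigabytes, both the per-byte copying cost and the system-call overhead are far lower than in the read/write loop.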

      Here is a summary of the results:

      program    throughput    real         user+sys
      mbstream   1.05 GiB/s    3082.395 s   2590.051 s
      dd         1.29 GiB/s    2505.226 s      0.231 s
      fastdd     1.50 GiB/s    2163.030 s      6.714 s

      So, fastdd yields 42.5% better throughput while using 0.259% of the CPU time consumed by mbstream.

  I was surprised to find no archive program that would make use of any Linux system calls for optimizing copying between file descriptors. Only cp uses copy_file_range; that’s it. I only found https://github.com/teknoraver/car, whose main trick is to support block cloning. So, I wrote a crude prototype czar.c that demonstrates what could be done with these system calls on Linux. A performance test with this program is in progress; that would be a more valid comparison to mbstream, writing exactly the same file sizes.
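For the stream-extraction case that mbstream -x handles, the relevant call is splice(2), which can move data from a pipe (such as stdin) into a file without copying it through userspace. A minimal sketch of that idea, with an invented helper name (this is not czar.c itself):

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* Drain a pipe into a file descriptor with splice(2), so the data never
   enters userspace.  Each call is limited by the pipe buffer capacity,
   64 KiB by default, regardless of the length we request.
   Returns the number of bytes moved, or -1 on error. */
static long long splice_pipe_to_file(int pipe_fd, int out_fd)
{
    long long total = 0;
    for (;;) {
        ssize_t n = splice(pipe_fd, NULL, out_fd, NULL,
                           1 << 20, SPLICE_F_MOVE);
        if (n == 0)          /* write end of the pipe was closed */
            return total;
        if (n < 0)
            return -1;
        total += n;
    }
}
```

The 64 KiB-per-call pipe limit is what the description above suggests as the cause of fastdd's higher CPU usage relative to copy_file_range(2); the limit can be raised with fcntl(F_SETPIPE_SZ).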

      People

        Reporter: Marko Mäkelä
        Assignee: Marko Mäkelä
