MariaDB Server
MDEV-38362

Develop an efficient alternative to mbstream


Details

    • Q1/2026 Server Maintenance

    Description

  Yesterday, I became aware that in some performance tests conducted by rahulraj, the bulk of the time of restoring a backup was spent in mbstream -x, extracting files from a 3,473,676,138,931-byte (3.16 TiB) archive to a file system on a different device.

      time mbstream -x -p 1 …

      real    51m22.395s
      user    1m39.040s
      sys     41m31.011s
      

  This is not only spending a lot of real time, but also a huge amount of CPU time in both user and system address space. Side note: mariadb-backup --prepare with the default 96 MiB buffer pool consumed 31.6 seconds to apply about 200 MiB of log. That could have been maybe twice as fast (15 seconds) if more memory had been configured. (This is nothing compared to the mbstream execution time.)

  The following will not be an apples-to-apples comparison, because we will be timing the copying of the single xbstream file. With dd (invoking read(2) and write(2) system calls, 1 MiB at a time), we were much faster and used a tiny fraction of the CPU time:

      time dd bs=1M …

3312755+1 records in
3312755+1 records out
3473420582912 bytes (3.5 TB, 3.2 TiB) copied, 2505 s, 1.4 GB/s
      real    41m45.226s
      user    0m0.052s
      sys     0m0.179s
      
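What dd bs=1M does above can be sketched as a plain read(2)/write(2) loop: every byte is staged in a 1 MiB userspace buffer and so crosses the user/kernel boundary twice. A minimal illustration (the helper name copy_fd is ours, not from dd):

```c
#include <stdlib.h>
#include <unistd.h>

#define BUF_SIZE (1024 * 1024)  /* 1 MiB, matching dd bs=1M */

/* Copy everything from in_fd to out_fd through a userspace buffer.
   Returns the number of bytes copied, or -1 on error. */
static long long copy_fd(int in_fd, int out_fd)
{
    char *buf = malloc(BUF_SIZE);
    long long total = 0;
    ssize_t n;
    if (!buf)
        return -1;
    while ((n = read(in_fd, buf, BUF_SIZE)) > 0) {
        char *p = buf;
        ssize_t left = n;
        while (left > 0) {           /* write(2) may be partial */
            ssize_t w = write(out_fd, p, (size_t)left);
            if (w < 0) { free(buf); return -1; }
            p += w;
            left -= w;
        }
        total += n;
    }
    free(buf);
    return n < 0 ? -1 : total;
}
```

The data path explains the dd numbers above: almost no user CPU (the buffer is only passed around), but every byte still costs two copies inside the kernel.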

  We also tested https://github.com/opencoff/fastdd, which implements one of the Linux kernel interfaces for optimizing copying between streams. Unlike copy_file_range(2), which can copy up to 2 GiB per system call, splice(2) and sendfile(2) will by default copy at most 64 KiB per call. That probably explains the increased CPU usage below. Still, some real time is saved:

      time fastdd bs=1M …

      3.17 TB (3473676138931 bytes) copied in 2163.016232 secs (1605.94 MB/s)
      real    36m3.030s
      user    0m1.098s
      sys     0m5.616s
      
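The copy_file_range(2) interface mentioned above can be wrapped in a short loop. This is a hedged sketch under our own naming (kernel_copy is invented here, and this is not fastdd's code); it requires Linux >= 4.5 and glibc >= 2.27:

```c
#define _GNU_SOURCE
#include <unistd.h>

/* Copy len bytes from in_fd to out_fd entirely inside the kernel with
   copy_file_range(2), which accepts up to about 2 GiB per call.
   Returns the number of bytes copied, or -1 on error. */
static long long kernel_copy(int in_fd, int out_fd, long long len)
{
    long long done = 0;
    while (done < len) {
        ssize_t n = copy_file_range(in_fd, NULL, out_fd, NULL,
                                    (size_t)(len - done), 0);
        if (n == 0)      /* EOF reached before len bytes */
            break;
        if (n < 0)       /* e.g. EXDEV when the file systems differ */
            return -1;
        done += n;
    }
    return done;
}
```

Because no data is staged in userspace and each call can move gigabytes, both the per-byte copying cost and the system-call overhead are far lower than in the read/write loop.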

      Here is a summary of the results:

      program    throughput    real         user+sys
      mbstream   1.05 GiB/s    3082.395 s   2590.051 s
      dd         1.29 GiB/s    2505.226 s      0.231 s
      fastdd     1.50 GiB/s    2163.030 s      6.714 s

      So, fastdd yields 42.5% better throughput while using 0.259% of the CPU time consumed by mbstream.

  I was surprised to find no archive program that would make use of any Linux system calls for optimizing copying between file descriptors. Only cp uses copy_file_range; that’s it. I only found https://github.com/teknoraver/car, whose main trick is to support block cloning. So, I wrote a crude prototype czar.c that demonstrates what could be done with these system calls on Linux. A performance test with this program is in progress; that would be a more valid comparison to mbstream, writing exactly the same file sizes.
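For the stream-extraction case that mbstream -x handles, the relevant call is splice(2), which can move data from a pipe (such as stdin) into a file without copying it through userspace. A minimal sketch of that idea, with an invented helper name (this is not czar.c itself):

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* Drain a pipe into a file descriptor with splice(2), so the data never
   enters userspace.  Each call is limited by the pipe buffer capacity,
   64 KiB by default, regardless of the length we request.
   Returns the number of bytes moved, or -1 on error. */
static long long splice_pipe_to_file(int pipe_fd, int out_fd)
{
    long long total = 0;
    for (;;) {
        ssize_t n = splice(pipe_fd, NULL, out_fd, NULL,
                           1 << 20, SPLICE_F_MOVE);
        if (n == 0)          /* write end of the pipe was closed */
            return total;
        if (n < 0)
            return -1;
        total += n;
    }
}
```

The 64 KiB-per-call pipe limit is what the description above suggests as the cause of fastdd's higher CPU usage relative to copy_file_range(2); the limit can be raised with fcntl(F_SETPIPE_SZ).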

      People

        Reporter: Marko Mäkelä
        Assignee: Marko Mäkelä
