[MDEV-16457] mariabackup 10.2+ should default to innodb_checksum_algorithm=crc32 Created: 2018-06-11  Updated: 2020-08-25  Resolved: 2018-06-14

Status: Closed
Project: MariaDB Server
Component/s: Backup
Affects Version/s: 10.2.15, 10.3.7
Fix Version/s: 10.2.16, 10.3.8

Type: Bug Priority: Critical
Reporter: Hartmut Holzgraefe Assignee: Marko Mäkelä
Resolution: Fixed Votes: 3
Labels: None
Environment:

Ubuntu 18.04 "Bionic" 64bit, using packages from mariadb.com apt repository


Attachments: File mariabackup.gprof     File xtrabackup.gprof    

 Description   

Backups taken with mariabackup use significantly more user CPU time than backups taken with XtraBackup on the same data set. The extra time needed is roughly proportional to the amount of data to back up.

All tests were done on an otherwise idle server, an AMD machine with 8 cores and 48GB RAM.

Different backup targets (a plain backup directory, or a stream in xbstream format) were tried, along with different settings for --parallel and --compress.

Backup size in the first test series was ~50GB (all times are user CPU seconds):

+--------+----------+----------+------------+-------------+-------+----------+
| target | parallel | compress | xtrabackup | mariabackup | diff  | % slower |
+--------+----------+----------+------------+-------------+-------+----------+
| dir    |        1 |        0 |       9.19 |       82.37 | 73.18 |      796 |
| dir    |        2 |        0 |       9.09 |       82.38 | 73.29 |      806 |
| dir    |        4 |        0 |       9.06 |        84.1 | 75.04 |      828 |
| dir    |        1 |        1 |     151.86 |      224.64 | 72.78 |       48 |
| dir    |        2 |        1 |     155.67 |      225.55 | 69.88 |       45 |
| dir    |        4 |        1 |     156.05 |      224.93 | 68.88 |       44 |
| stream |        1 |        0 |      42.78 |      114.75 | 71.97 |      168 |
| stream |        2 |        0 |      42.98 |      116.28 | 73.30 |      171 |
| stream |        4 |        0 |      42.19 |      114.57 | 72.38 |      172 |
| stream |        1 |        1 |     165.32 |      240.84 | 75.52 |       46 |
| stream |        2 |        1 |     170.08 |      239.74 | 69.66 |       41 |
| stream |        4 |        1 |     171.62 |       239.9 | 68.28 |       40 |
+--------+----------+----------+------------+-------------+-------+----------+

In the 2nd series the backup size was close to 90GB:

+--------+----------+----------+------------+-------------+--------+----------+
| target | parallel | compress | xtrabackup | mariabackup | diff   | % slower |
+--------+----------+----------+------------+-------------+--------+----------+
| dir    |        1 |        0 |      15.23 |      144.99 | 129.76 |      852 |
| dir    |        2 |        0 |      15.25 |       143.9 | 128.65 |      844 |
| dir    |        4 |        0 |      15.31 |      146.64 | 131.33 |      858 |
| dir    |        1 |        1 |      271.1 |      402.57 | 131.47 |       48 |
| dir    |        2 |        1 |     279.34 |      401.67 | 122.33 |       44 |
| dir    |        4 |        1 |     281.94 |      399.81 | 117.87 |       42 |
| stream |        1 |        0 |      73.08 |       200.5 | 127.42 |      174 |
| stream |        2 |        0 |      74.39 |      201.75 | 127.36 |      171 |
| stream |        4 |        0 |      73.28 |      200.19 | 126.91 |      173 |
| stream |        1 |        1 |     299.54 |      428.48 | 128.94 |       43 |
| stream |        2 |        1 |     306.58 |      425.01 | 118.43 |       39 |
| stream |        4 |        1 |     308.71 |      430.44 | 121.73 |       39 |
+--------+----------+----------+------------+-------------+--------+----------+

So regardless of the backup options used, mariabackup always needed roughly 70 seconds more user CPU time for the 50GB test (68 to 76 seconds), and roughly 120 to 130 seconds more for the 90GB test.
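
To check the "roughly proportional" claim, the figures from the two tables above can be recomputed in a few lines. This is only a sketch: the backup sizes are assumed to be exactly 50 and 90 GB (the report gives them only approximately), and the helper names are made up for illustration.

```python
# (xtrabackup, mariabackup) user CPU seconds, copied row by row from the tables.
SERIES_50GB = [
    (9.19, 82.37), (9.09, 82.38), (9.06, 84.1),
    (151.86, 224.64), (155.67, 225.55), (156.05, 224.93),
    (42.78, 114.75), (42.98, 116.28), (42.19, 114.57),
    (165.32, 240.84), (170.08, 239.74), (171.62, 239.9),
]
SERIES_90GB = [
    (15.23, 144.99), (15.25, 143.9), (15.31, 146.64),
    (271.1, 402.57), (279.34, 401.67), (281.94, 399.81),
    (73.08, 200.5), (74.39, 201.75), (73.28, 200.19),
    (299.54, 428.48), (306.58, 425.01), (308.71, 430.44),
]


def mean_extra_seconds(rows):
    """Average of (mariabackup - xtrabackup) over all option combinations."""
    return sum(m - x for x, m in rows) / len(rows)


def percent_slower(xtrabackup, mariabackup):
    """The '% slower' column: extra time relative to the xtrabackup time."""
    return 100.0 * (mariabackup - xtrabackup) / xtrabackup


# Assumption: sizes taken as exactly 50 and 90 GB; the report says "~50GB"
# and "close to 90GB".
for label, size_gb, rows in (("50GB", 50, SERIES_50GB), ("90GB", 90, SERIES_90GB)):
    extra = mean_extra_seconds(rows)
    print(f"{label}: mean extra {extra:.1f} s, {extra / size_gb:.2f} s per GB")
```

Under these assumptions both series come out at about 1.4 extra seconds of user CPU time per GB, independent of the --parallel, --compress and target settings.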

The database was idle while taking the backup, and for the stream output tests the output stream was directed to /dev/null to cut out write costs as much as possible.

System CPU time was similar across all tests as it mostly just measured the kernel time needed to read and write the backup data.



 Comments   
Comment by Hartmut Holzgraefe [ 2018-06-11 ]

Difference seems to be related to checksum calculations?

$ head *.gprof
==> mariabackup.gprof <==
Flat profile:
 
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 90.83     74.99    74.99  1805449     0.00     0.00  buf_calc_page_new_checksum(unsigned char const*)
  7.64     81.30     6.31  3579672     0.00     0.00  ut_crc32_hw(unsigned char const*, unsigned long)
  0.93     82.07     0.77  1843329     0.00     0.00  buf_page_is_corrupted(bool, unsigned char const*, page_size_t const&, fil_space_t const*)
  0.10     82.15     0.08     2960     0.00     0.03  xb_fil_cur_read(xb_fil_cur_t*)
  0.09     82.22     0.08  1805448     0.00     0.00  buf_page_is_checksum_valid_innodb(unsigned char const*, unsigned long, unsigned long)
 
==> xtrabackup.gprof <==
Flat profile:
 
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 82.68      7.35     7.35  3611495     0.00     0.00  ut_crc32_hw(unsigned char const*, unsigned long)
  8.89      8.14     0.79  1843451     0.00     0.00  buf_page_is_corrupted(bool, unsigned char const*, page_size_t const&, bool)
  7.09      8.77     0.63    15625     0.04     0.04  buf_calc_page_new_checksum(unsigned char const*)
  1.01      8.86     0.09  3686813     0.00     0.00  Encryption::is_encrypted_page(unsigned char const*)
  0.22      8.88     0.02     3358     0.01     0.01  os_file_io(IORequest const&, int, void*, unsigned long, unsigned long, dberr_t*)
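
The per-call columns in the profiles above are rounded to 0.00, so the cost per call has to be derived from the "self seconds" and "calls" columns. A small sketch of that arithmetic follows; gprof figures are sample-based, so the results are approximate, and the dictionary and function names are illustrative only.

```python
# (self seconds, calls) copied from the two flat profiles.
MARIABACKUP = {
    "buf_calc_page_new_checksum": (74.99, 1805449),
    "ut_crc32_hw": (6.31, 3579672),
}
XTRABACKUP = {
    "ut_crc32_hw": (7.35, 3611495),
    "buf_calc_page_new_checksum": (0.63, 15625),
}


def microseconds_per_call(self_seconds, calls):
    """Average self time of one call, in microseconds."""
    return self_seconds / calls * 1e6


for tool, profile in (("mariabackup", MARIABACKUP), ("xtrabackup", XTRABACKUP)):
    for name, (seconds, calls) in profile.items():
        cost = microseconds_per_call(seconds, calls)
        print(f"{tool:12s} {name:28s} {cost:6.2f} us/call")
```

In both profiles buf_calc_page_new_checksum() costs on the order of 40 microseconds per call against roughly 2 microseconds for ut_crc32_hw(). The difference between the tools is how often the expensive routine runs: 1,805,449 calls in mariabackup versus 15,625 in xtrabackup.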

Comment by Marko Mäkelä [ 2018-06-14 ]

There are multiple algorithms for the InnoDB page checksum. CRC-32C (innodb_checksum_algorithm=crc32) should be the default since 10.2. The routine buf_calc_page_new_checksum() should be used for innodb_checksum_algorithm=innodb only.

The problem appears to be that in Mariabackup 10.2, the default is innodb_checksum_algorithm=innodb instead of crc32:

diff --git a/extra/mariabackup/xtrabackup.cc b/extra/mariabackup/xtrabackup.cc
index 4d88778f020..bd5e28a4f5f 100644
--- a/extra/mariabackup/xtrabackup.cc
+++ b/extra/mariabackup/xtrabackup.cc
@@ -1193,7 +1193,7 @@ struct my_option xb_server_options[] =
   "The algorithm InnoDB uses for page checksumming. [CRC32, STRICT_CRC32, "
    "INNODB, STRICT_INNODB, NONE, STRICT_NONE]", &srv_checksum_algorithm,
    &srv_checksum_algorithm, &innodb_checksum_algorithm_typelib, GET_ENUM,
-   REQUIRED_ARG, SRV_CHECKSUM_ALGORITHM_INNODB, 0, 0, 0, 0, 0},
+   REQUIRED_ARG, SRV_CHECKSUM_ALGORITHM_CRC32, 0, 0, 0, 0, 0},
 
   {"innodb_undo_directory", OPT_INNODB_UNDO_DIRECTORY,
    "Directory where undo tablespace files live, this path can be absolute.",

Workaround: Invoke mariabackup --innodb-checksum-algorithm=crc32.
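
For backup scripts that cannot be upgraded to a fixed version right away, the workaround can be baked into the command line. The following is a minimal sketch, assuming a Python wrapper around mariabackup; build_backup_command is a hypothetical helper, the target directory is a placeholder, and connection options (user, password, and so on) are left out.

```python
import shlex


def build_backup_command(target_dir, extra_args=()):
    """Return a mariabackup argv list that forces CRC-32C page checksums."""
    return [
        "mariabackup",
        "--backup",
        f"--target-dir={target_dir}",
        "--innodb-checksum-algorithm=crc32",  # workaround from this issue
        *extra_args,
    ]


# Placeholder path and options; adjust to the local setup.
cmd = build_backup_command("/path/to/backup", ["--parallel=4"])
print(shlex.join(cmd))
# To actually run the backup: subprocess.run(cmd, check=True)
```

Building the argument list separately from running it keeps the option in one place and makes it easy to drop once the fixed versions (10.2.16, 10.3.8) are deployed.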
