[MDEV-11068] Review which innodb_compression_algorithm to support in binary packages Created: 2016-10-17  Updated: 2023-12-12  Resolved: 2023-12-12

Status: Closed
Project: MariaDB Server
Component/s: Packaging, Storage Engine - InnoDB
Fix Version/s: N/A

Type: New Feature Priority: Major
Reporter: Valerii Kravchuk Assignee: Sergei Golubchik
Resolution: Fixed Votes: 4
Labels: Compatibility, compression, packaging, performance

Attachments: PNG File image-2020-05-14-18-12-01-528.png    
Issue Links:
Blocks
blocks MDEV-20255 InnoDB LZ4 and LZMA compression algor... Closed
blocks MDEV-21877 Enable snappy compression by default ... Open
blocks MDEV-22895 Implement server support for making c... Closed
is blocked by MDEV-22310 Support zstd Compression algorithm fo... Open
is blocked by MDEV-26029 Sparse files are inefficient on thinl... Closed
Relates
relates to MDEV-8139 Fix scrubbing Closed
relates to MDEV-11916 Page compression - use smaller writes... Open
relates to MDEV-12933 sort out the compression library chaos Closed
relates to MDEV-15528 Avoid writing freed InnoDB pages Closed
relates to MDEV-22839 ROW_FORMAT=COMPRESSED vs PAGE_COMPRES... Open

 Description   

It seems that the RPMs for 10.1.x come with only LZMA out of all the possible compression algorithms. It is suggested to build from source when other algorithms are needed (see https://mariadb.com/kb/en/mariadb/compression/).

On some systems that we provide packages for (like RHEL), some of these algorithms (like LZ4) are available as distribution packages. I think it makes sense (maybe only for Enterprise binaries?) to build with them enabled and to add dependencies on the related packages.

But we cannot add new dependencies to rpm packages after GA, and we should not introduce new file formats lightly. If we add a compression library to our distributed packages, there will be a significant additional cost for removing the code later. Users who enabled an algorithm would have to execute additional steps on an upgrade to a later version where we might want to remove that form of compression, and we would have to provide an upgrade tool for converting affected files. To save us from such trouble, we should run some benchmarks beforehand and determine which library provides the best ratio between CPU usage and compression savings.



 Comments   
Comment by Arnaud Adant [ 2016-10-21 ]

Please note that zlib is also available by default, but it would be nice to have the other algorithms for comparison.

Comment by Sergei Golubchik [ 2017-05-27 ]

We cannot add new dependencies to rpm packages after GA.

Comment by Marko Mäkelä [ 2019-03-22 ]

I do not think that it makes sense to enable bzip2 at all. It has a very large memory footprint, and it is designed for compressing much larger input than the innodb_page_size blocks (default 16384 bytes).

Comment by Marko Mäkelä [ 2020-04-21 ]

I think that we will need some benchmarks to check not only which file systems work well with page_compressed tables, but also the efficiency of different compression algorithms (in terms of CPU usage and saved storage space).

The benchmark effort may have to wait for MDEV-11916 and MDEV-8139 to be fixed. In MDEV-15528 we are already enabling additional savings when entire pages are being freed.

Comment by Marko Mäkelä [ 2020-04-23 ]

I think that of all the innodb_compression_algorithm choices that are implemented in the source code, bzip2 is the worst match for InnoDB. The reason is that the ‘input files’ are individual InnoDB pages, at most innodb_page_size bytes of payload to compress. That is, 4KiB, 8KiB, 16KiB, 32KiB, or 64KiB. But the bzip2 input block size is up to 900KB. The encoder of the compressed data stream could waste some code space on representing longer match lengths that would never occur in our use case. Also, the memory usage of bzip2 can be huge: even the bzip2 --small option consumes up to 2.5 bytes per input byte.
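As a rough illustration only (synthetic data, not InnoDB code), the Python standard library can compress a single 16 KiB ‘page’ with zlib, lzma, and bz2. Note that a one-block ratio comparison like this says little by itself; the argument above against bzip2 is mainly about its block size and memory footprint, which this sketch does not measure:

```python
import bz2
import lzma
import zlib

# A 16 KiB "page" of moderately compressible data, mimicking the default
# innodb_page_size payload. This is NOT real InnoDB page content.
page = (b"user_id,order_id,status=SHIPPED;" * 512)[:16384]

for name, compress in [("zlib", zlib.compress),
                       ("lzma", lzma.compress),
                       ("bzip2", bz2.compress)]:
    out = compress(page)
    # Print the compressed size of the single 16 KiB block.
    print(f"{name:6s} {len(out):6d} bytes")
```

All three easily shrink such a repetitive block; the interesting differences for InnoDB are in CPU and memory cost, which is why the thread keeps asking for proper benchmarks.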

Maybe we could at least agree to remove bzip2 support in some version? 10.6?

If zlib is soon to support SIMD-based optimizations for modern CPUs, maybe the practical advantage of other implementations of the Lempel-Ziv 1977 and 1978 algorithms will be reduced?

Note: If we ever enable some compression algorithm in our distributed binary packages, then I am afraid that it will be very hard to remove those algorithms later, because users would complain that their data is inaccessible after an upgrade. This could be addressed by creating an external tool that would convert data files, but then that tool would have to depend on all those compression libraries ‘forever’.

Comment by Otto Kekäläinen [ 2020-04-26 ]

I did not spot any compression changes yet in recent 10.5 commits, but the autopkgtest at https://salsa.debian.org/mariadb-team/mariadb-server/-/jobs/692724 started failing with `ERROR 1231 (42000) at line 2: Variable 'innodb_compression_algorithm' can't be set to the value of 'snappy'`. Did you decide to remove snappy already or shall I investigate this as a regression?

Comment by Otto Kekäläinen [ 2020-05-14 ]

Just a reminder that the regression on the 10.5 branch still exists ^. The value 'snappy' for the variable `innodb_compression_algorithm` is no longer recognized.

It is rather annoying in my otherwise pretty CI pipeline at https://salsa.debian.org/mariadb-team/mariadb-server/pipelines/136485

Comment by Otto Kekäläinen [ 2020-05-19 ]

Regarding the two comments above, I figured out the snappy and RocksDB failures in debian/test/smoke and will soon submit a PR about them.

Note that in addition to the options listed at https://mariadb.com/kb/en/innodb-page-compression/#configuring-the-innodb-page-compression-algorithm there is also zstd as it is used by RocksDB.

For binaries published officially in Debian/Ubuntu, we have only zlib in Ubuntu Bionic (MariaDB 10.1) and zlib, lz4, and snappy in Ubuntu Focal (MariaDB 10.3).

Snappy was enabled in https://salsa.debian.org/mariadb-team/mariadb-10.3/-/commit/278531a7dfa7d60a60b067d089860c92a4e1221b - was this an OK decision?

If somebody wants to test this, this can be quickly copy-pasted:

mariadb --version
mariadb -e 'SET GLOBAL innodb_compression_algorithm=none;'
mariadb -e 'SET GLOBAL innodb_compression_algorithm=zlib;'
mariadb -e 'SET GLOBAL innodb_compression_algorithm=lz4;'
mariadb -e 'SET GLOBAL innodb_compression_algorithm=lzo;'
mariadb -e 'SET GLOBAL innodb_compression_algorithm=lzma;'
mariadb -e 'SET GLOBAL innodb_compression_algorithm=bzip2;'
mariadb -e 'SET GLOBAL innodb_compression_algorithm=snappy;'

Comment by Otto Kekäläinen [ 2020-05-19 ]

For the record, RocksDB compression status on current 10.5 branch:

# grep -E "(Compression)? supported:" /var/lib/mysql/#rocksdb/LOG
2020/05/19-13:15:14.987332 7fa080c1d800 Compression algorithms supported:
2020/05/19-13:15:14.987333 7fa080c1d800     kZSTDNotFinalCompression supported: 0
2020/05/19-13:15:14.987334 7fa080c1d800     kZSTD supported: 0
2020/05/19-13:15:14.987335 7fa080c1d800     kXpressCompression supported: 0
2020/05/19-13:15:14.987336 7fa080c1d800     kLZ4HCCompression supported: 1
2020/05/19-13:15:14.987337 7fa080c1d800     kLZ4Compression supported: 1
2020/05/19-13:15:14.987338 7fa080c1d800     kBZip2Compression supported: 0
2020/05/19-13:15:14.987338 7fa080c1d800     kZlibCompression supported: 1
2020/05/19-13:15:14.987339 7fa080c1d800     kSnappyCompression supported: 1

Comment by Marko Mäkelä [ 2020-05-19 ]

otto, thank you for investigating this.

I think that we should try to remove the ‘useless’ algorithms (as determined by benchmarking) and extend innochecksum (or create a separate tool) to allow re-encoding data files from any previously supported page_compressed algorithm into the supported ones.

Comment by Otto Kekäläinen [ 2020-06-06 ]

Related commit https://github.com/MariaDB/server/commit/6af37ba881fee7e6f651d5e0730c9374337ad1b4 by serg

It did not seem to have an effect on this. Output from the latest 10.5 at the time of writing:

root@e1cbb08df912:/etc/mysql# mariadb --version
mariadb  Ver 15.1 Distrib 10.5.4-MariaDB, for debian-linux-gnu (x86_64) using  EditLine wrapper
root@e1cbb08df912:/etc/mysql# mariadb -e 'SET GLOBAL innodb_compression_algorithm=none;'
root@e1cbb08df912:/etc/mysql# mariadb -e 'SET GLOBAL innodb_compression_algorithm=zlib;'
root@e1cbb08df912:/etc/mysql# mariadb -e 'SET GLOBAL innodb_compression_algorithm=lz4;'
root@e1cbb08df912:/etc/mysql# mariadb -e 'SET GLOBAL innodb_compression_algorithm=lzo;'
ERROR 1231 (42000) at line 1: Variable 'innodb_compression_algorithm' can't be set to the value of 'lzo'
root@e1cbb08df912:/etc/mysql# mariadb -e 'SET GLOBAL innodb_compression_algorithm=lzma;'
ERROR 1231 (42000) at line 1: Variable 'innodb_compression_algorithm' can't be set to the value of 'lzma'
root@e1cbb08df912:/etc/mysql# mariadb -e 'SET GLOBAL innodb_compression_algorithm=bzip2;'
ERROR 1231 (42000) at line 1: Variable 'innodb_compression_algorithm' can't be set to the value of 'bzip2'
root@e1cbb08df912:/etc/mysql# mariadb -e 'SET GLOBAL innodb_compression_algorithm=snappy;'
ERROR 1231 (42000) at line 1: Variable 'innodb_compression_algorithm' can't be set to the value of 'snappy'
 
 
root@e1cbb08df912:/etc/mysql# grep -E "(Compression)? supported:" /var/lib/mysql/#rocksdb/LOG
2020/06/06-08:50:16.909747 7f68390b5800 Compression algorithms supported:
2020/06/06-08:50:16.909748 7f68390b5800 	kZSTDNotFinalCompression supported: 0
2020/06/06-08:50:16.909749 7f68390b5800 	kZSTD supported: 0
2020/06/06-08:50:16.909750 7f68390b5800 	kXpressCompression supported: 0
2020/06/06-08:50:16.909751 7f68390b5800 	kLZ4HCCompression supported: 1
2020/06/06-08:50:16.909752 7f68390b5800 	kLZ4Compression supported: 1
2020/06/06-08:50:16.909753 7f68390b5800 	kBZip2Compression supported: 0
2020/06/06-08:50:16.909753 7f68390b5800 	kZlibCompression supported: 1
2020/06/06-08:50:16.909754 7f68390b5800 	kSnappyCompression supported: 1
2020/06/06-08:50:16.909756 7f68390b5800 Fast CRC32 supported: Supported on x86

Autopkgtest at https://salsa.debian.org/mariadb-team/mariadb-server/-/jobs/787208 also still failing.

Comment by Otto Kekäläinen [ 2020-06-06 ]

OK, I managed to solve the above-mentioned issues. PR available at https://github.com/MariaDB/server/pull/1582

Comment by Marko Mäkelä [ 2021-03-03 ]

I reiterate what I said on 2020-05-19: I do not think that we should introduce new file formats lightly. If we add a compression library to our distributed packages, there will be a significant additional cost for removing the code later. Users who enabled an algorithm would have to execute additional steps on an upgrade to a later version where we might want to remove that form of compression. And we would have to provide an upgrade tool for converting affected files.

To save us from such trouble, we should run some benchmarks beforehand and determine which library provides the best ratio between CPU usage and compression savings. I think that we will need two types of I/O bound benchmarks: MDEV-23399 style (large redo log, and the data does not completely fit in the buffer pool), and MDEV-23855 style (tiny redo log, frequent checkpoint flushing, while all data fits in the buffer pool). The former should involve both page reads and writes, and the latter should basically be write-only.

Later in 2020, I learned about thinly provisioned smart SSDs that would compress data on the fly. They present themselves as larger-than-real capacity. I think that with such storage, and with a configuration option that disables the hole-punching in InnoDB, the page_compressed tables could become a viable option. In that case, the files would be completely regular (not sparse) on the file system level.
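The sparse-versus-regular distinction mentioned above can be demonstrated with a small sketch (plain Python, nothing MariaDB-specific): a file containing a hole reports a large apparent size but far fewer allocated blocks, which is the kind of file that InnoDB's hole punching produces and that a hypothetical disable-hole-punching option would avoid:

```python
import os
import tempfile

# Create a 1 MiB file that is mostly a hole: seek past the hole and
# write a single byte at the end. Most Linux filesystems will allocate
# only the final block.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.seek(1024 * 1024 - 1)
    f.write(b"\0")
    path = f.name

st = os.stat(path)
# st_size is the apparent size; st_blocks counts 512-byte units that
# are actually allocated on disk.
print("apparent size:", st.st_size)
print("allocated    :", st.st_blocks * 512)
os.unlink(path)
```

On a filesystem that supports sparse files, the allocated size is a few KiB while the apparent size is 1 MiB; a completely regular (non-sparse) file would show the two values roughly equal.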

Comment by Marko Mäkelä [ 2021-03-15 ]

Perhaps we should limit our offering to ZLIB and ZSTD. ZSTD is currently being used by RocksDB, but there is no InnoDB interface for it yet.

Comment by Marko Mäkelä [ 2021-03-24 ]

I noticed that MDEV-22310 has been filed for implementing ZSTD support, presumably in InnoDB. I think that we would definitely need a prototype of that before proceeding with benchmarks.

In my opinion, ideally we should not support more than zlib and ZSTD. Only if benchmarks indicate that some other implementation offers a significantly better compression ratio at comparable CPU overhead should we enable it.

Comment by Rob Schwyzer [ 2021-03-29 ]

In my opinion, ideally we should not support more than zlib and ZSTD. Only if benchmarks indicate that some other implementation offers a significantly better compression ratio at comparable CPU overhead should we enable it.

There is a strong argument for LZ4 as a faster algorithm:
https://www.percona.com/blog/2016/04/13/evaluating-database-compression-methods-update/

Ideally, that would provide ZSTD and zlib for users prioritizing compression ratio, while LZ4 provides an option for getting roughly 50% of that ratio with a much smaller hit on performance. The real key here is LZ4's massive advantage in decompression performance: for customers whose writes happen up front while reads dominate the overall workload, the performance hit can be minimal while still reducing disk space usage by about 3x. Its compression performance is also much better than that of ZSTD and zlib, so for the other use case, customers who enable compression to buy time while they perform a major SAN upgrade or similar, LZ4 is more viable on the whole as well.
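Since lz4 is not in the Python standard library, a stand-in sketch using zlib's fastest versus strongest compression level can illustrate the same speed-versus-ratio tradeoff described above (synthetic data; the timings are indicative only and say nothing about LZ4 itself):

```python
import time
import zlib

# ~100 KiB of compressible synthetic data (illustration only).
data = b"customer,invoice,SHIPPED;" * 4000

for level in (1, 9):
    t0 = time.perf_counter()
    out = zlib.compress(data, level)
    dt = time.perf_counter() - t0
    # Lower levels trade compression ratio for speed, the same axis on
    # which LZ4 and ZSTD/zlib differ.
    print(f"level {level}: {len(out)} bytes in {dt * 1e3:.2f} ms")
```

A real evaluation would of course use the actual LZ4, ZSTD, and zlib libraries on representative InnoDB page data, as the benchmark requests elsewhere in this thread.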

Comment by Marko Mäkelä [ 2021-07-22 ]

I see that wlad expressed some skepticism towards ZSTD in MDEV-22310 (which was actually filed for the client/server communication protocol). It is true that we can enable support for LZ4 with trivial effort in our distributed executables, because the support is already present in the source code. We might implement and enable support for ZSTD in InnoDB later, if it turns out to be significantly better than other alternatives.

Comment by Marko Mäkelä [ 2023-12-07 ]

This was sort-of addressed in MDEV-12933.

Generated at Thu Feb 08 07:47:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.