[MDEV-11068] Review which innodb_compression_algorithm to support in binary packages
Created: 2016-10-17  Updated: 2023-12-12  Resolved: 2023-12-12

| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Packaging, Storage Engine - InnoDB |
| Fix Version/s: | N/A |
| Type: | New Feature | Priority: | Major |
| Reporter: | Valerii Kravchuk | Assignee: | Sergei Golubchik |
| Resolution: | Fixed | Votes: | 4 |
| Labels: | Compatibility, compression, packaging, performance |
| Description |
|
It seems the RPMs for 10.1.x are built with only LZMA out of all the possible compression algorithms. Users are advised to build from source when other algorithms are needed (see https://mariadb.com/kb/en/mariadb/compression/). On some systems that we provide packages for (like RHEL), some of these algorithms (like LZ4) are available as packages. I think it makes sense (maybe only for Enterprise binaries?) to build with them enabled and then add dependencies on the related packages. |
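A rough sketch of what the suggested build would look like on a Debian-based system. The `WITH_INNODB_*` flag names and the `-dev` package names are assumptions, not confirmed by this ticket; on many branches the build simply auto-detects whichever compression development libraries are installed, so verify against the CMake cache of your source tree.

```shell
# Assumed package names (Debian/Ubuntu) and CMake flag names; older branches
# auto-detect installed -dev libraries instead of using explicit flags.
sudo apt-get install -y liblz4-dev liblzo2-dev libsnappy-dev libbz2-dev liblzma-dev

# Configure from the MariaDB source tree with the extra algorithms enabled.
cmake . -DWITH_INNODB_LZ4=ON -DWITH_INNODB_LZO=ON \
        -DWITH_INNODB_SNAPPY=ON -DWITH_INNODB_BZIP2=ON
make -j"$(nproc)"
```

The packaging half of the proposal would then add `Requires:`/`Depends:` entries for the runtime libraries (e.g. `liblz4-1`) to the server package.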
| Comments |
| Comment by Arnaud Adant [ 2016-10-21 ] |
|
Please note that zlib is also available by default, but it would be nice to have the other ones to compare. |
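For context, switching the algorithm and creating a page-compressed table uses documented MariaDB syntax; the table name below is made up for illustration, and the `SET` only succeeds if the binary was built with the chosen library:

```sql
-- Succeeds only if the server binary was built with lz4 support.
SET GLOBAL innodb_compression_algorithm = lz4;

-- Newly written pages of this table are compressed with the active algorithm.
CREATE TABLE page_compression_demo (
  id INT PRIMARY KEY,
  payload BLOB
) ENGINE=InnoDB PAGE_COMPRESSED=1;
```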
| Comment by Sergei Golubchik [ 2017-05-27 ] |
|
We cannot add new dependencies to RPM packages after GA. |
| Comment by Marko Mäkelä [ 2019-03-22 ] |
|
I do not think that it makes sense to enable bzip2 at all. It has a very large memory footprint, and it is designed for compressing much larger input than the innodb_page_size blocks (default 16384 bytes). |
| Comment by Marko Mäkelä [ 2020-04-21 ] |
|
I think that we will need some benchmarks to check not only which file systems work well with page_compressed tables, but also the efficiency of different compression algorithms (in terms of CPU usage and saved storage space). The benchmark effort may have to wait for MDEV-11916 and |
| Comment by Marko Mäkelä [ 2020-04-23 ] |
|
I think that of all the innodb_compression_algorithm values that are implemented in the source code, bzip2 is the worst match for InnoDB. The reason is that the ‘input files’ are individual InnoDB pages, with at most innodb_page_size bytes of payload to compress: 4KiB, 8KiB, 16KiB, 32KiB, or 64KiB. But the bzip2 input block size is 1 MiB. The encoder of the compressed data stream could waste some code space for representing longer lengths that would never occur in our use case. Also, the memory usage of bzip2 can be huge: even the bzip2 --small option consumes up to 2.5 bytes of memory per input byte.

Maybe we could at least agree to remove bzip2 support in some version? 10.6? If zlib soon supports SIMD-based optimizations for modern CPUs, maybe the practical advantage of other implementations of the Lempel-Ziv 1977 and 1978 algorithms will be reduced?

Note: if we ever enable some compression algorithm in our distributed binary packages, then I am afraid that it will be very hard to remove that algorithm later, because users would complain that their data is inaccessible after an upgrade. This could be addressed by creating an external tool that would convert data files, but then that tool would have to depend on all those compression libraries ‘forever’. |
| Comment by Otto Kekäläinen [ 2020-04-26 ] |
|
I did not spot any compression changes yet in recent 10.5 commits, but the autopkgtest at https://salsa.debian.org/mariadb-team/mariadb-server/-/jobs/692724 started failing with `ERROR 1231 (42000) at line 2: Variable 'innodb_compression_algorithm' can't be set to the value of 'snappy'`. Did you decide to remove snappy already, or shall I investigate this as a regression? |
| Comment by Otto Kekäläinen [ 2020-05-14 ] |
|
Just a reminder that the regression on the 10.5 branch still exists ^. The value 'snappy' is no longer accepted for the variable `innodb_compression_algorithm`. It is rather annoying in my otherwise pretty CI pipeline at https://salsa.debian.org/mariadb-team/mariadb-server/pipelines/136485 |
| Comment by Otto Kekäläinen [ 2020-05-19 ] |
|
Regarding the two comments above, I figured out the snappy and RocksDB failures in debian/test/smoke and will soon submit a PR about them. Note that in addition to the options listed at https://mariadb.com/kb/en/innodb-page-compression/#configuring-the-innodb-page-compression-algorithm there is also zstd, as it is used by RocksDB. For the binaries published officially in Debian/Ubuntu, we have only zlib in Ubuntu Bionic (MariaDB 10.1), and zlib, lz4 and snappy in Ubuntu Focal (MariaDB 10.3). Snappy was enabled in https://salsa.debian.org/mariadb-team/mariadb-10.3/-/commit/278531a7dfa7d60a60b067d089860c92a4e1221b - was this an OK decision? If somebody wants to test this, this can be quickly copy-pasted: |
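The original copy-paste snippet was not preserved in this export. A minimal equivalent for probing which algorithms a given binary supports might look like the following; each `SET` succeeds only if the binary was built with that library, and otherwise fails with `ERROR 1231 (42000)`, as in the autopkgtest above:

```sql
-- Probe each documented innodb_compression_algorithm value in turn;
-- unsupported ones are rejected with ERROR 1231 (42000).
SET GLOBAL innodb_compression_algorithm = none;
SET GLOBAL innodb_compression_algorithm = zlib;
SET GLOBAL innodb_compression_algorithm = lz4;
SET GLOBAL innodb_compression_algorithm = lzo;
SET GLOBAL innodb_compression_algorithm = lzma;
SET GLOBAL innodb_compression_algorithm = bzip2;
SET GLOBAL innodb_compression_algorithm = snappy;
SET GLOBAL innodb_compression_algorithm = zlib;  -- restore a safe default
```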
| Comment by Otto Kekäläinen [ 2020-05-19 ] |
|
For the record, the RocksDB compression status on the current 10.5 branch: |
| Comment by Marko Mäkelä [ 2020-05-19 ] |
|
otto, thank you for investigating this. I think that we should try to remove the ‘useless’ algorithms (as determined by benchmarking) and extend innochecksum (or create a separate tool) to allow re-encoding data files from any previously supported page_compressed algorithm into the supported ones. |
| Comment by Otto Kekäläinen [ 2020-06-06 ] |
|
The related commit https://github.com/MariaDB/server/commit/6af37ba881fee7e6f651d5e0730c9374337ad1b4 by serg did not seem to have had an effect on this. Outputs from the latest 10.5 at the time of writing:
The autopkgtest at https://salsa.debian.org/mariadb-team/mariadb-server/-/jobs/787208 is also still failing. |
| Comment by Otto Kekäläinen [ 2020-06-06 ] |
|
OK, I managed to solve the above-mentioned issues. A PR is available at https://github.com/MariaDB/server/pull/1582 |
| Comment by Marko Mäkelä [ 2021-03-03 ] |
|
I reiterate what I said on 2020-05-19: I do not think that we should introduce new file formats lightly. If we add a compression library to our distributed packages, there will be a significant additional cost for removing the code later. Users who enabled an algorithm would have to execute additional steps on an upgrade to a later version where we might want to remove that form of compression. And we would have to provide an upgrade tool for converting the affected files.

To save us from such trouble, we should run some benchmarks beforehand and determine which library provides the best ratio between CPU usage and compression savings. I think that we will need two types of I/O-bound benchmarks:

Later in 2020, I learned about thinly provisioned smart SSDs that compress data on the fly. They present themselves as having larger-than-real capacity. I think that with such storage, and with a configuration option that disables the hole-punching in InnoDB, page_compressed tables could become a viable option. In that case, the files would be completely regular (not sparse) at the file system level. |
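The sparse-vs-regular distinction that hole-punching creates can be demonstrated without MariaDB at all: on a sparse file, the apparent size and the storage actually allocated diverge. A minimal sketch using GNU `stat` on Linux:

```shell
# page_compressed tablespaces are sparse: hole-punching frees blocks, so the
# apparent file size and the allocated size diverge. Shown here with a plain
# sparse file (no MariaDB needed).
f=$(mktemp)
truncate -s 1M "$f"                        # apparent size 1 MiB, nothing written
apparent=$(stat -c %s "$f")                # bytes, as ls -l would report
allocated=$(( $(stat -c %b "$f") * 512 ))  # bytes actually backed by storage
echo "apparent=$apparent allocated=$allocated"
rm -f "$f"
```

On typical file systems the allocated size here is 0; `du` versus `ls -l` on a real `.ibd` file shows the same effect for page_compressed tables.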
| Comment by Marko Mäkelä [ 2021-03-15 ] |
|
Perhaps we should limit our offering to ZLIB and ZSTD. ZSTD is currently being used by RocksDB, but there is no InnoDB interface for it yet. |
| Comment by Marko Mäkelä [ 2021-03-24 ] |
|
I noticed that MDEV-22310 has been filed for implementing ZSTD support, presumably in InnoDB. I think that we would definitely need a prototype of that before proceeding with benchmarks. In my opinion, ideally we should not support more than zlib and ZSTD. Only if benchmarks indicate that some other implementation offers a significantly better compression ratio at a comparable CPU overhead could we enable it. |
| Comment by Rob Schwyzer [ 2021-03-29 ] |
There is a strong argument for LZ4 as a faster algorithm. Ideally we would provide ZSTD and zlib for users prioritizing compression ratio, while LZ4 provides an option for getting ~50% of that ratio with a much smaller hit on performance. The real key for this is LZ4's massive advantage in decompression performance: for customers whose writes happen mostly up front while reads dominate the overall workload, this can be a very minimal performance hit while still reducing disk space usage by about 3x. LZ4's compression performance is also much better than that of ZSTD and zlib, so for the other use case - customers who enable compression to buy time while they perform a major SAN upgrade or similar - LZ4 is more viable on the whole as well. |
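The ratio/speed trade-off argued above can be sanity-checked from the command line. A rough sketch, using the `gzip`, `xz`, `lz4` and `zstd` CLI tools (where installed) as stand-ins for the library-level codecs InnoDB would use; the trivially compressible sample below is only a smoke test, and numbers on real 16 KiB pages will differ:

```shell
# Compress one 16 KiB "page" of trivially compressible data with each
# available tool and report the resulting size. Skips tools not installed.
head -c 16384 /dev/zero | tr '\0' 'a' > page.raw
for tool in "gzip -6" "xz -6" "lz4 -1" "zstd -3"; do
  cmd=${tool%% *}
  if command -v "$cmd" >/dev/null; then
    size=$($tool -c page.raw 2>/dev/null | wc -c)
    echo "$cmd: $size bytes (from 16384)"
  fi
done
rm -f page.raw
```

A real benchmark would compress actual InnoDB page images and also time decompression, which is where LZ4's advantage is claimed to matter most.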
| Comment by Marko Mäkelä [ 2021-07-22 ] |
|
I see that wlad expressed some skepticism towards ZSTD in MDEV-22310 (which was actually filed for the client/server communication protocol). It is true that we can enable support for LZ4 in our distributed executables with trivial effort, because the support is already present in the source code. We might implement and enable support for ZSTD in InnoDB later, if it turns out to be significantly better than the other alternatives. |
| Comment by Marko Mäkelä [ 2023-12-07 ] |
|
This was sort-of addressed in |