  1. MariaDB Server
  2. MDEV-11068

Review which innodb_compression_algorithm to support in binary packages

Details

    Description

      It seems RPMs for 10.1.x come with only LZMA of all possible compression algorithms. It is suggested to build from source when other algorithms are needed (see https://mariadb.com/kb/en/mariadb/compression/).

      On some systems that we provide packages for (like RHEL), some of these algorithms (like LZ4) are available as packages. I think it makes sense (maybe only for Enterprise binaries?) to build with them enabled and then add dependencies on the related packages.






      But we cannot add new dependencies to rpm packages after GA. And we should not introduce new file formats lightly. If we add a compression library to our distributed packages, there will be a significant additional cost for removing the code later. Users who enabled an algorithm would have to execute additional steps on an upgrade to a later version where we might want to remove that form of compression. And we would have to provide an upgrade tool for converting affected files. To save us from such trouble, we should run some benchmarks beforehand and determine which library provides the best ratio between CPU usage and compression savings.
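      The kind of benchmark suggested above could start from something like the following minimal sketch. It is only an illustration under stated assumptions: it uses the codecs that ship with the Python standard library (zlib, bzip2, LZMA) as stand-ins, it compresses a synthetic 16 KiB buffer rather than real InnoDB pages, and the helper names (`make_sample_page`, `benchmark`) are hypothetical. LZ4 and ZSTD would need third-party bindings and are deliberately left out.

```python
import bz2
import lzma
import time
import zlib

PAGE_SIZE = 16 * 1024  # assumption: mimic a 16 KiB page


def make_sample_page(size=PAGE_SIZE):
    """Build a synthetic, moderately repetitive buffer (a stand-in for real table data)."""
    row = b"id=%08d;name=customer_%04d;status=ACTIVE;balance=%010d\n"
    out = bytearray()
    i = 0
    while len(out) < size:
        out += row % (i, i % 97, i * 31)
        i += 1
    return bytes(out[:size])


def benchmark(codecs, page, rounds=50):
    """Return compression ratio and CPU seconds spent per codec."""
    results = {}
    for name, (compress, decompress) in codecs.items():
        start = time.process_time()
        for _ in range(rounds):
            packed = compress(page)
        compress_cpu = time.process_time() - start

        start = time.process_time()
        for _ in range(rounds):
            unpacked = decompress(packed)
        decompress_cpu = time.process_time() - start

        # Sanity check: the round trip must be lossless.
        assert unpacked == page
        results[name] = {
            "ratio": len(page) / len(packed),
            "compress_cpu_s": compress_cpu,
            "decompress_cpu_s": decompress_cpu,
        }
    return results


# Standard-library codecs only; default compression levels are used.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}

if __name__ == "__main__":
    for name, r in benchmark(CODECS, make_sample_page()).items():
        print(f"{name:6s} ratio={r['ratio']:.2f} "
              f"compress={r['compress_cpu_s']:.4f}s "
              f"decompress={r['decompress_cpu_s']:.4f}s")
```

      Numbers from a synthetic buffer like this say little about real workloads; a meaningful comparison would have to run against representative table data inside the server.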

          Activity

            marko Marko Mäkelä added a comment - Perhaps we should limit our offering to ZLIB and ZSTD. ZSTD is currently being used by RocksDB, but there is no InnoDB interface for it yet.

            marko Marko Mäkelä added a comment - I noticed that MDEV-22310 has been filed for implementing ZSTD support, presumably in InnoDB. I think that we would definitely need a prototype of that before proceeding with benchmarks. In my opinion, ideally we should not support more than zlib and ZSTD. We could enable another implementation only if benchmarks indicate that it offers a significantly better compression ratio at a comparable CPU overhead.
            rob.schwyzer@mariadb.com Rob Schwyzer added a comment - edited

            "In my opinion, ideally we should not support more than zlib and ZSTD. Only if benchmarks indicate that some other implementation offers significantly better compression ratio at a comparable CPU overhead, we could enable it."

            There is a strong argument for LZ4 as a faster algorithm:
            https://www.percona.com/blog/2016/04/13/evaluating-database-compression-methods-update/

            Ideally, we would provide ZSTD and zlib for users who prioritize compression ratio, while LZ4 would offer about 50% of that ratio with a much smaller performance hit. The key is LZ4's massive advantage in decompression performance: for customers whose writes happen mostly up front and whose workload is read-heavy overall, the performance hit can be minimal while disk space usage is still reduced by about 3x. LZ4's compression performance is also much better than that of ZSTD and zlib, so it is the more viable option for the other use case as well: customers who enable compression to buy time while they perform a major SAN upgrade or similar.

            marko Marko Mäkelä added a comment - I see that wlad expressed some skepticism towards ZSTD in MDEV-22310 (which was actually filed for the client/server communication protocol). It is true that we can enable support for LZ4 with trivial effort in our distributed executables, because the support is already present in the source code. We might implement and enable support for ZSTD in InnoDB later, if it turns out to be significantly better than other alternatives.

            marko Marko Mäkelä added a comment - This was sort-of addressed in MDEV-12933.

            People

              serg Sergei Golubchik
              valerii Valerii Kravchuk
