[MDEV-19935] Create unified CRC-32 interface Created: 2019-07-03 Updated: 2021-12-09 Resolved: 2020-09-17 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | mariabackup, Server, Storage Engine - InnoDB, Storage Engine - RocksDB |
| Fix Version/s: | 10.5.7 |
| Type: | Task | Priority: | Major |
| Reporter: | Marko Mäkelä | Assignee: | Vladislav Vaintroub |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | checksum, performance, portability | ||
| Issue Links: |
|
| Description |
|
The MariaDB code base contains quite a few implementations of CRC-32, some (or all?) of them using the CRC-32C polynomial. It would be good to create a uniform interface and remove any code duplication.
At the very least, we should have a common CRC-32C implementation on all platforms and remove ut0crc32.cc from the InnoDB code base. (It is probably not worth touching the code in the bundled zlib.) If other CRC-32 polynomials are needed, then we should define a common interface for those as well. |
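To make the idea of a single shared entry point concrete, here is a minimal, portable sketch of a table-driven CRC-32C routine. The namespace `crc_sketch` and the function names are hypothetical and chosen for illustration only; this is not the interface that was eventually pushed. The calling convention (pass 0 to start, or a previous result to continue) is assumed to mirror zlib's crc32().

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical names for illustration; not MariaDB's actual interface.
namespace crc_sketch {

// Bit-reflected form of the Castagnoli polynomial 0x1EDC6F41.
constexpr std::uint32_t kCrc32cPoly = 0x82F63B78u;

// Build the 256-entry lookup table used for byte-at-a-time processing.
inline std::array<std::uint32_t, 256> make_table() {
  std::array<std::uint32_t, 256> table{};
  for (std::uint32_t i = 0; i < 256; ++i) {
    std::uint32_t c = i;
    for (int bit = 0; bit < 8; ++bit)
      c = (c & 1u) ? (c >> 1) ^ kCrc32cPoly : (c >> 1);
    table[i] = c;
  }
  return table;
}

// Pass crc = 0 for a fresh checksum, or a previous return value
// to continue a running checksum over the next chunk of data.
inline std::uint32_t crc32c(std::uint32_t crc, const void* buf,
                            std::size_t len) {
  static const std::array<std::uint32_t, 256> table = make_table();
  const unsigned char* p = static_cast<const unsigned char*>(buf);
  crc = ~crc;  // undo the final inversion of the previous call
  while (len--)
    crc = table[(crc ^ *p++) & 0xFFu] ^ (crc >> 8);
  return ~crc;
}

}  // namespace crc_sketch
```

Callers such as a storage engine would only see `crc32c()`; a hardware-accelerated variant could later be swapped in behind the same signature without touching them.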
| Comments |
| Comment by Marko Mäkelä [ 2020-05-29 ] | |||||||||||||||||
|
A unified interface (with acceleration) for the zlib crc32() function will be introduced by After that, what remains to be done (in this task) is unifying the interface to CRC-32C (using the Castagnoli polynomial), which is used by MyRocks (RocksDB) and InnoDB. The RocksDB implementation is superior to InnoDB’s, because it can make use of the pclmul instruction, which apparently can outperform the SSE4.2 crc32 instructions that InnoDB is using. The run-time check for the pclmul instruction should use the | |||||||||||||||||
| Comment by Vladislav Vaintroub [ 2020-05-29 ] | |||||||||||||||||
|
MyRocks uses both the CRC32 instruction and pclmul, as described in https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/crc-iscsi-polynomial-crc32-instruction-paper.pdf | |||||||||||||||||
| Comment by Marko Mäkelä [ 2020-08-27 ] | |||||||||||||||||
|
The following works at least starting with GCC 4.8.2 and clang 4.0.0. Note: I am only using this for illustration purposes; I am not suggesting that we use any GCC-style built-in functions:
I think that using the target attribute is preferred to compiling the entire compilation unit with special flags. We do not want the compiler to accidentally use some SIMD instructions for unrelated parts of the code. This technique would also allow us to keep all code for different ISA dialects in the same compilation unit. Side note: defining
seems to be a dead end, because it is not supported on even the newest clang, and only supported starting with GCC 6. | |||||||||||||||||
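The per-function target attribute discussed in this comment can be sketched roughly as follows. This is not the snippet from the original comment; it is an illustration that assumes a reasonably recent GCC or clang on x86, where `<nmmintrin.h>` can be included without `-msse4.2`. The namespace, function names, and the `SKETCH_HAVE_SSE42` macro are hypothetical, and `__builtin_cpu_supports()` is used for the run-time check purely for brevity.

```cpp
#include <cstddef>
#include <cstdint>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <nmmintrin.h>  // _mm_crc32_u8
#define SKETCH_HAVE_SSE42 1  // hypothetical macro for this sketch
#endif

namespace target_attr_sketch {  // hypothetical namespace

// Portable bitwise fallback using the reflected Castagnoli polynomial.
static std::uint32_t crc32c_sw(std::uint32_t crc, const unsigned char* p,
                               std::size_t len) {
  crc = ~crc;
  while (len--) {
    crc ^= *p++;
    for (int bit = 0; bit < 8; ++bit)
      crc = (crc & 1u) ? (crc >> 1) ^ 0x82F63B78u : (crc >> 1);
  }
  return ~crc;
}

#ifdef SKETCH_HAVE_SSE42
// Only this function is compiled with SSE4.2 enabled; the rest of the
// compilation unit keeps the default instruction set.
__attribute__((target("sse4.2")))
static std::uint32_t crc32c_sse42(std::uint32_t crc, const unsigned char* p,
                                  std::size_t len) {
  crc = ~crc;
  while (len--)
    crc = _mm_crc32_u8(crc, *p++);
  return ~crc;
}
#endif

// Dispatcher: choose the implementation once, at run time.
inline std::uint32_t crc32c(std::uint32_t crc, const void* buf,
                            std::size_t len) {
  const unsigned char* p = static_cast<const unsigned char*>(buf);
#ifdef SKETCH_HAVE_SSE42
  static const bool have_sse42 = __builtin_cpu_supports("sse4.2");
  if (have_sse42)
    return crc32c_sse42(crc, p, len);
#endif
  return crc32c_sw(crc, p, len);
}

}  // namespace target_attr_sketch
```

Because both variants live in one file, the generic code path cannot accidentally pick up SSE4.2 instructions. A PCLMULQDQ-based variant would presumably follow the same pattern with its own target string.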
| Comment by Marko Mäkelä [ 2020-08-27 ] | |||||||||||||||||
|
Side note: The Galera library appears to include its own implementations:
| |||||||||||||||||
| Comment by Vladislav Vaintroub [ 2020-09-17 ] | |||||||||||||||||
|
After talking to Marko, decided to push this into 10.5, as the revised CRC32C implementation (CRC32+PCLMULQDQ) also gives a nice speedup for InnoDB checksum calculation on x64. The speedup amounts to something like 2x in my benchmarks with aligned 16K pages. |