Details
- Bug
- Status: Closed
- Critical
- Resolution: Fixed
- 12.3
- Unexpected results
- Q1/2026 Server Maintenance
Description
The old hash algorithm had a mistake in its design for numeric columns.
Numeric columns inherited hash_not_null() from the top-level Field class:
void Field::hash_not_null(Hasher *hasher)
{
  DBUG_ASSERT(marked_for_read());
  DBUG_ASSERT(!is_null());
  hasher->add(sort_charset(), ptr, pack_length());
}
sort_charset() returns my_charset_latin1 for numeric columns.
So for example, Field_long re-interprets its buffer containing a binary number as a character string with:
- length 4
- collation latin1_swedish_ci
As a result, the hash is calculated doing the following transformation:
- All bytes 0x20 are truncated from the "string"
- Lower case "characters" are converted to their upper case counterparts
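The effect of these two transformations on a numeric buffer can be illustrated with a small stand-alone sketch. It is not the server code: fold_like_latin1_ci(), key_for_int() and binary_key_for_int() are hypothetical helpers, only ASCII a-z is folded (the real latin1_swedish_ci sort order covers more bytes), and the 0x20 handling is assumed to apply to trailing bytes only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Simplified stand-in for a case-insensitive, space-padded collation key.
// Assumption: only trailing 0x20 bytes are dropped and only ASCII a-z is folded.
static std::string fold_like_latin1_ci(const unsigned char *ptr, size_t len)
{
  while (len > 0 && ptr[len - 1] == 0x20)
    len--;
  std::string key(reinterpret_cast<const char *>(ptr), len);
  for (char &c : key)
    if (c >= 'a' && c <= 'z')
      c = static_cast<char>(c - 'a' + 'A');
  return key;
}

// Key built from the 4-byte in-memory image of an integer,
// treated as if it were a character string.
static std::string key_for_int(int32_t value)
{
  unsigned char buf[sizeof(value)];
  std::memcpy(buf, &value, sizeof(value));
  return fold_like_latin1_ci(buf, sizeof(buf));
}

// Key built from the raw bytes, as a binary collation would see them.
static std::string binary_key_for_int(int32_t value)
{
  unsigned char buf[sizeof(value)];
  std::memcpy(buf, &value, sizeof(value));
  return std::string(reinterpret_cast<const char *>(buf), sizeof(buf));
}
```

In this model key_for_int(0x41) and key_for_int(0x61) are equal, because the byte 0x61 ('a') is folded to 0x41 ('A'), while binary_key_for_int() keeps the two values distinct. Any hash computed over the folded key therefore cannot tell these two integers apart.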
The new hash algorithms (MDEV-9826) inherit this wrong behavior. This script with CRC32C demonstrates the problem:
CREATE OR REPLACE TABLE t1 (c1 INT NOT NULL) PARTITION BY KEY ALGORITHM=CRC32C (c1) PARTITIONS 16;
INSERT INTO t1 VALUES (0x41),(0x61);
SELECT table_name, partition_name, table_rows
FROM information_schema.partitions
WHERE table_name='t1' AND table_rows>0;
+------------+----------------+------------+
| table_name | partition_name | table_rows |
+------------+----------------+------------+
| t1         | p8             |          2 |
+------------+----------------+------------+
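Notice that both records were written to the same partition: the byte 0x61 ('a') is converted to 0x41 ('A') before hashing, so the two values produce the same hash.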
This behavior should be fixed for numeric data types for new algorithms.
It seems we'll have to add a new virtual function to my_hasher_st for hashing numbers.
In the old algorithm (my_hasher_mysql5x) the numeric hashing function should still use my_charset_latin1 for backward compatibility.
In the new algorithms the numeric hashing functions should use my_charset_bin.
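A minimal sketch of what such a split could look like is given below. It is only an illustration of the idea, not the actual patch: HasherSketch, LegacyHasherSketch, BinaryHasherSketch, add_numeric() and hash_int() are hypothetical names, FNV-1a stands in for the real hash functions, and the legacy path uses a simplified fold (ASCII a-z only, trailing 0x20 dropped) instead of the real my_charset_latin1 sort order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical hasher interface with a dedicated entry point for numbers.
struct HasherSketch
{
  uint64_t state = 14695981039346656037ULL;  // FNV-1a offset basis (placeholder hash)
  virtual ~HasherSketch() = default;
  void add_byte(unsigned char b) { state = (state ^ b) * 1099511628211ULL; }
  // Each algorithm decides how the bytes of a number are fed to the hash.
  virtual void add_numeric(const unsigned char *ptr, size_t len) = 0;
};

// Old algorithm: keeps the case-insensitive, space-padded treatment
// so that existing partitioned tables still map rows to the same partitions.
struct LegacyHasherSketch : HasherSketch
{
  void add_numeric(const unsigned char *ptr, size_t len) override
  {
    while (len > 0 && ptr[len - 1] == 0x20)
      len--;
    for (size_t i = 0; i < len; i++)
    {
      unsigned char b = ptr[i];
      if (b >= 'a' && b <= 'z')
        b = static_cast<unsigned char>(b - 'a' + 'A');
      add_byte(b);
    }
  }
};

// New algorithms: hash the raw bytes, as a binary collation would.
struct BinaryHasherSketch : HasherSketch
{
  void add_numeric(const unsigned char *ptr, size_t len) override
  {
    for (size_t i = 0; i < len; i++)
      add_byte(ptr[i]);
  }
};

// Feeds the 4-byte in-memory image of an integer to the given hasher.
static uint64_t hash_int(HasherSketch &hasher, int32_t value)
{
  unsigned char buf[sizeof(value)];
  std::memcpy(buf, &value, sizeof(value));
  hasher.add_numeric(buf, sizeof(buf));
  return hasher.state;
}
```

With this shape, a numeric field would call add_numeric() instead of passing sort_charset() itself, and the choice between the compatible and the corrected behavior would live in the hasher. In the sketch, 0x41 and 0x61 collide under LegacyHasherSketch and get different hashes under BinaryHasherSketch.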
Issue Links
- blocks: MDEV-38394 MDEV-9826: ~2-3% Performance regression upon partion INSERTs when using the default partitioning hashing algorithm (Closed)
- is caused by: MDEV-9826 better hash algorithms for PARTITION BY KEY (In Testing)