[MDEV-9826] KEY partitioning results in uneven data distribution - bad hash algorithm Created: 2016-03-29  Updated: 2019-09-17

Status: Open
Project: MariaDB Server
Component/s: Partitioning
Fix Version/s: None

Type: Task Priority: Major
Reporter: Valerii Kravchuk Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None


 Description   

Test case from https://bugs.launchpad.net/percona-server/+bug/1348295 gives the same result (uneven data disctribution) on recent MariaDB 10.1.13:

[openxs@fc23 maria10.1]$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 10.1.13-MariaDB Source distribution
 
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [test]> drop table t1;
Query OK, 0 rows affected (0.18 sec)
 
MariaDB [test]> create table t1(c1 int, c2 date) partition by key(c2) partitions 8;
Query OK, 0 rows affected (1.92 sec)
 
MariaDB [test]> insert into t1 values (1, '2014-04-22');
Query OK, 1 row affected (0.04 sec)
 
MariaDB [test]> insert into t1 values (1, '2014-04-23');
Query OK, 1 row affected (0.04 sec)
 
MariaDB [test]> insert into t1 values (1, '2014-04-24');
Query OK, 1 row affected (0.04 sec)
 
MariaDB [test]> insert into t1 values (1, '2014-04-25');
Query OK, 1 row affected (0.04 sec)
 
MariaDB [test]> insert into t1 values (1, '2014-04-26');
Query OK, 1 row affected (0.15 sec)
 
MariaDB [test]> insert into t1 values (1, '2014-04-27');
Query OK, 1 row affected (0.05 sec)
 
MariaDB [test]> insert into t1 values (1, '2014-04-28');
Query OK, 1 row affected (0.04 sec)
 
MariaDB [test]> insert into t1 values (1, '2014-04-29');
Query OK, 1 row affected (0.12 sec)
 
MariaDB [test]> insert into t1 values (1, '2014-04-30');
Query OK, 1 row affected (0.07 sec)
 
MariaDB [test]> select partition_name, table_rows from information_schema.partitions where table_name='t1';
+----------------+------------+
| partition_name | table_rows |
+----------------+------------+
| p0             |          3 |
| p1             |          0 |
| p2             |          0 |
| p3             |          0 |
| p4             |          2 |
| p5             |          0 |
| p6             |          4 |
| p7             |          0 |
+----------------+------------+
8 rows in set (0.00 sec)
 
MariaDB [test]>

No wonder, as this is upstream bug known also as:

https://bugs.mysql.com/bug.php?id=72428
https://bugs.mysql.com/bug.php?id=65112

Percona set the bug to "Won't fix" and Oracle seems to postpone fixing as the fix requires changing of hash algorithm and may result in backward compatibility issues.

I wonder if mariaDB can implement new hashing algorithm for KEY partitioning (maybe under some new "name"/partition type) in 10.1 or 10.2, one that distributes data in a more even/expected way?



 Comments   
Comment by Sergei Golubchik [ 2017-05-29 ]

It's not too difficult to add a new hashing algorithm and store it in the frm. This way we can change hashing later as we want or even let the user do it.

Comment by Sergei Golubchik [ 2018-09-07 ]

Note, that we'll need some kind of SQL syntax for that, so that SHOW CREATE TABLE would show the hashing algorithm and that one would be able to dump/restore the table.

Generated at Thu Feb 08 07:37:37 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.