[MDEV-16880] Provide checksum aggregate functions, and partition-level checksums Created: 2018-08-02  Updated: 2023-08-01

Status: Open
Project: MariaDB Server
Component/s: Server
Fix Version/s: None

Type: Task Priority: Major
Reporter: Eric Herman Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-16249 CHECKSUM TABLE for a spider table is ... Closed
relates to MDEV-16520 Out-Of-Memory running big aggregate q... Closed

 Description   

Today, we only have the option to checksum a table:

CHECKSUM TABLE tbl_name [, tbl_name] ... [ QUICK | EXTENDED ]

However, it would be useful to be able to specify a subset of data, especially a single partition.

Additionally, it should ideally be possible to compute a checksum of the data in an un-partitioned table, copy that data into a partitioned table, aggregate the checksum values of the individual partitions, and arrive at the same value as for the un-partitioned table.
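To illustrate the property requested here, below is a minimal Python sketch, not MariaDB code. It assumes a checksum built by summing per-row CRC32 values modulo 2**32. Because that sum is commutative and associative, per-partition checksums can be combined into the whole-table value regardless of how rows are split up. The helper names (row_checksum, table_checksum, combine), the row encoding, and the sample partitioning are all hypothetical and chosen only for illustration; they do not claim to match what CHECKSUM TABLE produces.

```python
import zlib

MODULUS = 2**32


def row_checksum(row):
    """CRC32 of one row. Fields are joined with a separator byte so that
    ("ab", "c") and ("a", "bc") encode differently (illustrative encoding)."""
    encoded = "\x1f".join("\\N" if v is None else str(v) for v in row)
    return zlib.crc32(encoded.encode("utf-8"))


def table_checksum(rows):
    """Sum of the row CRCs modulo 2**32. Addition is commutative, so the
    result does not depend on the order in which rows are read."""
    return sum(row_checksum(r) for r in rows) % MODULUS


def combine(partial_checksums):
    """Aggregate per-partition checksums into one value."""
    return sum(partial_checksums) % MODULUS


# Sample data standing in for an un-partitioned table.
rows = [(i, f"name-{i}", i * 1.5) for i in range(1000)]

# Hypothetical range partitioning on the first column.
partitions = {
    "p0": [r for r in rows if r[0] < 300],
    "p1": [r for r in rows if 300 <= r[0] < 700],
    "p2": [r for r in rows if r[0] >= 700],
}

whole = table_checksum(rows)
per_partition = {name: table_checksum(part) for name, part in partitions.items()}

# The aggregated partition checksums match the un-partitioned checksum.
assert combine(per_partition.values()) == whole
```

A checksum that instead fed all rows through one running hash (for example a single MD5 over the concatenated rows) would depend on row order and could not be combined this way, which is the design constraint this request implies.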

( Relates to: MDEV-16249 )



 Comments   
Comment by Eric Herman [ 2018-08-02 ]

I can imagine that a new syntax, something like extending the MD5 aggregate functions to take multiple columns, might be useful.

I can also imagine that adding syntax like

CHECKSUM TABLE tbl_name [PARTITION BY partition_options] [, tbl_name ]

might be useful for non-partitioned tables, as it would allow for parallel checksum calculation, even in the non-partitioned case.

Some partition-level CHECKSUM data already exists in information_schema, in the PARTITIONS table: https://mariadb.com/kb/en/library/information-schema-partitions-table/

Comment by Sergei Golubchik [ 2018-08-08 ]

There's already syntax for referring to a table partition, for example (copied from tests):

SELECT * FROM t2 PARTITION (foo);
DELETE t1, t2 FROM t1 PARTITION (pNeg), t3, t2 PARTITION (p1)
INSERT INTO t2 PARTITION (subp3) SELECT ...

So syntax-wise it would be quite logical to use that for CHECKSUM TABLE.

Generated at Thu Feb 08 08:32:14 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.