[MDEV-25529] Auto-create: Pre-existing historical data is not partitioned as specified by ALTER Created: 2021-04-26 Updated: 2023-11-28 |
|
| Status: | Stalled |
| Project: | MariaDB Server |
| Component/s: | Partitioning, Versioned Tables |
| Affects Version/s: | 10.9 |
| Fix Version/s: | 10.11 |
| Type: | Bug | Priority: | Major |
| Reporter: | Elena Stepanova | Assignee: | Aleksey Midenkov |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||||||||||||||||||
| Description |
|
I have a system-versioned table with some historical data, e.g.
Now I want to make it auto-partitioned, so I run ALTER TABLE:
My expectation is that it will auto-create the partition structure suitable for the already existing historical data – that is, 2 or 3 historical partitions starting from the lowest row_end timestamp with 1 year interval. It doesn't happen; instead, the table gets 1 historical partition starting from the current time, and all historical data is put in there:
It is even semantically wrong, 2019- and 2020-year data shouldn't be in a partition with start time in 2021. Admittedly non-auto partitioning would do it too, but it is somewhat natural to expect more intelligent action from "auto". With LIMIT instead of INTERVAL, sufficient partitions aren't created either, but at least with LIMIT there is no "semantically wrong" aspect, the partition simply gets overflowed, which can happen anyway. If the behavior is intentional, please document it (and in general, please make sure that the documentation says what happens upon ALTER TABLE .. PARTITION, as currently |
| Comments |
| Comment by Sergei Golubchik [ 2021-04-28 ] |
|
I don't think it's even an auto-create problem. partitioning by time assumes that every row belongs in the corresponding partition. That is, auto-create or not, the table is, basically, corrupted. And alter table ... partition by... should not create corrupted tables. Without AUTO such an ALTER TABLE should, probably, just fail. Or, better, automatically define start time as the oldest, and only fail if there's explicit newer STARTS or an insufficient number of historical partitions. And with AUTO it should also try to auto-determine the STARTS time and auto-determine the number of partitions and only fail if newer STARTS is specified explicitly or if it'd had to auto-create too many partitions. |
| Comment by Aleksey Midenkov [ 2021-12-27 ] |
|
serg SYSTEM_TIME partitioning like RANGE partitioning works by "less than" condition, so first history partition can include timeframe of any length back to the past. I assume this is not failure situation but the warning should be printed if the first partition accumulates more than INTERVAL timeframe. If STARTS was not specified it can be auto-detected. If there is no index on row_end the simplest algorithm is to scan the table to detect MIN(row_end). But since we roundup interval endpoints we may avoid extra table scan and fill the partitions in arbitrary manner. Then rename partitions to correct order. |
| Comment by Sergei Golubchik [ 2022-01-03 ] |
|
SYSTEM_TIME partitioning with INTERVAL works by having the first partition storing the history from start_date to start_date + interval, the second partition from start_date + interval to start_date + 2*interval, etc. So the first partition cannot have "anything earlier than", it'll break the partition splitting logic. |
| Comment by Aleksey Midenkov [ 2022-01-10 ] |
|
serg Can you show an example where "partition splitting logic" is broken? |
| Comment by Sergei Golubchik [ 2022-01-11 ] |
|
What I meant was that for INTERVAL partitions, the i'th partition is supposed to have history in the range [start_date + (i-1) * interval ; start_date + i * interval]. And partition pruning uses it. But as far as I can see, partition pruning only checks the upper boundary of every partition range, so if the first partition will have rows from before start_time it will not break partition pruning. At least for now. Okay. then it's safe, you're right. |