Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25529

Auto-create: Pre-existing historical data is not partitioned as specified by ALTER

Details

    Description

      I have a system-versioned table with some historical data, e.g.

      MariaDB [test]> show create table t \G
      *************************** 1. row ***************************
             Table: t
      Create Table: CREATE TABLE `t` (
        `pk` int(11) NOT NULL,
        `a` int(11) DEFAULT NULL,
        PRIMARY KEY (`pk`)
      ) ENGINE=MyISAM DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING
      1 row in set (0.000 sec)
       
      MariaDB [test]> select year(row_end) y, count(*) from t for system_time all group by y;
      +------+----------+
      | y    | count(*) |
      +------+----------+
      | 2019 |        8 |
      | 2020 |       12 |
      | 2021 |        4 |
      | 2038 |        2 |
      +------+----------+
      4 rows in set (0.001 sec)
      

      Now I want to make it auto-partitioned, so I run ALTER TABLE:

      MariaDB [test]> alter table t partition by system_time interval 1 year auto;
      Query OK, 26 rows affected (0.017 sec)             
      Records: 26  Duplicates: 0  Warnings: 0
      

      My expectation is that it will auto-create the partition structure suitable for the already existing historical data – that is, 2 or 3 historical partitions starting from the lowest row_end timestamp with 1 year interval. It doesn't happen; instead, the table gets 1 historical partition starting from the current time, and all historical data is put in there:

      bb-10.6-midenok-MDEV-17554 b96b96f9f

      MariaDB [test]> show create table t \G
      *************************** 1. row ***************************
             Table: t
      Create Table: CREATE TABLE `t` (
        `pk` int(11) NOT NULL,
        `a` int(11) DEFAULT NULL,
        PRIMARY KEY (`pk`)
      ) ENGINE=MyISAM DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING
       PARTITION BY SYSTEM_TIME INTERVAL 1 YEAR STARTS TIMESTAMP'2021-04-26 00:00:00' AUTO
      PARTITIONS 2
      1 row in set (0.000 sec)
       
      MariaDB [test]> select count(*) from t partition (p0);
      +----------+
      | count(*) |
      +----------+
      |       24 |
      +----------+
      1 row in set (0.000 sec)
      

      It is even semantically wrong, 2019- and 2020-year data shouldn't be in a partition with start time in 2021.

      Admittedly non-auto partitioning would do it too, but it is somewhat natural to expect more intelligent action from "auto".

      With LIMIT instead of INTERVAL, sufficient partitions aren't created either, but at least with LIMIT there is no "semantically wrong" aspect, the partition simply gets overflowed, which can happen anyway.

      If the behavior is intentional, please document it (and in general, please make sure that the documentation says what happens upon ALTER TABLE .. PARTITION, as currently MDEV-17554 description only mentions CREATE TABLE and DML).

      Attachments

        Issue Links

          Activity

            I don't think it's even an auto-create problem. partitioning by time assumes that every row belongs in the corresponding partition. That is, auto-create or not, the table is, basically, corrupted. And alter table ... partition by... should not create corrupted tables.

            Without AUTO such an ALTER TABLE should, probably, just fail. Or, better, automatically define start time as the oldest, and only fail if there's explicit newer STARTS or an insufficient number of historical partitions.

            And with AUTO it should also try to auto-determine the STARTS time and auto-determine the number of partitions and only fail if newer STARTS is specified explicitly or if it'd had to auto-create too many partitions.

            serg Sergei Golubchik added a comment - I don't think it's even an auto-create problem. partitioning by time assumes that every row belongs in the corresponding partition. That is, auto-create or not, the table is, basically, corrupted. And alter table ... partition by... should not create corrupted tables. Without AUTO such an ALTER TABLE should, probably, just fail. Or, better, automatically define start time as the oldest, and only fail if there's explicit newer STARTS or an insufficient number of historical partitions. And with AUTO it should also try to auto-determine the STARTS time and auto-determine the number of partitions and only fail if newer STARTS is specified explicitly or if it'd had to auto-create too many partitions.

            serg SYSTEM_TIME partitioning like RANGE partitioning works by "less than" condition, so first history partition can include timeframe of any length back to the past. I assume this is not failure situation but the warning should be printed if the first partition accumulates more than INTERVAL timeframe.

            If STARTS was not specified it can be auto-detected. If there is no index on row_end the simplest algorithm is to scan the table to detect MIN(row_end). But since we roundup interval endpoints we may avoid extra table scan and fill the partitions in arbitrary manner. Then rename partitions to correct order.

            midenok Aleksey Midenkov added a comment - serg SYSTEM_TIME partitioning like RANGE partitioning works by "less than" condition, so first history partition can include timeframe of any length back to the past. I assume this is not failure situation but the warning should be printed if the first partition accumulates more than INTERVAL timeframe. If STARTS was not specified it can be auto-detected. If there is no index on row_end the simplest algorithm is to scan the table to detect MIN(row_end). But since we roundup interval endpoints we may avoid extra table scan and fill the partitions in arbitrary manner. Then rename partitions to correct order.

            SYSTEM_TIME partitioning with INTERVAL works by having the first partition storing the history from start_date to start_date + interval, the second partition from start_date + interval to start_date + 2*interval, etc. So the first partition cannot have "anything earlier than", it'll break the partition splitting logic.

            serg Sergei Golubchik added a comment - SYSTEM_TIME partitioning with INTERVAL works by having the first partition storing the history from start_date to start_date + interval, the second partition from start_date + interval to start_date + 2*interval, etc. So the first partition cannot have "anything earlier than", it'll break the partition splitting logic.

            serg Can you show an example where "partition splitting logic" is broken?

            midenok Aleksey Midenkov added a comment - serg Can you show an example where "partition splitting logic" is broken?

            What I meant was that for INTERVAL partitions, the i'th partition is supposed to have history in the range [start_date + (i-1) * interval ; start_date + i * interval]. And partition pruning uses it.

            But as far as I can see, partition pruning only checks the upper boundary of every partition range, so if the first partition will have rows from before start_time it will not break partition pruning. At least for now.

            Okay. then it's safe, you're right.

            serg Sergei Golubchik added a comment - What I meant was that for INTERVAL partitions, the i 'th partition is supposed to have history in the range [start_date + (i-1) * interval ; start_date + i * interval] . And partition pruning uses it. But as far as I can see, partition pruning only checks the upper boundary of every partition range, so if the first partition will have rows from before start_time it will not break partition pruning. At least for now. Okay. then it's safe, you're right.

            People

              midenok Aleksey Midenkov
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.