I use a bi-temp table (application time period and system time versioning) with an autoincrement id. If I update a record with "FOR PORTION OF" the part of id in primary key changes automatically. I think, this isn't correct. The id expresses the associated business fact.
Example and reproduction:
Starting Position:
Create the table:
CREATETABLE `t1` (
`id` bigint(20) unsigned NOTNULL AUTO_INCREMENT,
`price_for` varchar(100) DEFAULTNULL,
`price` integerDEFAULTNULL,
`date_1` dateNOTNULL,
`date_2` dateNOTNULL,
`row_start` timestamp(6) GENERATED ALWAYS AS ROW START,
`row_end` timestamp(6) GENERATED ALWAYS AS ROW END,
PRIMARYKEY (`id`,`date_2`,`row_start`,`row_end`),
PERIOD FOR SYSTEM_TIME (`row_start`, `row_end`),
PERIOD FOR `application_time` (`date_1`, `date_2`)
) WITH SYSTEM VERSIONING
Enter one record, result
Behaviour of using FOR PORTION OF:
Update query:
update t1 for portion of application_time
from'2021-04-01'to'9999-12-31'
set id=1, price = 6000
where id=1
Given result:
Line 1 is old versiones record as expected. Line 3 is new price record under id = 1 valid from 2021-04-01 as expected. But the old price which was valid between 2021-01-01 and 2021-04-01 has been stored automatically under id = 2 instead of id = 1 (it's still the price for procuct 1)
Think what would happen if this table had a related child table. The fk reference is broken.
I use a bi-temp table (application time period and system time versioning) with an autoincrement id. If I update a record with "FOR PORTION OF" the part of id in primary key changes automatically. I think, this isn't correct. The id expresses the associated business fact.
*Example and reproduction:*
Starting Position:
Create the table:
{code:SQL}
CREATE TABLE `t1` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`price_for` varchar(100) DEFAULT NULL,
`price` integer DEFAULT NULL,
`date_1` date NOT NULL,
`date_2` date NOT NULL,
`row_start` timestamp(6) GENERATED ALWAYS AS ROW START,
`row_end` timestamp(6) GENERATED ALWAYS AS ROW END,
PRIMARY KEY (`id`,`date_2`,`row_start`,`row_end`),
PERIOD FOR SYSTEM_TIME (`row_start`, `row_end`),
PERIOD FOR `application_time` (`date_1`, `date_2`)
) WITH SYSTEM VERSIONING
{code}
Enter one record, result
!Starting_point.png!
*Behaviour of using FOR PORTION OF:*
Update query:
{code:SQL}
update t1 for portion of application_time
from '2021-04-01' to '9999-12-31'
set id=1, price = 6000
where id=1
{code}
Given result:
!given_result.png!
Line 1 is old versiones record as expected. Line 3 is new price record under id = 1 valid from 2021-04-01 as expected. But the old price which was valid between 2021-01-01 and 2021-04-01 has been stored automatically under id = 2 instead of id = 1 (it's still the price for procuct 1)
Think what would happen if this table had a related child table. The fk reference is broken.
*Expected result:*
Query to produce it manually:
{code:SQL}
update t2
set id = 1, date_2 = '2021-04-01'
where id = 1;
Now, the same business fact has a correct application time line and the change is versioned correctly.
I use a bi-temp table (application time period and system time versioning) with an autoincrement id. If I update a record with "FOR PORTION OF" the part of id in primary key changes automatically. I think, this isn't correct. The id expresses the associated business fact.
*Example and reproduction:*
Starting Position:
Create the table:
{code:sql}
CREATE TABLE `t1` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`price_for` varchar(100) DEFAULT NULL,
`price` integer DEFAULT NULL,
`date_1` date NOT NULL,
`date_2` date NOT NULL,
`row_start` timestamp(6) GENERATED ALWAYS AS ROW START,
`row_end` timestamp(6) GENERATED ALWAYS AS ROW END,
PRIMARY KEY (`id`,`date_2`,`row_start`,`row_end`),
PERIOD FOR SYSTEM_TIME (`row_start`, `row_end`),
PERIOD FOR `application_time` (`date_1`, `date_2`)
) WITH SYSTEM VERSIONING
{code}
Enter one record, result
!Starting_point.png!
*Behaviour of using FOR PORTION OF:*
Update query:
{code:SQL}
update t1 for portion of application_time
from '2021-04-01' to '9999-12-31'
set id=1, price = 6000
where id=1
{code}
Given result:
!given_result.png!
Line 1 is old versiones record as expected. Line 3 is new price record under id = 1 valid from 2021-04-01 as expected. But the old price which was valid between 2021-01-01 and 2021-04-01 has been stored automatically under id = 2 instead of id = 1 (it's still the price for procuct 1)
Think what would happen if this table had a related child table. The fk reference is broken.
*Expected result:*
Query to produce it manually:
{code:SQL}
update t2
set id = 1, date_2 = '2021-04-01'
where id = 1;
I use a bi-temp table (application time period and system time versioning) with an autoincrement id. If I update a record with "FOR PORTION OF" the part of id in primary key changes automatically. I think, this isn't correct. The id expresses the associated business fact.
*Example and reproduction:*
Starting Position:
Create the table:
{code:sql}
CREATE TABLE `t1` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`price_for` varchar(100) DEFAULT NULL,
`price` integer DEFAULT NULL,
`date_1` date NOT NULL,
`date_2` date NOT NULL,
`row_start` timestamp(6) GENERATED ALWAYS AS ROW START,
`row_end` timestamp(6) GENERATED ALWAYS AS ROW END,
PRIMARY KEY (`id`,`date_2`,`row_start`,`row_end`),
PERIOD FOR SYSTEM_TIME (`row_start`, `row_end`),
PERIOD FOR `application_time` (`date_1`, `date_2`)
) WITH SYSTEM VERSIONING
{code}
Enter one record, result
!Starting_point.png!
*Behaviour of using FOR PORTION OF:*
Update query:
{code:SQL}
update t1 for portion of application_time
from '2021-04-01' to '9999-12-31'
set id=1, price = 6000
where id=1
{code}
Given result:
!given_result.png!
Line 1 is old versiones record as expected. Line 3 is new price record under id = 1 valid from 2021-04-01 as expected. But the old price which was valid between 2021-01-01 and 2021-04-01 has been stored automatically under id = 2 instead of id = 1 (it's still the price for procuct 1)
Think what would happen if this table had a related child table. The fk reference is broken.
*Expected result:*
Query to produce it manually:
{code:SQL}
update t2
set id = 1, date_2 = '2021-04-01'
where id = 1;
Now, the same business fact has a correct application time line and the change is versioned correctly.
I use a bi-temp table (application time period and system time versioning) with an autoincrement id. If I update a record with "FOR PORTION OF" the part of id in primary key changes automatically. I think, this isn't correct. The id expresses the associated business fact.
*Example and reproduction:*
Starting Position:
Create the table:
{code:sql}
CREATE TABLE `t1` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`price_for` varchar(100) DEFAULT NULL,
`price` integer DEFAULT NULL,
`date_1` date NOT NULL,
`date_2` date NOT NULL,
`row_start` timestamp(6) GENERATED ALWAYS AS ROW START,
`row_end` timestamp(6) GENERATED ALWAYS AS ROW END,
PRIMARY KEY (`id`,`date_2`,`row_start`,`row_end`),
PERIOD FOR SYSTEM_TIME (`row_start`, `row_end`),
PERIOD FOR `application_time` (`date_1`, `date_2`)
) WITH SYSTEM VERSIONING
{code}
Enter one record, result
!Starting_point.png!
*Behaviour of using FOR PORTION OF:*
Update query:
{code:sql}
update t1 for portion of application_time
from '2021-04-01' to '9999-12-31'
set id=1, price = 6000
where id=1
{code}
Given result:
!given_result.png!
Line 1 is old versiones record as expected. Line 3 is new price record under id = 1 valid from 2021-04-01 as expected. But the old price which was valid between 2021-01-01 and 2021-04-01 has been stored automatically under id = 2 instead of id = 1 (it's still the price for procuct 1)
Think what would happen if this table had a related child table. The fk reference is broken.
*Expected result:*
Query to produce it manually:
{code:sql}
update t2
set id = 1, date_2 = '2021-04-01'
where id = 1;
PRIMARYKEY (`id`, `application_time` WITHOUT OVERLAPS),
but this doesn't help, new auto-inc value is still generated.
Sergei Golubchik
added a comment - I'd expect the correct key definition should be
PRIMARY KEY (`id`, `application_time` WITHOUT OVERLAPS),
but this doesn't help, new auto-inc value is still generated.
serg:
Yes, I expected the same. I tried the same. But it's even worse. As you can see in my post I had to extend the key with "row_start". Normally MariaDB adds the "row_end" automatically without "row_start". But in this case I get a Primary Key violation error. Theoretically there shouldn't be one.
Enrica Ruedin
added a comment - - edited serg :
Yes, I expected the same. I tried the same. But it's even worse. As you can see in my post I had to extend the key with "row_start". Normally MariaDB adds the "row_end" automatically without "row_start". But in this case I get a Primary Key violation error. Theoretically there shouldn't be one.
If you've tried different variants of your table definition and they delivered unexpected results — please, mention everything that you tried here. Perhaps that all also can be fixed as a part of this bug, if the underlying code problem is the same.
Sergei Golubchik
added a comment - If you've tried different variants of your table definition and they delivered unexpected results — please, mention everything that you tried here. Perhaps that all also can be fixed as a part of this bug, if the underlying code problem is the same.
After more than 1.5 years I tried it again. Until now I worked without bi-temp tables that's why it was not so urgent. Last week I startet to implement bi-temp in a Laravel package. That's how I remembered my bug report.
I investigated a little bit more. So SYSTEM VERSION based tables are working fine and the autoincrement id is not incremented.
The problem is with APPLICATION PERIOD based tables or bi-temp only. I saw that your examples and those on the internet only work without autoincrement keys. (see KB MariaDB Application-Time Periods)
So I create a simple table with two dates defined as PERIOD FOR. see:
CREATEORREPLACETABLE `t1` (
`id` bigint(20) unsigned NOTNULL AUTO_INCREMENT,
`price_for` varchar(100) DEFAULTNULL,
`price` integerDEFAULTNULL,
`date_1` dateNOTNULLDEFAULT NOW(),
`date_2` dateNOTNULLDEFAULT'2038-01-19',
PRIMARYKEY (`id`, `date_1`),
PERIOD FOR `app_p` (`date_1`, `date_2`)
);
Remark: the column 'id' is AUTOINCREMENT and part of the PK.
In this step all is OK and the key is set automatically to the value 1.
Then I add a new record using PORTION OF like
update t1 FOR Portion of app_p
from '2023-06-11' TO '2038-01-19'
set price = 1200
where id=1;
A "normal" update would never change the id (PK). The result what I get is:
select * from t1
id|price_for|price|date_1 |date_2 |
--+---------+-----+----------+----------+
1|car | 1200|2023-06-11|2038-01-19|
2|car | 1000|2023-06-10|2023-06-11|
2 row(s) fetched.
It is clear that minimum one of these two must be in the PK. This is ensured. It seems that MariaDB is updated the "new" record correctly but the "old" one is inserted as a complete new record. In this case autoincrement must be avoided. Autoincrementing the "id" must be allowed for INSERT not for UPDATE. Strangely enough this works with System Versioned columns.
Unique WITHOUT OVERLAPS has no influence on this behaviour.
Conclusion: If a developer relies on UUID keys or self-generated increment keys, then Application-Time Period works, otherwise not.
Thank you for your investigation and correction.
Enrica
Enrica Ruedin
added a comment - Hi Sergei
After more than 1.5 years I tried it again. Until now I worked without bi-temp tables that's why it was not so urgent. Last week I startet to implement bi-temp in a Laravel package. That's how I remembered my bug report.
I investigated a little bit more. So SYSTEM VERSION based tables are working fine and the autoincrement id is not incremented.
The problem is with APPLICATION PERIOD based tables or bi-temp only. I saw that your examples and those on the internet only work without autoincrement keys. (see KB MariaDB Application-Time Periods )
So I create a simple table with two dates defined as PERIOD FOR. see:
CREATE OR REPLACE TABLE `t1` (
`id` bigint (20) unsigned NOT NULL AUTO_INCREMENT,
`price_for` varchar (100) DEFAULT NULL ,
`price` integer DEFAULT NULL ,
`date_1` date NOT NULL DEFAULT NOW(),
`date_2` date NOT NULL DEFAULT '2038-01-19' ,
PRIMARY KEY (`id`, `date_1`),
PERIOD FOR `app_p` (`date_1`, `date_2`)
);
Remark: the column 'id' is AUTOINCREMENT and part of the PK.
I add a record. The result is
insert INTO t1 (price_for, price) VALUES ( 'car' , 1000)
1 row(s) modified.
Data truncated for column 'date_1' at row 1
> select * from t1
id|price_for|price|date_1 |date_2 |
--+---------+-----+----------+----------+
1|car | 1000|2023-06-10|2038-01-19|
In this step all is OK and the key is set automatically to the value 1.
Then I add a new record using PORTION OF like
update t1 FOR Portion of app_p
from '2023-06-11' TO '2038-01-19'
set price = 1200
where id= 1 ;
A "normal" update would never change the id (PK). The result what I get is:
select * from t1
id|price_for|price|date_1 |date_2 |
--+---------+-----+----------+----------+
1|car | 1200|2023-06-11|2038-01-19|
2|car | 1000|2023-06-10|2023-06-11|
2 row(s) fetched.
It is clear that minimum one of these two must be in the PK. This is ensured. It seems that MariaDB is updated the "new" record correctly but the "old" one is inserted as a complete new record. In this case autoincrement must be avoided. Autoincrementing the "id" must be allowed for INSERT not for UPDATE. Strangely enough this works with System Versioned columns.
Unique WITHOUT OVERLAPS has no influence on this behaviour.
Conclusion: If a developer relies on UUID keys or self-generated increment keys, then Application-Time Period works, otherwise not.
Thank you for your investigation and correction.
Enrica
Well, the standard says that a corresponding internally generated INSERT statement should contain a DEFAULT value for the identity column (that is, AUTO_INCREMENT in our implementation).
Though I agree: for this case, as well as the case like PRIMARY KEY(id, p WITHOUT OVERLAPS) this behavior is not usable.
So, at least for this feature, let's treat identity column as AUTO_INCREMENT, but only if it's a part of a unique key, that does not feature WITHOUT OVERLAPS
Nikita Malyavin
added a comment - - edited Well, the standard says that a corresponding internally generated INSERT statement should contain a DEFAULT value for the identity column (that is, AUTO_INCREMENT in our implementation).
Though I agree: for this case, as well as the case like PRIMARY KEY(id, p WITHOUT OVERLAPS) this behavior is not usable.
So, at least for this feature, let's treat identity column as AUTO_INCREMENT , but only if it's a part of a unique key, that does not feature WITHOUT OVERLAPS
I have made more research around identity column in the standard: it makes no exceptions to the usage. Even for WITHOUT OVERLAPS. And for the rest of the queries the same field would have an effect of identity column.
I still agree that it's simply not practically usable this way, just I can't call it our reading, but a step aside.
After looking at some configurations I decided that the best way is to make an exception for all AUTO_INCREMENT columns that are part of keys, that include WITHOUT OVERLAPS. No matter if it's part of any other keys or not.
Pros:
The rule is natural and simple to remember.
It resolves the confusion with usage
It resolves creepy scenarios like unique(id, other), unique(id, p without overlaps) – in this case id will not increment and cause duplicated key error on implicit insert.
Cons:
Disobeys standard
The case for unique(id, start_date) will conserve the behavior as it's now
serg please review my approach and the solution on branch bb-10.5-periods-autoinc.
I have also refactored class key_map to be able to be used in range-based loops, will be glad if you'll review it at once
Nikita Malyavin
added a comment - - edited I have made more research around identity column in the standard: it makes no exceptions to the usage. Even for WITHOUT OVERLAPS. And for the rest of the queries the same field would have an effect of identity column.
I still agree that it's simply not practically usable this way, just I can't call it our reading, but a step aside.
After looking at some configurations I decided that the best way is to make an exception for all AUTO_INCREMENT columns that are part of keys, that include WITHOUT OVERLAPS. No matter if it's part of any other keys or not.
Pros:
The rule is natural and simple to remember.
It resolves the confusion with usage
It resolves creepy scenarios like unique(id, other), unique(id, p without overlaps) – in this case id will not increment and cause duplicated key error on implicit insert.
Cons:
Disobeys standard
The case for unique(id, start_date) will conserve the behavior as it's now
serg please review my approach and the solution on branch bb-10.5-periods-autoinc .
I have also refactored class key_map to be able to be used in range-based loops, will be glad if you'll review it at once
You're right that the standard says "statement should contain a DEFAULT value for the identity column", for the reference, it's in SQL:2016, Part 2, 15.13 Effect of replacing rows in base tables, paragraph 10) b) i)
But there's no concept of "IDENTITY WITHOUT OVERLAPS" in the standard, this is our extension and we can define its behavior as we want. So, fine, let's try to make it to keep old values when a row is split.
Sergei Golubchik
added a comment - You're right that the standard says "statement should contain a DEFAULT value for the identity column", for the reference, it's in SQL:2016, Part 2, 15.13 Effect of replacing rows in base tables, paragraph 10) b) i)
But there's no concept of "IDENTITY WITHOUT OVERLAPS" in the standard, this is our extension and we can define its behavior as we want. So, fine, let's try to make it to keep old values when a row is split.
It's not IDENTITY WITHOUT OVERLAPS. It's IDENTITY, <period> WITHOUT OVERLAPS. And IDENTITY is not forbidden to be a part of such a key anywhere.
Nikita Malyavin
added a comment - It's not IDENTITY WITHOUT OVERLAPS. It's IDENTITY, <period> WITHOUT OVERLAPS . And IDENTITY is not forbidden to be a part of such a key anywhere.
IDENTITY can be part of the key, and the key can be WITHOUT OVERLAPS. But this key doesn't determine how values are generated, it's simply a constraint. While values are generated by a completely separate sequence generator which knows nothing about the period and its overlaps.
So, above I meant that there's no IDENTITY column in the standard that can be declared to know about overlaps. There can be a totally unrelated unique constraint WITHOUT OVERLAPS though.
Sergei Golubchik
added a comment - IDENTITY can be part of the key, and the key can be WITHOUT OVERLAPS . But this key doesn't determine how values are generated, it's simply a constraint. While values are generated by a completely separate sequence generator which knows nothing about the period and its overlaps.
So, above I meant that there's no IDENTITY column in the standard that can be declared to know about overlaps. There can be a totally unrelated unique constraint WITHOUT OVERLAPS though.
Fix pushed to the 10.5 branch. The new behavior will be documented in scope of MDEV-33346. Would be good to make it blocker, so that we'd definitely have it updated to the release day.
Nikita Malyavin
added a comment - Fix pushed to the 10.5 branch. The new behavior will be documented in scope of MDEV-33346 . Would be good to make it blocker, so that we'd definitely have it updated to the release day.
I'd expect the correct key definition should be
but this doesn't help, new auto-inc value is still generated.