[MDEV-31631] Online ALTER adding auto-increment column to a table with history behaves differently from non-online Created: 2023-07-05 Updated: 2023-08-16 Resolved: 2023-08-16 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Data Definition - Alter Table, Versioned Tables |
| Affects Version/s: | N/A |
| Fix Version/s: | 11.2.1 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Elena Stepanova | Assignee: | Nikita Malyavin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||
| Description |
|
In existing MariaDB versions adding an auto-increment column to a system-versioned table which already has history is prohibited:
However, if UPDATE creating the history is performed in parallel with online ALTER, the result is different – the auto-increment column is created and populated with generated values for current rows, and with zeros for historical rows. Thus, if it is created with a unique key on it, ALTER fails with duplicate key error, and if it the key is non-unique, ALTER succeeds, both of which is different from what happens with non-online / non-concurrent ALTER.
Result with the test case above as is (primary key):
Same test case but with non-unique key instead of PK:
and the resulting table contents is
|
| Comments |
| Comment by Sergei Golubchik [ 2023-07-12 ] | ||||||||||||||||||||||||||
|
can we generally support adding auto-increment column online? It seems that because of its global nature (auto-increment column values depend on other rows in the table) it cannot be possibly be added online. Consider:
and then in parallel
You'll get different results depending on whether auto-increment values were generated before or after DELETE. Looks like a fundamental unsolvable auto-increment issue, doesn't it? | ||||||||||||||||||||||||||
| Comment by Nikita Malyavin [ 2023-07-13 ] | ||||||||||||||||||||||||||
|
Seems you are right, serg, since we stick to taking the effect after the parallel DMLs. Moreover, it means that we only can support the commutative pairs, i.e.
where
Though this also means that we can support adding autoincrement field for some subset, like inserts (and maybe updates?) Or we could just allow Online ALTER to take after the DMLs | ||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2023-07-16 ] | ||||||||||||||||||||||||||
Another example of the same problem would be adding a column with a sequence value for a default
and then in parallel
| ||||||||||||||||||||||||||
| Comment by Nikita Malyavin [ 2023-07-19 ] | ||||||||||||||||||||||||||
|
Thank you, elenst! Yes, nextval also shouldn't be possible. I'm surprised that it is classified as deterministic. | ||||||||||||||||||||||||||
| Comment by Nikita Malyavin [ 2023-07-27 ] | ||||||||||||||||||||||||||
|
elenst so far I have forbidden both, however with a few rules, which follow. But first let me address the non-deterministic question: Nothing was forbidden regarding the defaults. Only Now I am not forbidding non-deterministic DEFAULTs as well: only ADD...AUTO_INCREMENT and DEFAULT(nextval(..)) . The non-deterministic columns are not expected to break the O(A)=A(O) rule in general. If you find something - you can report. For example if the function changes its results from call to call with the same arguments. An interesting example here is RAND(), which formally breaks this rule, but the random function is not a concern. So the rule is tricky and case-by-case I suppose. | ||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2023-07-27 ] | ||||||||||||||||||||||||||
|
With RAND() we can surely get the cases when the rule "as if DML was executed first" is violated, but RAND() behavior is so obscure, even when it's put into the deterministic context, that I wouldn't be able to even formulate a complaint. | ||||||||||||||||||||||||||
| Comment by Nikita Malyavin [ 2023-07-27 ] | ||||||||||||||||||||||||||
|
Let me try: unless we don't care of an exact value, but rather a property, which is preserved throughout the statement But we don't want this complication | ||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2023-07-27 ] | ||||||||||||||||||||||||||
|
Especially since it is not even true for RAND(). When I say it behaves differently with COPY and NOCOPY, I don't mean that it returns different values. It exactly doesn't "preserve the property", if by that we mean that the value is expected to be (generally) different for each row.
That's what we would probably expect as a "property" (and if we were optimistic enough, we would even expect the sequence of values preserved, as that's what the argument to RAND is for). But
that's what we probably wouldn't immediately expect. | ||||||||||||||||||||||||||
| Comment by Nikita Malyavin [ 2023-07-27 ] | ||||||||||||||||||||||||||
|
wow! i think this is an INSTANT bug. Neither i would expect it, nor want as a user. Nice find. |