Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-31631

Online ALTER adding auto-increment column to a table with history behaves differently from non-online

Details

    Description

      In existing MariaDB versions adding an auto-increment column to a system-versioned table which already has history is prohibited:

      MariaDB [test]> create or replace table t (a int) with system versioning;
      Query OK, 0 rows affected (0.054 sec)
       
      MariaDB [test]> insert into t values (1);
      Query OK, 1 row affected (0.013 sec)
       
      MariaDB [test]> update t set a = 2;
      Query OK, 1 row affected (0.006 sec)
      Rows matched: 1  Changed: 1  Inserted: 1  Warnings: 0
       
      MariaDB [test]> set system_versioning_alter_history= KEEP;
      Query OK, 0 rows affected (0.000 sec)
       
      MariaDB [test]> alter table t add b int auto_increment, add key(b);
      ERROR 1112 (42000): Table 'test/t' uses an extension that doesn't exist in this MariaDB version
      

      However, if UPDATE creating the history is performed in parallel with online ALTER, the result is different – the auto-increment column is created and populated with generated values for current rows, and with zeros for historical rows. Thus, if it is created with a unique key on it, ALTER fails with duplicate key error, and if it the key is non-unique, ALTER succeeds, both of which is different from what happens with non-online / non-concurrent ALTER.

      --source include/have_debug_sync.inc
       
      create table t (a int) with system versioning;
      insert into t values (1),(2);
      set system_versioning_alter_history= KEEP;
      set debug_sync= 'alter_table_online_progress WAIT_FOR go';
      --send
        alter table t add pk int auto_increment primary key, algorithm=copy, lock=none;
       
      --connect (con_dml,localhost,root,,test)
       
      update t set a = a + 10;
      set debug_sync= 'now signal go';
       
      --connection default
      --error ER_UNSUPPORTED_EXTENSION
      --reap
       
      # Cleanup
      drop table t;
      set debug_sync= reset;
      --disconnect con_dml
      

      Result with the test case above as is (primary key):

      bb-11.2-oalter adcf5dfa8d6, unique key

      mysqltest: At line 17: query 'reap' failed with wrong errno ER_DUP_ENTRY (1062): 'Duplicate entry '0' for key 'PRIMARY'', instead of ER_UNSUPPORTED_EXTENSION (1112)...
      

      Same test case but with non-unique key instead of PK:

      mysqltest: At line 17: query 'reap' succeeded - should have failed with error ER_UNSUPPORTED_EXTENSION (1112)...
      

      and the resulting table contents is

      select *, row_start, row_end from t for system_time all;
      a	pk	row_start	row_end
      11	1	2023-07-05 17:13:12.545737	2038-01-19 05:14:07.999999
      12	2	2023-07-05 17:13:12.545737	2038-01-19 05:14:07.999999
      1	0	2023-07-05 17:13:12.541333	2023-07-05 17:13:12.545737
      2	0	2023-07-05 17:13:12.541333	2023-07-05 17:13:12.545737
      

      Attachments

        Issue Links

          Activity

            elenst so far I have forbidden both, however with a few rules, which follow.

            But first let me address the non-deterministic question:
            i thought something is forbidden regarding the determinism of DEFAULT because I remember I wrote some detection. But that detection was to allow new fields with deterministic defaults be a part of a key when searching for a row to update or to delete.

            Nothing was forbidden regarding the defaults. Only MDEV-31058 added som restrictions to the columns.

            Now I am not forbidding non-deterministic DEFAULTs as well: only ADD...AUTO_INCREMENT and DEFAULT(nextval(..)) . The non-deterministic columns are not expected to break the O(A)=A(O) rule in general. If you find something - you can report.

            For example if the function changes its results from call to call with the same arguments.
            Like: times_called(), returninn 0,1,2,...

            An interesting example here is RAND(), which formally breaks this rule, but the random function is not a concern.

            So the rule is tricky and case-by-case I suppose.

            nikitamalyavin Nikita Malyavin added a comment - elenst so far I have forbidden both, however with a few rules, which follow. But first let me address the non-deterministic question: i thought something is forbidden regarding the determinism of DEFAULT because I remember I wrote some detection. But that detection was to allow new fields with deterministic defaults be a part of a key when searching for a row to update or to delete. Nothing was forbidden regarding the defaults. Only MDEV-31058 added som restrictions to the columns. Now I am not forbidding non-deterministic DEFAULTs as well: only ADD...AUTO_INCREMENT and DEFAULT(nextval(..)) . The non-deterministic columns are not expected to break the O(A)=A(O) rule in general. If you find something - you can report. For example if the function changes its results from call to call with the same arguments. Like: times_called(), returninn 0,1,2,... An interesting example here is RAND(), which formally breaks this rule, but the random function is not a concern. So the rule is tricky and case-by-case I suppose.

            With RAND() we can surely get the cases when the rule "as if DML was executed first" is violated, but RAND() behavior is so obscure, even when it's put into the deterministic context, that I wouldn't be able to even formulate a complaint.
            For example, it works differently depending whether instant or copy algorithm is used (regardless whether it's "online" or not).
            So, I'm going to ignore RAND() unless somebody can specify how exactly it is expected to behave.

            elenst Elena Stepanova added a comment - With RAND() we can surely get the cases when the rule "as if DML was executed first" is violated, but RAND() behavior is so obscure, even when it's put into the deterministic context, that I wouldn't be able to even formulate a complaint. For example, it works differently depending whether instant or copy algorithm is used (regardless whether it's "online" or not). So, I'm going to ignore RAND() unless somebody can specify how exactly it is expected to behave.
            nikitamalyavin Nikita Malyavin added a comment - - edited

            Let me try: unless we don't care of an exact value, but rather a property, which is preserved throughout the statement

            But we don't want this complication yes, let's just ignore random functions for now and see what else can we find

            nikitamalyavin Nikita Malyavin added a comment - - edited Let me try: unless we don't care of an exact value, but rather a property, which is preserved throughout the statement But we don't want this complication yes, let's just ignore random functions for now and see what else can we find

            Especially since it is not even true for RAND(). When I say it behaves differently with COPY and NOCOPY, I don't mean that it returns different values. It exactly doesn't "preserve the property", if by that we mean that the value is expected to be (generally) different for each row.

            MariaDB [test]> alter table t add b double(10,9) default(rand(3)), algorithm=copy, lock=shared;
            Query OK, 3 rows affected (0.055 sec)              
            Records: 3  Duplicates: 0  Warnings: 0
             
            MariaDB [test]> select * from t;
            +------+-------------+
            | a    | b           |
            +------+-------------+
            |    1 | 0.905769756 |
            |    2 | 0.373079058 |
            |    3 | 0.148086053 |
            +------+-------------+
            3 rows in set (0.014 sec)
            

            That's what we would probably expect as a "property" (and if we were optimistic enough, we would even expect the sequence of values preserved, as that's what the argument to RAND is for). But

            MariaDB [test]> alter table t add b double(10,9) default(rand(3)), algorithm=instant;
            Query OK, 0 rows affected (0.051 sec)
            Records: 0  Duplicates: 0  Warnings: 0
             
            MariaDB [test]> select * from t;
            +------+-------------+
            | a    | b           |
            +------+-------------+
            |    1 | 0.373079058 |
            |    2 | 0.373079058 |
            |    3 | 0.373079058 |
            +------+-------------+
            3 rows in set (0.002 sec)
            

            that's what we probably wouldn't immediately expect.

            elenst Elena Stepanova added a comment - Especially since it is not even true for RAND(). When I say it behaves differently with COPY and NOCOPY, I don't mean that it returns different values. It exactly doesn't "preserve the property", if by that we mean that the value is expected to be (generally) different for each row. MariaDB [test]> alter table t add b double (10,9) default (rand(3)), algorithm=copy, lock=shared; Query OK, 3 rows affected (0.055 sec) Records: 3 Duplicates: 0 Warnings: 0   MariaDB [test]> select * from t; + ------+-------------+ | a | b | + ------+-------------+ | 1 | 0.905769756 | | 2 | 0.373079058 | | 3 | 0.148086053 | + ------+-------------+ 3 rows in set (0.014 sec) That's what we would probably expect as a "property" (and if we were optimistic enough, we would even expect the sequence of values preserved, as that's what the argument to RAND is for). But MariaDB [test]> alter table t add b double (10,9) default (rand(3)), algorithm=instant; Query OK, 0 rows affected (0.051 sec) Records: 0 Duplicates: 0 Warnings: 0   MariaDB [test]> select * from t; + ------+-------------+ | a | b | + ------+-------------+ | 1 | 0.373079058 | | 2 | 0.373079058 | | 3 | 0.373079058 | + ------+-------------+ 3 rows in set (0.002 sec) that's what we probably wouldn't immediately expect.

            wow! i think this is an INSTANT bug. Neither i would expect it, nor want as a user. Nice find.

            nikitamalyavin Nikita Malyavin added a comment - wow! i think this is an INSTANT bug. Neither i would expect it, nor want as a user. Nice find.

            People

              nikitamalyavin Nikita Malyavin
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.