Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13094

SHOW CREATE TABLE can report non-persistent AUTO_INCREMENT value before server restart

Details

    Description

      --source include/have_innodb.inc
       
      create table t1 (pk int auto_increment primary key, i int, unique (i)) engine=InnoDB;
      insert into t1 (i) values (1),(2),(3);
      insert ignore into t1 (pk, i) values (100,1);
      --echo #
      --echo # Table before restart
      show create table t1;
      --source include/restart_mysqld.inc
      --echo #
      --echo # Table after restart
      show create table t1;
       
      drop table t1;
      

      # Table before restart
      show create table t1;
      Table	Create Table
      t1	CREATE TABLE `t1` (
        `pk` int(11) NOT NULL AUTO_INCREMENT,
        `i` int(11) DEFAULT NULL,
        PRIMARY KEY (`pk`),
        UNIQUE KEY `i` (`i`)
      ) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1
      #
      # Table after restart
      show create table t1;
      Table	Create Table
      t1	CREATE TABLE `t1` (
        `pk` int(11) NOT NULL AUTO_INCREMENT,
        `i` int(11) DEFAULT NULL,
        PRIMARY KEY (`pk`),
        UNIQUE KEY `i` (`i`)
      ) ENGINE=InnoDB AUTO_INCREMENT=101 DEFAULT CHARSET=latin1
      

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova created issue -

            I have to disable the check for auto-increment in upgrade tests with 10.2+ old server for now. Need to re-enable it after the bug is fixed.

            elenst Elena Stepanova added a comment - I have to disable the check for auto-increment in upgrade tests with 10.2+ old server for now. Need to re-enable it after the bug is fixed.

            MDEV-6076 introduced persistent AUTO_INCREMENT for InnoDB, and here it seems to working as designed.

            The issue here is that before restart, SHOW CREATE TABLE is not displaying the next value that would be assigned. After restart, it is displayed.

            marko Marko Mäkelä added a comment - MDEV-6076 introduced persistent AUTO_INCREMENT for InnoDB, and here it seems to working as designed. The issue here is that before restart, SHOW CREATE TABLE is not displaying the next value that would be assigned. After restart, it is displayed.
            marko Marko Mäkelä made changes -
            Field Original Value New Value
            marko Marko Mäkelä made changes -
            Status Open [ 1 ] Confirmed [ 10101 ]
            marko Marko Mäkelä made changes -
            Summary Wrong AUTO_INCREMENT value on the table after server restart SHOW CREATE TABLE reports wrong AUTO_INCREMENT value before server restart
            marko Marko Mäkelä made changes -
            Affects Version/s 10.2.4 [ 22116 ]
            Affects Version/s 10.2 [ 14601 ]

            marko, have you noticed that the second INSERT IGNORE does not actually insert anything?
            If it inserted 100, then indeed, the new AUTO_INCREMENT value on the table should have been 101. But it's ignored due to the unique key, so the contents of pk remains being (1),(2),(3). Also, LAST_INSERT_ID still has 1, and the next value that would be inserted is 4, not 101:

            MariaDB [test]> create table t1 (pk int auto_increment primary key, i int, unique (i)) engine=InnoDB;
            Query OK, 0 rows affected (0.25 sec)
             
            MariaDB [test]> insert into t1 (i) values (1),(2),(3);
            Query OK, 3 rows affected (0.04 sec)
            Records: 3  Duplicates: 0  Warnings: 0
             
            MariaDB [test]> insert ignore into t1 (pk, i) values (100,1);
            Query OK, 0 rows affected, 1 warning (0.05 sec)
             
            MariaDB [test]> show warnings;
            +---------+------+---------------------------------+
            | Level   | Code | Message                         |
            +---------+------+---------------------------------+
            | Warning | 1062 | Duplicate entry '1' for key 'i' |
            +---------+------+---------------------------------+
            1 row in set (0.00 sec)
            

            MariaDB [test]> select last_insert_id();
            +------------------+
            | last_insert_id() |
            +------------------+
            |                1 |
            +------------------+
            1 row in set (0.00 sec)
             
            MariaDB [test]> insert into t1 (i) values (4);
            Query OK, 1 row affected (0.04 sec)
             
            MariaDB [test]> select * from t1;
            +----+------+
            | pk | i    |
            +----+------+
            |  1 |    1 |
            |  2 |    2 |
            |  3 |    3 |
            |  4 |    4 |
            +----+------+
            4 rows in set (0.00 sec)
            

            elenst Elena Stepanova added a comment - marko , have you noticed that the second INSERT IGNORE does not actually insert anything? If it inserted 100, then indeed, the new AUTO_INCREMENT value on the table should have been 101. But it's ignored due to the unique key, so the contents of pk remains being (1),(2),(3). Also, LAST_INSERT_ID still has 1 , and the next value that would be inserted is 4, not 101: MariaDB [test]> create table t1 (pk int auto_increment primary key , i int , unique (i)) engine=InnoDB; Query OK, 0 rows affected (0.25 sec)   MariaDB [test]> insert into t1 (i) values (1),(2),(3); Query OK, 3 rows affected (0.04 sec) Records: 3 Duplicates: 0 Warnings: 0   MariaDB [test]> insert ignore into t1 (pk, i) values (100,1); Query OK, 0 rows affected, 1 warning (0.05 sec)   MariaDB [test]> show warnings; + ---------+------+---------------------------------+ | Level | Code | Message | + ---------+------+---------------------------------+ | Warning | 1062 | Duplicate entry '1' for key 'i' | + ---------+------+---------------------------------+ 1 row in set (0.00 sec) MariaDB [test]> select last_insert_id(); + ------------------+ | last_insert_id() | + ------------------+ | 1 | + ------------------+ 1 row in set (0.00 sec)   MariaDB [test]> insert into t1 (i) values (4); Query OK, 1 row affected (0.04 sec)   MariaDB [test]> select * from t1; + ----+------+ | pk | i | + ----+------+ | 1 | 1 | | 2 | 2 | | 3 | 3 | | 4 | 4 | + ----+------+ 4 rows in set (0.00 sec)

            elenst, this is somewhat tricky.
            The persistent InnoDB AUTO_INCREMENT value that was implemented in MDEV-6076 is not transactional; it is only crash-safe.

            If a transaction or a part of it is rolled back, the persistent AUTO_INCREMENT counter will not be rolled back. When the INSERT IGNORE attempts to insert the record (pk,i)=(100,1) a duplicate key i=1 into the unique secondary index, it will already have performed the following steps:

            1. Acquire a record lock on the PRIMARY KEY (pk=100).
            2. Write undo log for rolling back the insert (100,1).
            3. Update the persistent AUTO_INCREMENT value to 101, and insert the record (100,1) into the PRIMARY KEY.
            4. Acquire a record lock on the UNIQUE KEY (i=1).
            5. Try to insert the record (i,pk)=(1,100) into the unique index. This will fail due to the pre-existing key (i,pk)=(1,1).
            6. To resolve the duplicate key error, roll back the latest row from the undo log. (This is a no-op for the UNIQUE KEY, because no record (i,pk)=(1,100) exists there.)

            The mini-transaction that inserted the record (100,1) into the PRIMARY KEY was committed. If you want to avoid observing a change to the persistent AUTO_INCREMENT sequence, the only solution would be to kill the server before the INSERT IGNORE transaction was committed, and hope that the server did not flush the redo log for example because a transaction in another connection ended, or because InnoDB decided to perform a redo log checkpoint.

            Basically the main guarantee that we give is that if a transaction was persistently committed, changes to the auto-increment counter will be persisted as well, even if the server was killed and restarted after the transaction commit.

            Note that in InnoDB, a rollback includes a commit. First, InnoDB would apply the undo log of the transaction backwards, basically emptying the set of modifications made by the transaction, and then it would commit that empty transaction. The innodb_flush_log_at_trx_commit setting affects the final commit step of rollback too.

            Because there is no undo logging for the persistent AUTO_INCREMENT sequence, the AUTO_INCREMENT will stay where it was left, even if there was a partial or full rollback of the transaction.

            If we changed the LAST_INSERT_ID() to reflect the last reserved AUTO_INCREMENT value instead of the last inserted value, a lot of things would break.

            In my opinion, the only thinkable change here might be to change SHOW CREATE TABLE so that it will report the actual persisted AUTO_INCREMENT value, with the disclaimer that if the server is killed and the redo log was not yet flushed up to the point where the persistent AUTO_INCREMENT was last changed, you could see a different value after restart.

            marko Marko Mäkelä added a comment - elenst , this is somewhat tricky. The persistent InnoDB AUTO_INCREMENT value that was implemented in MDEV-6076 is not transactional; it is only crash-safe. If a transaction or a part of it is rolled back, the persistent AUTO_INCREMENT counter will not be rolled back. When the INSERT IGNORE attempts to insert the record (pk,i)=(100,1) a duplicate key i=1 into the unique secondary index, it will already have performed the following steps: Acquire a record lock on the PRIMARY KEY (pk=100). Write undo log for rolling back the insert (100,1). Update the persistent AUTO_INCREMENT value to 101, and insert the record (100,1) into the PRIMARY KEY. Acquire a record lock on the UNIQUE KEY (i=1). Try to insert the record (i,pk)=(1,100) into the unique index. This will fail due to the pre-existing key (i,pk)=(1,1). To resolve the duplicate key error, roll back the latest row from the undo log. (This is a no-op for the UNIQUE KEY, because no record (i,pk)=(1,100) exists there.) The mini-transaction that inserted the record (100,1) into the PRIMARY KEY was committed. If you want to avoid observing a change to the persistent AUTO_INCREMENT sequence, the only solution would be to kill the server before the INSERT IGNORE transaction was committed, and hope that the server did not flush the redo log for example because a transaction in another connection ended, or because InnoDB decided to perform a redo log checkpoint. Basically the main guarantee that we give is that if a transaction was persistently committed, changes to the auto-increment counter will be persisted as well, even if the server was killed and restarted after the transaction commit. Note that in InnoDB, a rollback includes a commit. First, InnoDB would apply the undo log of the transaction backwards, basically emptying the set of modifications made by the transaction, and then it would commit that empty transaction. The innodb_flush_log_at_trx_commit setting affects the final commit step of rollback too. Because there is no undo logging for the persistent AUTO_INCREMENT sequence, the AUTO_INCREMENT will stay where it was left, even if there was a partial or full rollback of the transaction. If we changed the LAST_INSERT_ID() to reflect the last reserved AUTO_INCREMENT value instead of the last inserted value, a lot of things would break. In my opinion, the only thinkable change here might be to change SHOW CREATE TABLE so that it will report the actual persisted AUTO_INCREMENT value, with the disclaimer that if the server is killed and the redo log was not yet flushed up to the point where the persistent AUTO_INCREMENT was last changed, you could see a different value after restart.

            I tested the following patch. If we decide to go this way, it would require some more work, because the persisted AUTO_INCREMENT value is the last assigned value, while SHOW CREATE TABLE should display the next usable value. At the very least, we should take auto_increment_increment and auto_increment_offset into account, similar to how ha_innobase::get_auto_increment() works.

            diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
            index 6be10d77edf..0252dda1ac2 100644
            --- a/storage/innobase/handler/ha_innodb.cc
            +++ b/storage/innobase/handler/ha_innodb.cc
            @@ -12478,15 +12478,22 @@ ha_innobase::update_create_info(
             /*============================*/
             	HA_CREATE_INFO*	create_info)	/*!< in/out: create info */
             {
            -	if (!(create_info->used_fields & HA_CREATE_USED_AUTO)) {
            -		info(HA_STATUS_AUTO);
            -		create_info->auto_increment_value = stats.auto_increment_value;
            -	}
            -
             	if (dict_table_is_temporary(m_prebuilt->table)) {
            +		if (!(create_info->used_fields & HA_CREATE_USED_AUTO)
            +		    && table->found_next_number_field) {
            +			create_info->auto_increment_value
            +				= innobase_peek_autoinc();
            +		}
             		return;
             	}
             
            +	if (!(create_info->used_fields & HA_CREATE_USED_AUTO)
            +	    && table->found_next_number_field
            +	    && m_prebuilt->table->is_readable()) {
            +		create_info->auto_increment_value = btr_read_autoinc(
            +			m_prebuilt->table->indexes.start);
            +	}
            +
             	/* Update the DATA DIRECTORY name from SYS_DATAFILES. */
             	dict_get_and_save_data_dir_path(m_prebuilt->table, false);
             
            diff --git a/mysql-test/suite/innodb/t/autoinc_persist.test b/mysql-test/suite/innodb/t/autoinc_persist.test
            index fd85b45fbfa..d2b34d1f32d 100644
            --- a/mysql-test/suite/innodb/t/autoinc_persist.test
            +++ b/mysql-test/suite/innodb/t/autoinc_persist.test
            @@ -108,9 +108,75 @@ SELECT MAX(a) AS `Expect 100000000000` FROM t9;
             CREATE TABLE t13(a INT AUTO_INCREMENT PRIMARY KEY) ENGINE = InnoDB,
             AUTO_INCREMENT = 1234;
             
            +CREATE TABLE t14 (pk SERIAL, i INT UNIQUE) ENGINE = InnoDB;
            +INSERT INTO t14 (i) values (1),(2),(3);
            +INSERT IGNORE INTO t14 (pk, i) values (100,1);
            +SELECT LAST_INSERT_ID();
            +INSERT INTO t14 SET i=4;
            +
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t1;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t2;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t3;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t4;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t5;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t6;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t7;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t8;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t9;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t10;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t11;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t12;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t13;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t14;
            +
             --source include/restart_mysqld.inc
             
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t1;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t2;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t3;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t4;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t5;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t6;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t7;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t8;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t9;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t10;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t11;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t12;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
             SHOW CREATE TABLE t13;
            +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/;
            +SHOW CREATE TABLE t14;
            +INSERT INTO t14 SET i=5;
            +SELECT * FROM t14;
            +DROP TABLE t14;
            +
             INSERT INTO t13 VALUES(0);
             SELECT a AS `Expect 1234` FROM t13;
             DROP TABLE t13;
            

            marko Marko Mäkelä added a comment - I tested the following patch. If we decide to go this way, it would require some more work, because the persisted AUTO_INCREMENT value is the last assigned value, while SHOW CREATE TABLE should display the next usable value. At the very least, we should take auto_increment_increment and auto_increment_offset into account, similar to how ha_innobase::get_auto_increment() works. diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc index 6be10d77edf..0252dda1ac2 100644 --- a/storage/innobase/handler/ha_innodb.cc +++ b/storage/innobase/handler/ha_innodb.cc @@ -12478,15 +12478,22 @@ ha_innobase::update_create_info( /*============================*/ HA_CREATE_INFO* create_info) /*!< in/out: create info */ { - if (!(create_info->used_fields & HA_CREATE_USED_AUTO)) { - info(HA_STATUS_AUTO); - create_info->auto_increment_value = stats.auto_increment_value; - } - if (dict_table_is_temporary(m_prebuilt->table)) { + if (!(create_info->used_fields & HA_CREATE_USED_AUTO) + && table->found_next_number_field) { + create_info->auto_increment_value + = innobase_peek_autoinc(); + } return; } + if (!(create_info->used_fields & HA_CREATE_USED_AUTO) + && table->found_next_number_field + && m_prebuilt->table->is_readable()) { + create_info->auto_increment_value = btr_read_autoinc( + m_prebuilt->table->indexes.start); + } + /* Update the DATA DIRECTORY name from SYS_DATAFILES. */ dict_get_and_save_data_dir_path(m_prebuilt->table, false); diff --git a/mysql-test/suite/innodb/t/autoinc_persist.test b/mysql-test/suite/innodb/t/autoinc_persist.test index fd85b45fbfa..d2b34d1f32d 100644 --- a/mysql-test/suite/innodb/t/autoinc_persist.test +++ b/mysql-test/suite/innodb/t/autoinc_persist.test @@ -108,9 +108,75 @@ SELECT MAX(a) AS `Expect 100000000000` FROM t9; CREATE TABLE t13(a INT AUTO_INCREMENT PRIMARY KEY) ENGINE = InnoDB, AUTO_INCREMENT = 1234; +CREATE TABLE t14 (pk SERIAL, i INT UNIQUE) ENGINE = InnoDB; +INSERT INTO t14 (i) values (1),(2),(3); +INSERT IGNORE INTO t14 (pk, i) values (100,1); +SELECT LAST_INSERT_ID(); +INSERT INTO t14 SET i=4; + +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t1; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t2; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t3; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t4; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t5; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t6; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t7; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t8; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t9; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t10; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t11; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t12; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t13; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t14; + --source include/restart_mysqld.inc +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t1; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t2; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t3; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t4; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t5; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t6; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t7; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t8; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t9; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t10; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t11; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t12; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; SHOW CREATE TABLE t13; +replace_regex /.*(AUTO_INCREMENT=[0-9]+) .*/\1/; +SHOW CREATE TABLE t14; +INSERT INTO t14 SET i=5; +SELECT * FROM t14; +DROP TABLE t14; + INSERT INTO t13 VALUES(0); SELECT a AS `Expect 1234` FROM t13; DROP TABLE t13;
            marko Marko Mäkelä made changes -
            Summary SHOW CREATE TABLE reports wrong AUTO_INCREMENT value before server restart SHOW CREATE TABLE can report non-persistent AUTO_INCREMENT value before server restart

            Some further thoughts:

            1. Users probably expect LAST_INSERT_ID() to return the last successfully used AUTO_INCREMENT value. We probably should not change that.
            2. Users might expect SHOW CREATE TABLE to return the AUTO_INCREMENT value that would be used next. Perhaps we should not change that either.
            3. The main purpose of the MDEV-6076 persistent AUTO_INCREMENT is to prevent the reuse of AUTO_INCREMENT values (say, if the last AUTO_INCREMENT values were DELETEd and the server was restarted).
            4. There is no promise that there would not be ‘gaps’ after a server restart. Gaps are already possible without involving a server restart, due to the batched allocation of values during multi-record INSERT.
            5. Replication should be immune to this, because the actually used AUTO_INCREMENT values will be part of the replication stream.

            One option would be to introduce an interface for reporting the persisted state of the AUTO_INCREMENT, instead of changing the operation of SHOW CREATE TABLE. This would seem to only make sense if implemented for all storage engines that support persistent AUTO_INCREMENT.

            I recommend simply documenting the current behaviour.

            marko Marko Mäkelä added a comment - Some further thoughts: Users probably expect LAST_INSERT_ID() to return the last successfully used AUTO_INCREMENT value. We probably should not change that. Users might expect SHOW CREATE TABLE to return the AUTO_INCREMENT value that would be used next. Perhaps we should not change that either. The main purpose of the MDEV-6076 persistent AUTO_INCREMENT is to prevent the reuse of AUTO_INCREMENT values (say, if the last AUTO_INCREMENT values were DELETEd and the server was restarted). There is no promise that there would not be ‘gaps’ after a server restart. Gaps are already possible without involving a server restart, due to the batched allocation of values during multi-record INSERT. Replication should be immune to this, because the actually used AUTO_INCREMENT values will be part of the replication stream. One option would be to introduce an interface for reporting the persisted state of the AUTO_INCREMENT, instead of changing the operation of SHOW CREATE TABLE. This would seem to only make sense if implemented for all storage engines that support persistent AUTO_INCREMENT. I recommend simply documenting the current behaviour.
            marko Marko Mäkelä made changes -
            Assignee Marko Mäkelä [ marko ] Ian Gilfillan [ greenman ]
            greenman Ian Gilfillan added a comment -

            I don't understand how to document this from a user perspective. SHOW CREATE TABLE does not seem to be as issue - it will correctly display the next value in both cases after the failed INSERT IGNORE, both before and after a restart. The issue is that there is a persistent autoinc value (with no interface to view it), and that after a server restart, this persistent value is being used to set the actual autoinc value.

            So, if I understand:

            • the persistent autoinc value has no effect except that is used to set the actual autoinc value after a restart.
            • the persistent autoinc value can go out of sync in the case of a failed INSERT IGNORE, as above.
            • presumably there are other instance relating to transactions where it can do the same.
            • so, essentially, the fix in 10.2.4, while now working for SET AUTO_INCREMENT=N, introduced a regression in that now autoinc values after a failed INSERT IGNORE are not 'correctly' maintained, whereas they were until 10.2.3.

            Is this correct? Is there any way for the user to view the persistent autoinc value?

            greenman Ian Gilfillan added a comment - I don't understand how to document this from a user perspective. SHOW CREATE TABLE does not seem to be as issue - it will correctly display the next value in both cases after the failed INSERT IGNORE, both before and after a restart. The issue is that there is a persistent autoinc value (with no interface to view it), and that after a server restart, this persistent value is being used to set the actual autoinc value. So, if I understand: the persistent autoinc value has no effect except that is used to set the actual autoinc value after a restart. the persistent autoinc value can go out of sync in the case of a failed INSERT IGNORE, as above. presumably there are other instance relating to transactions where it can do the same. so, essentially, the fix in 10.2.4, while now working for SET AUTO_INCREMENT=N, introduced a regression in that now autoinc values after a failed INSERT IGNORE are not 'correctly' maintained, whereas they were until 10.2.3. Is this correct? Is there any way for the user to view the persistent autoinc value?

            greenman, your interpretation sounds correct to me.
            It could be useful in debugging to view the current persistent auto_increment value. But generally, too many rarely used interfaces and parameters tend to be a bad idea.

            The regression would be that after a failed INSERT IGNORE and a restart, there would be a "gap" in the AUTO_INCREMENT value. Another case where you can get such a "gap" is a user-initiated ROLLBACK or ROLLBACK TO SAVEPOINT. But also that requires a server restart.
            Before MDEV-6076 introduced the persistent AUTO_INCREMENT in 10.2.4, we had the opposite problem that the server would not leave the "gap", but would instead perform the equivalent of SELECT MAX(auto_inc_column) when the table is first accessed. I can imagine that in a distributed system, you would definitely not want to reuse AUTO_INCREMENT values that may have been previously used somewhere else.

            So, this regression (gap after restart) would seem to be a desired property of MDEV-6076; in fact, I could claim that it is the main reason to implement MDEV-6076. (The other reason would be to avoid a potentially expensive index scan to determine MAX(autoinc_col), in case there are many NULL values or delete-marked records.)

            marko Marko Mäkelä added a comment - greenman , your interpretation sounds correct to me. It could be useful in debugging to view the current persistent auto_increment value. But generally, too many rarely used interfaces and parameters tend to be a bad idea. The regression would be that after a failed INSERT IGNORE and a restart, there would be a "gap" in the AUTO_INCREMENT value. Another case where you can get such a "gap" is a user-initiated ROLLBACK or ROLLBACK TO SAVEPOINT. But also that requires a server restart. Before MDEV-6076 introduced the persistent AUTO_INCREMENT in 10.2.4, we had the opposite problem that the server would not leave the "gap", but would instead perform the equivalent of SELECT MAX(auto_inc_column) when the table is first accessed. I can imagine that in a distributed system, you would definitely not want to reuse AUTO_INCREMENT values that may have been previously used somewhere else. So, this regression (gap after restart) would seem to be a desired property of MDEV-6076 ; in fact, I could claim that it is the main reason to implement MDEV-6076 . (The other reason would be to avoid a potentially expensive index scan to determine MAX(autoinc_col), in case there are many NULL values or delete-marked records.)
            greenman Ian Gilfillan made changes -
            Assignee Ian Gilfillan [ greenman ]
            greenman Ian Gilfillan added a comment -

            I have documented the current behaviour, though this still seems undesirable and perhaps it can be solved in another way.

            Without knowledge of the underlying implementation, does the following logic hold?

            • the single case that persistent autoinc is useful for is when ALTER TABLE yourTable AUTO_INCREMENT = x; is run, and then the server restarted immediately after that.
            • if there was a flag in the persistent autoinc, on for when autoinc is set with the above statement, off for when it's set any other way, then, when the server restarts, and the persistent autoinc is only used when the flag is on, it should be correct in all situations? An INSERT IGNORE will leave the flag off, so the autoinc will not be overwritten, and the gap won't exist. But if it is specifically set with an ALTER ..., the flag will be set, and then persistent autoinc can be used when the server restarts?
            greenman Ian Gilfillan added a comment - I have documented the current behaviour, though this still seems undesirable and perhaps it can be solved in another way. Without knowledge of the underlying implementation, does the following logic hold? the single case that persistent autoinc is useful for is when ALTER TABLE yourTable AUTO_INCREMENT = x; is run, and then the server restarted immediately after that. if there was a flag in the persistent autoinc, on for when autoinc is set with the above statement, off for when it's set any other way, then, when the server restarts, and the persistent autoinc is only used when the flag is on, it should be correct in all situations? An INSERT IGNORE will leave the flag off, so the autoinc will not be overwritten, and the gap won't exist. But if it is specifically set with an ALTER ..., the flag will be set, and then persistent autoinc can be used when the server restarts?

            greenman, I think that the main use case of persistent autoinc is a guarantee that the same values are not reassigned after server restart (except if the counter was reset with ALTER TABLE).

            It just occurred to me that the MyISAM and Aria engines (which featured persistent AUTO_INCREMENT from day one) have a uniquely defined AUTO_INCREMENT value: that of the last committed record. With table-level locking and with no support for rollback (except in Aria crash recovery), this works.

            InnoDB allows concurrent modifications to the table, and it supports rollback. Therefore, the AUTO_INCREMENT is necessarily different. The value that is persisted in the first root page of the table is the maximum value that was used in an INSERT or UPDATE. Even if the operation was rolled back, that value will remain.

            Another AUTO_INCREMENT is the value that is reported to the user in the SHOW CREATE TABLE statement. And yet another one is the LAST_INSERT_ID(). In InnoDB, these are non-persistent.

            elenst, what would you think if we extended INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS with a column that shows the persistent auto-increment value?

            marko Marko Mäkelä added a comment - greenman , I think that the main use case of persistent autoinc is a guarantee that the same values are not reassigned after server restart (except if the counter was reset with ALTER TABLE). It just occurred to me that the MyISAM and Aria engines (which featured persistent AUTO_INCREMENT from day one) have a uniquely defined AUTO_INCREMENT value: that of the last committed record. With table-level locking and with no support for rollback (except in Aria crash recovery), this works. InnoDB allows concurrent modifications to the table, and it supports rollback. Therefore, the AUTO_INCREMENT is necessarily different. The value that is persisted in the first root page of the table is the maximum value that was used in an INSERT or UPDATE. Even if the operation was rolled back, that value will remain. Another AUTO_INCREMENT is the value that is reported to the user in the SHOW CREATE TABLE statement. And yet another one is the LAST_INSERT_ID(). In InnoDB, these are non-persistent. elenst , what would you think if we extended INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS with a column that shows the persistent auto-increment value?
            greenman Ian Gilfillan added a comment -

            Marko's suggestion seems like a fairly quick and easy approach. Are there any objections to it?

            greenman Ian Gilfillan added a comment - Marko's suggestion seems like a fairly quick and easy approach. Are there any objections to it?

            Sorry, I missed the question addressed to me back then.

            We can add all kinds of columns to InnoDB I_S tables, I don't see how it will solve anything. People don't search for the auto-increment value for fun or out of curiosity, much less should they care about the dark secrets of implementation – what's "persistent" and what not. They should see the value which is actually true now, and it should be consistent. Now the obvious discrepancy in SHOW CREATE TABLE breaks simple consistency checks (tables before and after server restart differ), but it's not limited to it. AUTO_INCREMENT in SHOW CREATE TABLE is not cosmetics, it shows what happens next, and the behavior of auto-increment becomes inconsistent after the scenario above. If you keep normally inserting into the table before the restart, it will insert 4 etc. If you insert after restart, it will insert 101 etc.

            That said, I am not going to argue now against whichever decision is made, I'll leave it to be resolved by the natural course of events. We have already been questioned why such unexpected jumps can occur. So far it was by Monty, who, I suppose, relayed a question from customers; I expect it to happen again and eventually present itself in a form of an official customer request which we won't be able to ignore.

            elenst Elena Stepanova added a comment - Sorry, I missed the question addressed to me back then. We can add all kinds of columns to InnoDB I_S tables, I don't see how it will solve anything. People don't search for the auto-increment value for fun or out of curiosity, much less should they care about the dark secrets of implementation – what's "persistent" and what not. They should see the value which is actually true now, and it should be consistent. Now the obvious discrepancy in SHOW CREATE TABLE breaks simple consistency checks (tables before and after server restart differ), but it's not limited to it. AUTO_INCREMENT in SHOW CREATE TABLE is not cosmetics, it shows what happens next, and the behavior of auto-increment becomes inconsistent after the scenario above. If you keep normally inserting into the table before the restart, it will insert 4 etc. If you insert after restart, it will insert 101 etc. That said, I am not going to argue now against whichever decision is made, I'll leave it to be resolved by the natural course of events. We have already been questioned why such unexpected jumps can occur. So far it was by Monty, who, I suppose, relayed a question from customers; I expect it to happen again and eventually present itself in a form of an official customer request which we won't be able to ignore.
            marko Marko Mäkelä made changes -
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 81240 ] MariaDB v4 [ 143949 ]

            People

              Unassigned Unassigned
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.