  MariaDB Server / MDEV-6076

Persistent AUTO_INCREMENT for InnoDB

Details

    • 10.2.4-1

    Description

      The current auto_increment behavior in the InnoDB engine is sub-optimal: the auto_increment value is only kept in memory until the server shuts down or resets, and is then rebuilt from the values in the table at the next startup. Furthermore, in 5.6 this becomes even worse, because tables can be evicted from the InnoDB data dictionary cache, so we may get a too-low auto-increment value even without a shutdown/restart. When a table is evicted, InnoDB forgets the current auto-increment value and will run SELECT MAX(auto_inc_column) the next time the table is accessed.
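      To illustrate the problem, here is a minimal Python simulation (not InnoDB code; all names are invented for this sketch) of how rebuilding the counter from SELECT MAX() can hand out previously assigned values again after a restart:

```python
# Hypothetical simulation of the pre-MDEV-6076 behavior: the in-memory
# counter is lost on shutdown and rebuilt as MAX(auto_inc_column) + 1.

class OldAutoIncTable:
    def __init__(self):
        self.rows = set()
        self.autoinc = 1  # next value to assign

    def insert(self):
        value = self.autoinc
        self.autoinc += 1
        self.rows.add(value)
        return value

    def delete(self, value):
        self.rows.discard(value)

    def restart(self):
        # On startup the counter is rebuilt via SELECT MAX(auto_inc_column).
        self.autoinc = max(self.rows, default=0) + 1

t = OldAutoIncTable()
for _ in range(3):
    t.insert()          # assigns 1, 2, 3
t.delete(3)             # delete the row holding the highest value
t.restart()             # server shutdown + restart
print(t.insert())       # prints 3: the already-assigned value is reused
```

The same reuse happens without any restart once a table can be evicted from the data dictionary cache, as described above.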

      Attachments

        Issue Links

          Activity

            Current problem:

            When we update the auto_increment value for table t1 on one client and the auto_increment value for table t2 on a second client, this leads to a lock wait.

            jplindst Jan Lindström (Inactive) added a comment
            rjasdf Rick James (Inactive) added a comment - See also http://bugs.mysql.com/bug.php?id=199

            MySQL 8.0.0 implements WL#6204: InnoDB persistent max value for autoinc columns. The patch is complex, but it does one thing better than the contributed patch that I have seen.
            The contributed patch would write the AUTO_INCREMENT value straight to the clustered index B-tree root page at the previously unused position PAGE_MAX_TRX_ID. This would introduce block->lock contention on the root page.
            The Oracle patch would introduce a special buffering mechanism and a special redo log record type for writing the auto-increment values. On redo log checkpoint, the value would be written to a persistent table.

            I think that we can combine the two ideas as follows:

            1. We will have two fields to hold the auto-increment value: dict_table_t::autoinc for the next available value, and dict_table_t::autoinc_persisted for the last value that has been physically written to the data file.
            2. Whenever InnoDB is about to assign an AUTO_INCREMENT value, it will write a redo log record to ‘update’ the field in the root page, but without actually modifying the page and without adding the page to the flush list.
            3. Whenever any mini-transaction is modifying the root page, it will write dict_table_t::autoinc to the page and update dict_table_t::autoinc_persisted to this value.
            4. On redo log checkpoint, if dict_table_t::autoinc_persisted differs from dict_table_t::autoinc for any table, update the value to the root page. We might need some data structure that would allow log checkpoint to quickly find tables or pages with ‘stale’ metadata.

            With the above approach, no changes should be necessary to crash recovery, backup tools and redo-log based replication. They would simply start scanning the redo log from some log checkpoint until the latest completed mini-transaction. If an auto-increment value was updated, there would be a redo log record for it. The last write would win; the value as of the latest completed mini-transaction would ultimately appear in the root page.

            The same approach should be possible for any dynamic metadata we could imagine: UPDATE_TIME, the maximum LSN or TRX_ID in a table, the number of delete-marked records in an index, the number of purgeable records, and so on. We would use some bytes within an InnoDB tablespace for persisting the value, and we would have 2 fields in a main-memory struct: the last redo-logged value and the last value written to the page.
            We would write redo log records for updating this value, without actually updating the page or appending the page to the flush list. (If we happened to already be holding an X-latch on the page, then we could of course write it directly.)
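            The four steps above can be sketched in Python as follows (an illustrative model, not InnoDB code; the redo log and checkpoint are reduced to their essentials, and the field names merely mirror those in the comment):

```python
# Sketch of the proposed two-field scheme:
#   dict_table_t::autoinc           -> Table.autoinc (last redo-logged value)
#   dict_table_t::autoinc_persisted -> Table.autoinc_persisted (last value
#                                      written to the root page)

class Table:
    def __init__(self, name):
        self.name = name
        self.autoinc = 0            # next available value (redo-logged)
        self.autoinc_persisted = 0  # last value written to the root page
        self.root_page_autoinc = 0  # the on-disk field in the root page

redo_log = []  # records surviving since the last checkpoint

def assign_autoinc(table):
    table.autoinc += 1
    # Step 2: write a redo record, but do NOT modify the page itself
    # and do not add the page to the flush list.
    redo_log.append((table.name, table.autoinc))
    return table.autoinc

def modify_root_page(table):
    # Step 3: any mini-transaction touching the root page writes the
    # current counter to the page as a side effect.
    table.root_page_autoinc = table.autoinc
    table.autoinc_persisted = table.autoinc

def checkpoint(tables):
    # Step 4: flush any table whose persisted value is stale; only then
    # may the head of the redo log be discarded.
    for t in tables:
        if t.autoinc_persisted != t.autoinc:
            modify_root_page(t)
    redo_log.clear()

t1 = Table("t1")
assign_autoinc(t1)
assign_autoinc(t1)
checkpoint([t1])
print(t1.root_page_autoinc)  # 2: the counter was persisted at the checkpoint
```

Crash recovery needs no changes in this model: replaying the redo records reproduces the counter, and the last write wins.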

            marko Marko Mäkelä added a comment

            In other words, my idea is to split the main-memory representation of a buffer pool page into two parts: the base part, and a ‘shadow’ portion that resides in some auxiliary data structure outside the page frame in the buffer pool. A correct implementation of the idea must fulfill the following invariants:

            1. Write-ahead logging (WAL): Any updates are first written to the redo log, then to the data files.
            2. Persistence across checkpoint: A checkpoint logically discards the head of the redo log. All changes that were present in the discarded portion of the redo log must be applied to the data files (alternatively, be appended to the redo log).
            3. Read observability: reads must always observe the ‘shadow’ portion in preference to the base part.
            4. Write observability: When the page is being flushed from the buffer pool to the data file, the ‘shadow’ portion as it was at the flush LSN must be applied.
            There is one more complication in the implementation: page eviction. If we update the ‘shadow’ data while the page is not in the buffer pool, we would have to ensure that when the page is eventually read into the buffer pool, the ‘shadow’ data will be applied to it. On redo log checkpoint, we would have to read the page into the buffer pool in order to write back the ‘shadow’ data. It would seem simplest to prevent the eviction of pages from the buffer pool while ‘shadow’ data is present.
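            The read- and write-observability invariants above can be sketched as follows (all names invented; the page, redo log, and flush are reduced to dictionaries):

```python
# Illustrative sketch of the 'shadow portion' idea: reads prefer the shadow
# value, and a flush applies the shadow data before the page hits the disk.

class Page:
    def __init__(self):
        self.base = {}    # page frame contents in the buffer pool
        self.shadow = {}  # auxiliary structure outside the page frame

    def read(self, field):
        # Read observability: the shadow value wins over the base part.
        return self.shadow.get(field, self.base.get(field))

    def shadow_write(self, field, value, redo_log):
        # WAL: the update goes to the redo log first; the frame is untouched.
        redo_log.append((field, value))
        self.shadow[field] = value

    def flush(self):
        # Write observability: apply the shadow portion before writing out.
        self.base.update(self.shadow)
        self.shadow.clear()
        return dict(self.base)  # the bytes that would reach the data file

redo = []
p = Page()
p.base["PAGE_ROOT_AUTO_INC"] = 0
p.shadow_write("PAGE_ROOT_AUTO_INC", 42, redo)
print(p.read("PAGE_ROOT_AUTO_INC"))     # 42, though the frame still holds 0
print(p.flush()["PAGE_ROOT_AUTO_INC"])  # 42 reaches the data file
```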
            marko Marko Mäkelä added a comment

            Premature optimization is the root of all evil. I think that we should run some benchmarks before starting to implement complicated logic.
            The motivation for the complexity is that the clustered index B-tree root page latch is already a contended resource.
            Maybe there is a simpler way to reduce the contention. In MySQL 5.7, WL#6363: InnoDB: implement SX-lock for rw_lock introduced infrastructure that was first put into use in WL#6326: InnoDB: fix index->lock contention, for the dict_index_t::lock and the buf_block_t::lock of the root page. The SX lock mode allows concurrent readers (S lock) but not concurrent writers (X or SX lock). In WL#6326, updates of the page allocation lists BTR_SEG_TOP and BTR_SEG_LEAF are protected by SX latch, not blocking concurrent readers of the root page.
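            The S/SX/X semantics described above can be captured in a small compatibility table (a sketch of the lock-mode rules only, not InnoDB's rw_lock implementation):

```python
# Lock-mode compatibility: S is shared; SX allows concurrent S holders but
# excludes other writers; X excludes everything.

COMPATIBLE = {
    ("S", "S"): True,   ("S", "SX"): True,   ("S", "X"): False,
    ("SX", "S"): True,  ("SX", "SX"): False, ("SX", "X"): False,
    ("X", "S"): False,  ("X", "SX"): False,  ("X", "X"): False,
}

def can_coexist(held, requested):
    """Can `requested` be granted while `held` is already granted?"""
    return COMPATIBLE[(held, requested)]

# Readers stay unblocked while the auto-increment field is updated under SX:
print(can_coexist("SX", "S"))   # True
# But a second writer (SX or X) must wait:
print(can_coexist("SX", "SX"))  # False
```

This is why updating the counter under an SX latch, rather than an X latch, might avoid blocking concurrent index lookups through the root page.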

            What if we updated the AUTO_INCREMENT value directly in the root page, but protected it with SX latch instead of X latch? What would the performance be in a workload that involves concurrent readers and writers and frequent page allocations? The test setup could be something like this:

            mysqld --skip-innodb-adaptive-hash-index --innodb-page-size=4k
            

            The above startup parameters ensure that readers will actually traverse the index tree and that the B-tree will consist of multiple levels, increasing the contention on the root page latch. The initialization could be something like this:

            CREATE TABLE t(a SERIAL, b TEXT) ENGINE=InnoDB;
            INSERT INTO t(b) VALUES(REPEAT('abc',1234)),(REPEAT('abc',1234)),(REPEAT('abc',1234));
            INSERT INTO t SELECT NULL,b FROM t;
            INSERT INTO t SELECT NULL,b FROM t;
            INSERT INTO t SELECT NULL,b FROM t;
            INSERT INTO t SELECT NULL,b FROM t;
            

            The workload would be something like this: A few readers executing

            SELECT SQL_NO_CACHE LENGTH(b) FROM t WHERE a=5;
            

            and one or more writers executing

            -- this should only modify the auto-increment counter in the root page, but not allocate/free pages
            BEGIN; INSERT INTO t SET b=''; ROLLBACK;
            

            or (in another benchmark)

            -- this causes access to BTR_SEG_LEAF (allocate&free the BLOB)
            BEGIN; INSERT INTO t SET b=REPEAT('blob',12345); ROLLBACK;
            

            or (yet another benchmark)

            -- this should cause page splits&merges
            BEGIN; INSERT INTO t(b) SELECT b FROM t; ROLLBACK;
            

            With the benchmark, I would like to see if there is any noticeable performance regression from the simple patch (write the auto-increment counter directly to the root page, protected by SX-latch) with respect to the baseline (no persistent auto-increment counter).

            marko Marko Mäkelä added a comment
            zhangyuan zhangyuan added a comment -

            Using an SX lock is a good idea! But so far we have not supported SX locks in AliSQL 5.6. I ran a sysbench test in AliSQL 5.6; the results show that the impact of the X latch is very small.

            ./bin/sysbench \
                --test=insert.lua \
                --num-threads=100 \
                --max-time=7200 \
                --max-requests
            

            with different oltp-tables-count

            • one table

              configuration                                                         TPS    RT(ms)
              innodb_autoinc_persistent=OFF                                         22199  2.25
              innodb_autoinc_persistent=ON, innodb_autoinc_persistent_interval=1    22003  2.27
              innodb_autoinc_persistent=ON, innodb_autoinc_persistent_interval=10   22069  2.26
              innodb_autoinc_persistent=ON, innodb_autoinc_persistent_interval=100  22209  2.25

            • 256 tables

              configuration                                                         TPS    RT(ms)
              innodb_autoinc_persistent=OFF                                         24609  2.03
              innodb_autoinc_persistent=ON, innodb_autoinc_persistent_interval=1    24444  2.03
              innodb_autoinc_persistent=ON, innodb_autoinc_persistent_interval=10   24579  2.03
              innodb_autoinc_persistent=ON, innodb_autoinc_persistent_interval=100  24605  2.03

            Summary:
            With innodb_autoinc_persistent=ON and innodb_autoinc_persistent_interval=1, performance degrades by less than 1%.
            With innodb_autoinc_persistent=ON and innodb_autoinc_persistent_interval=100, there is almost no effect.


            Thanks, these results are encouraging.
            Because my ‘shadow write’ idea would use the same file format, we could go with the simpler solution first, and then refine it later if performance problems arise. I would not introduce any configuration options, but instead make the AUTO_INCREMENT counter persistent, unconditionally.

            Somewhat similar to the MySQL 8.0.0 implementation, I would persist the auto-increment counter in row_ins_clust_index_entry_low(), letting btr_cur_search_to_nth_level() acquire and hold the root page SX-latch or X-latch for us.

            marko Marko Mäkelä added a comment

            The persistent AUTO_INCREMENT counter will introduce some changes to semantics. Some of them correspond to changes introduced in MySQL 8.0.0, while others are unique to the implementation in MariaDB. First the common changes:

            1. Normally, AUTO_INCREMENT counters are monotonically increasing.
            2. Assigned values will not be reused even after a transaction rollback.
            3. If the server is killed before committing a transaction for which an AUTO_INCREMENT value was assigned, it is possible (but not guaranteed) that the same value can be reassigned again after crash recovery.
            4. Like in MyISAM, UPDATE of an AUTO_INCREMENT column will move the AUTO_INCREMENT sequence if the updated value is larger.
            5. ALTER TABLE…AUTO_INCREMENT will not change the counter to a smaller value than what already exists in the column. (This is for compatibility with ALGORITHM=COPY.)
            Finally, there is one possible deviation from what the final implementation (with atomic DDL operations using the Global Data Dictionary) will be in MySQL 8.0 GA:
            6. ALTER TABLE…AUTO_INCREMENT will update the persistent auto-increment sequence before the transaction is committed. If the server is killed before the commit, it is possible (but not guaranteed) that the auto-increment sequence has already been reset.
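            A toy Python model of the semantics listed above (names invented for illustration; it only demonstrates that rolled-back values are not reused and that ALTER TABLE…AUTO_INCREMENT cannot go below the largest value already in the column):

```python
# Toy model of the persistent AUTO_INCREMENT semantics.

class PersistentAutoInc:
    def __init__(self):
        self.counter = 0  # last assigned value
        self.rows = []    # committed auto-increment values

    def insert(self, commit=True):
        self.counter += 1          # value is consumed even on rollback
        if commit:
            self.rows.append(self.counter)
        return self.counter

    def alter_auto_increment(self, value):
        # Rule 5: never move the counter below what already exists in
        # the column (for compatibility with ALGORITHM=COPY).
        self.counter = max(value - 1, max(self.rows, default=0))

t = PersistentAutoInc()
t.insert()                 # 1, committed
t.insert(commit=False)     # 2 assigned, then rolled back: not reused
print(t.insert())          # 3
t.alter_auto_increment(1)  # attempt to reset below the existing maximum
print(t.insert())          # 4, not 1
```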
            marko Marko Mäkelä added a comment

            Please review the branch bb-10.2-mdev-6076.

            marko Marko Mäkelä added a comment

            mysql_upgrade will need some work to set the AUTO_INCREMENT attributes for InnoDB tables.

            We may also want to file a follow-up task for removing the requirement of having an index on AUTO_INCREMENT columns.

            marko Marko Mäkelä added a comment

            I think that we can do upgrade from old data files transparently (without additions to mysql_upgrade) as follows:

            If PAGE_ROOT_AUTO_INC is 0, the table must be either empty or in old format. If the table is nonempty and an index on AUTO_INCREMENT column exists, initialize dict_table_t::autoinc from MAX(auto_increment_column) as we used to. Else, effectively let the AUTO_INCREMENT sequence start from 1.
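            The upgrade rule could be sketched as follows (a hypothetical helper; the function and parameter names are invented for illustration):

```python
# Sketch of the transparent-upgrade rule: PAGE_ROOT_AUTO_INC == 0 means
# the table is either empty or in the old format.

def init_autoinc(page_root_auto_inc, table_is_empty, indexed_max):
    """Return the value to initialize dict_table_t::autoinc from.

    indexed_max is MAX(auto_increment_column) if an index on the
    AUTO_INCREMENT column exists, else None.
    """
    if page_root_auto_inc != 0:
        return page_root_auto_inc  # new format: trust the persisted field
    if table_is_empty or indexed_max is None:
        return 0                   # sequence effectively starts from 1
    return indexed_max             # old format: SELECT MAX() as before

print(init_autoinc(0, False, 100))  # 100: legacy table, rebuilt from MAX()
print(init_autoinc(0, True, None))  # 0: empty table
print(init_autoinc(42, False, 7))   # 42: the persisted value wins
```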

            Note: I filed MDEV-11578 to remove the requirement for AUTO_INCREMENT columns to be indexed.

            marko Marko Mäkelä added a comment

            I was unsure whether PAGE_ROOT_AUTO_INC was always 0 on pages other than secondary index leaf pages. (Before MySQL 5.5, InnoDB had the bad habit of writing uninitialized garbage to unused data fields in pages.) Luckily, it turns out that page_create() in mysql-3.23.49 initializes PAGE_MAX_TRX_ID to 0 whenever it creates a B-tree page.

            This means that the upgrade method that I implemented (if PAGE_ROOT_AUTO_INC is 0 and the root page is not empty, read MAX(auto_inc_column)) is sound.

            I can only think of one possible glitch: Import a data file from a MDEV-6076 server to one that lacks MDEV-6076, then insert or update some rows using a larger value than what was persisted, and finally import back to a MDEV-6076 server. In this case, one extra statement would be needed to adjust the AUTO_INCREMENT value:

            ALTER TABLE t DISCARD TABLESPACE;
            ALTER TABLE t IMPORT TABLESPACE;
            ALTER TABLE t AUTO_INCREMENT=1;
            

            The last statement would reset the persistent AUTO_INCREMENT sequence to the current maximum value. (MDEV-11578 may change the semantics.)

            marko Marko Mäkelä added a comment
            zhangyuan zhangyuan added a comment -

            When all records are deleted from the root page (not via TRUNCATE), the root page will be re-created by page_create_empty(), and PAGE_ROOT_AUTO_INC will be reset to 0.

            diff --git a/storage/innobase/page/page0page.cc b/storage/innobase/page/page0page.cc
            index d207371..af913bc 100644
            --- a/storage/innobase/page/page0page.cc
            +++ b/storage/innobase/page/page0page.cc
            @@ -531,8 +531,7 @@
                    same temp-table in parallel.
                    max_trx_id is ignored for temp tables because it not required
                    for MVCC. */
            -       if (dict_index_is_sec_or_ibuf(index)
            -           && !dict_table_is_temporary(index->table)
            +       if (!dict_table_is_temporary(index->table)
                        && page_is_leaf(page)) {
                            max_trx_id = page_get_max_trx_id(page);
                            ut_ad(max_trx_id);
            


            Thanks, Zhangyuan! Yes, page_create_empty() or its callers must be adjusted so that PAGE_ROOT_AUTO_INC will be preserved.
            I will add a test for this. I made a similar omission in page reorganize, but that was caught by one test.

            marko Marko Mäkelä added a comment

            Some more code changes are needed to accommodate virtual columns (fix the test failures in --suite=vcol).
            With the InnoDB virtual column implementation in MySQL 5.7, Field::field_index can no longer be passed to dict_table_get_nth_col(). We will have to calculate how many non-virtual columns precede the column.

            marko Marko Mäkelä added a comment

            I pushed a fix to the conflict with MDEV-5800/WL#8114/WL#8149 (virtual columns).
            The new function innodb_col_no() will determine the dict_table_t::cols[] offset of a Field.

            marko Marko Mäkelä added a comment

            This looks good now.

            jplindst Jan Lindström (Inactive) added a comment

            Thanks! I just pushed to bb-10.2-mdev-6076 rebased to latest 10.2, so that we avoid one test failure due to a bug in slow shutdown that was introduced in MDEV-5800 and fixed later.

            The only code change since the review is the correction of a debug assertion that I had added late yesterday:

            diff --git a/storage/innobase/page/page0zip.cc b/storage/innobase/page/page0zip.cc
            index 39eadda318f..798aeeed74f 100644
            --- a/storage/innobase/page/page0zip.cc
            +++ b/storage/innobase/page/page0zip.cc
            @@ -4851,8 +4851,8 @@ page_zip_copy_recs(
             	} else {
             		/* The PAGE_MAX_TRX_ID must be nonzero on leaf pages
             		of secondary indexes, and 0 on others. */
            -		ut_a(dict_table_is_temporary(index->table)
            -		     || page_is_leaf(src) == !page_get_max_trx_id(src));
            +		ut_ad(dict_table_is_temporary(index->table)
            +		      || !page_is_leaf(src) == !page_get_max_trx_id(src));
             	}
             
             	/* Copy all fields of src_zip to page_zip, except the pointer
            

            I also removed many restarts from the test innodb.autoinc_persist, testing more things across each restart.

            marko Marko Mäkelä added a comment

            Now that 10.2.3 was released and 10.1 was merged to 10.2, I finally pushed this.

            marko Marko Mäkelä added a comment

            MDEV-12123 fixed a related bug. When using IMPORT TABLESPACE with data files on which IMPORT TABLESPACE has been used before the MDEV-12123 fix, the AUTO_INCREMENT value will be restored to a bogus value (something that was a transaction ID in the previous IMPORT).

            marko Marko Mäkelä added a comment

            People

              marko Marko Mäkelä
              jplindst Jan Lindström (Inactive)
              Votes: 2
              Watchers: 12

