[MDEV-33277] In-place migration from MySQL 5.7 causes invalid AUTO_INCREMENT values Created: 2024-01-19 Updated: 2024-02-08 |
|
| Status: | Stalled |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.11, 11.0, 11.1, 11.2, 11.3, 11.4, 10.6.14 |
| Fix Version/s: | 10.5, 10.6, 10.11, 11.0, 11.1, 11.2, 11.3, 11.4 |
| Type: | Bug | Priority: | Major |
| Reporter: | John Flatness | Assignee: | Marko Mäkelä |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | None | ||
| Attachments: |
|
||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||
| Description |
|
After a migration of an RDS MySQL 5.7 instance with a large number of tables to MariaDB 10.6.14, significant numbers of tables had incorrect auto_increment values selected that were less than the maximum value of the column. This application heavily uses IODKU, so too-low auto_increment has quite significant negative effects, causing overwritten rows when creation of new ones is instead expected. This problem occurred across many different tables with different structures and columns (though the auto_increment column in all cases was a simple int primary key). All used the InnoDB engine. The problem does not appear immediately; I attempted to discover and manually change the auto_increment values of affected tables using information_schema.tables to look for tables where auto_increment was less than or equal to the max of the relevant column. As best I could tell, the auto_increment only became "wrong" once an insert (or possibly any update) was made. The result was that a scan over all tables for this problem would fix some tables, then later others would show the problem. Ultimately the only solution was to run ALTER TABLE x AUTO_INCREMENT = y for every table, even where the new autoinc value "y" was the same as what information_schema and SHOW CREATE TABLE reported already. As far as I can tell the problem no longer occurs after taking this step. I'm aware that since The affected tables ranged from ones that were several years old and originally created by older MySQL versions to those created by 5.7. Prior discussion on other issues leads me to believe it's therefore unlikely that the source of the problem is junk data written into the field repurposed for the persistent autoincrement value. Unfortunately as this is an RDS instance I don't have direct access to the data files to investigate. I only noted values that were too low (though admittedly too high would not have drawn the same kind of notice). As far as I can tell the initialization of the in-memory autoincrement goes wrong in these cases. My working theories include that initialization is setting the value to the first "gap" in the column, rather than above the max, or alternatively, the first INSERT/IODKU is "setting" the autoinc when it shouldn't. Some discussion in the prior issues noted that since 10.2 is getting old, this issue while still not solved wouldn't really occur much anymore. I filed this in part just to note that moving from MySQL continues to create a "vector" for this problem. |
| Comments |
| Comment by Marko Mäkelä [ 2024-01-19 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
If the data file had originally been created before MySQL 5.1.48, then this could share an explanation with However, a more likely explanation could be that the adjustment By the way, just like with upgrades from MySQL 5.6 to 5.7, you may expect some glitches related to undo logs, because in MySQL 5.7 the format was changed in an incompatible way. In MariaDB Server 10.2, the 5.6 compatible format was retained; see A possible fix would be to execute
on every InnoDB table that contains an AUTO_INCREMENT column. This should update the counter to the current maximum value of the column. Would that work for you? | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by John Flatness [ 2024-01-19 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
In terms of my own situation, I've already done an ALTER TABLE...AUTO_INCREMENT on everything (though I manually calculated the desired value rather than using 1). As far as I can tell that's resolved the problem. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Michael Widenius [ 2024-01-20 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Regarding ALTER TABLE t AUTO_INCREMENT=1; We could add to ha_check_for_upgrade() that if the table has an auto_increment column and the table was created with either of the above versions, we could return HA_ADMIN_NEEDS_CHECK which would ensure that the handler::check() functions is called. InnoDB could then check that the cached auto_increment column looks reasonable and fix it if not. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by John Flatness [ 2024-01-23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Just in my own situation, if mariadb_upgrade had set the persistent auto_increment for all these tables it would have avoided my problems (I presume, anyway). In the course of writing this message I was able to trigger the problem at least one way through poking at a much smaller staging server which had also been brought over from MySQL 5.7 to MariaDB 10.6. The table in question (though the details of the table don't seem to matter much)
and its contents (it hadn't been touched since the upgrade):
On first inspection of the running server, SHOW CREATE TABLE shows this table with an AUTO_INCREMENT of 22, as expected. On info from another related issue, I ran
on the table's ibd file to check/confirm the persistent autoinc value: as expected for an untouched table from MySQL 5.7, that returned all zeros. On a hunch as to the mechanism of the problem, I ran several close variants of this query (a simplification of what the application often runs when editing existing rows):
(I assume the last_insert_id() call here is irrelevant but it is in the queries the application uses so I included it in my tests) The displayed AUTO_INCREMENT value from SHOW CREATE TABLE was still correct throughout this process after running several IODKUs. I'd noticed from poking around at another pre-5.7 table that the ibd file didn't immediately show a change to the persistent value even after an autoincrementing insertion; figuring this was because the write was not yet on disk, I ran FLUSH TABLES and ultimately restarted the MariaDB server. On restart it now shows the problem: SHOW CREATE TABLE now shows AUTO_INCREMENT=4, and an IODKU at that point passing a NULL id indeed causes an update of the row with id 4 rather than creating a new one. In other words, the ID passed to those earlier IODKUs had been used as the "last" value for determining the AUTO_INCREMENT somewhere. An important detail: when checked again after the restart, that value of 3 was written out to the table on disk as the persistent autoinc (as shown by the "od" command). This test isn't an exact match for what happened on the live server. The live server hadn't been restarted at all since the upgrade but still showed this issue, but it has much more traffic and many more tables than the table open cache... I assume my restart of this smaller server maybe simulated a flushing of the table that could have happened on the live server just due to things being ejected from working sets or caches and written to disk. I'll have to check some more to see if just this is sufficient but the heart of the issue seems to be that running the IODKU that was actually an UPDATE set the persistent autoinc (but not the in-memory one) to the value used in the query. This matches with the behavior I saw where the "bad" autoinc values were always within the bounds of used IDs on the tables (in other words, never "too high," seemingly not random). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by John Flatness [ 2024-01-23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
OK, I tried this again in a simplified form, this time just this query one time (on another untouched copy of a table with the same structure):
and restarted the server. It shows the exact same symptoms: "od" shows 0x3 written into the ibd after the restart, and SHOW CREATE TABLE reports the AUTO_INCREMENT at 4, well under the max for the column and itself a conflicting value with an existing row. So no last_insert_id(), no multiple queries, no flush tables, etc, just the one IODKU and a restart. Still not really certain this is the exact problem as what I saw in production (not sure how, or if it's even possible, to have the in-memory autoinc counter reinitialized without a restart) but it's a way to get at least the same kind of result. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Thank you for confirming that the workaround works. I think that this should also affect upgrades from MariaDB Server 10.1 or earlier. I think that one customer did hit this problem when upgrading from 10.1 to 10.3, but sadly I did not remember the change The change
When using a newer version of GCC (I currently normally use 13.2.0), something in the C++ library would cause a conflict with byte somewhere. I was so far unable to produce a nonzero PAGE_MAX_TRX_ID a.k.a. PAGE_ROOT_AUTO_INC on the clustered index root page. I think that I’d better compile mysql-5.7.44 (the last release of the series) to reproduce this. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I did not get a nonzero field from MySQL 5.7.44 either, but I will keep trying. Maybe there is a difference with regard to trailing space. On my system, with both MySQL 5.7.44 and MariaDB Server "10.1.49" (which was never released), the table data would end at byte offset 0xcedd, well before the start of the page directory (0xffee). I think that I must grow the records a little, so that a page split will occur. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by John Flatness [ 2024-01-23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Is this what I would have been checking with
? If so, please note that the affected tables here have that value at zero. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Here is a simplified version of the SQL that produces a root page with the equivalent contents:
Because the duplicate key error would be flagged on the PRIMARY KEY, we might as well issue an UPDATE statement straight ahead. I then found a threshold for a page split by modifying the test as follows:
If I change the 1575 to 1576 in the UPDATE statement, the page will be split (the 16-bit field PAGE_LEVEL at 0xc040 will be 0x0001 instead of 0x0000), but the PAGE_MAX_TRX_ID will remain at 0. With the 1575, there will be 4 wasted bytes between the end of the last record and the sparse page directory (0xffea to 0xffed). I will have to check | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
zerocrates, yes, the PAGE_MAX_TRX_ID field of interest is the 8 bytes at the byte offset 0xc038 (3*16384 + 38 + 18). I guess that the problem is not that those bytes would be nonzero garbage, but something wrong in the upgrade logic. I’ll move to testing that. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I tried the following test on 10.6. It is showing the expected LAST_INSERT_ID() of 43.
I also tried changing the value written to the PAGE_MAX_TRX_ID to something else, like this:
The LAST_INSERT_ID() would be reported as 102, just like expected. So, it looks like there really must have been some garbage PAGE_MAX_TRX_ID in the file, but we do not know yet how. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I started a fishing expedition by running MySQL 5.7.44 compiled with the following patch, to catch any clustered index root page writes where the PAGE_MAX_TRX_ID would be nonzero:
This could create false alarms when using nonzero innodb_undo_tablespaces. But, no tests in the default suites failed due to this. The commit message of
This test reproduces the issue:
Before
I would see a new record with id=1294 or id=1289, instead of the expected id=43. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Because MariaDB Server 10.4 is soon reaching its end of life (EOL), I do not think that it makes sense to fix this bug there. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
monty, I think that I need your help with regard to the changes outside InnoDB. My suggested fix would be as follows: When opening a table (maybe only as part of running an upgrade script), if the .frm file indicates that there is an AUTO_INCREMENT column in an InnoDB table and the table definition had been created or last modified before MariaDB Server 10.2.4 or by MySQL, we’d reset the PAGE_ROOT_AUTO_INC in the clustered index root page, and possibly mark the .frm file in such a way that the check will only be executed once. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by John Flatness [ 2024-01-23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
So, in this comment in particular I described my reproducer: https://jira.mariadb.org/browse/MDEV-33277?focusedCommentId=278594&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-278594 It starts with a table with a zero PAGE_ROOT_AUTO_INC (coming from MySQL) and just doing one "updating" IODKU in 10.6 and restarting the server causes the field to be written as the duplicate value of the key. I assume this is a pretty much totally different route that leads to similar results? As yours is about creating a table that will come into the upgrade with a spurious value in that field, but mine starts with zero there and seems to be more a logic issue in the code that intentionally sets the persistent autoinc, if I had to hazard a guess. Regardless, I'd imagine your proposed solution would address both, plus maybe other undiscovered routes to the same result, by setting a proper PAGE_ROOT_AUTO_INC upfront. So the different methods of reproduction might not matter much, I suppose. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
zerocrates, I think that the value that is written is the last assigned value. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Michael Widenius [ 2024-01-24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
zerocrates I am not sure what you mean with your last comment "restarting the server causes the field to be written as the duplicate value of the key". | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by John Flatness [ 2024-01-24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The sequence goes like this: First the table has no PAGE_ROOT_AUTO_INC value. Server reports AUTO_INCREMENT as MAX(pk) + 1. I do a INSERT... ON DUPLICATE KEY UPDATE that doesn't insert (I choose a value for the PK that is in use, so it performs an update). The server reports AUTO_INCREMENT still at the same value as before. Restart the server. PAGE_ROOT_AUTO_INC for the table's .ibd file now contains the value used in the IODKU (the pk of the updated row). The server now reports AUTO_INCREMENT for the table as that PAGE_ROOT_AUTO_INC value plus 1. The AUTO_INCREMENT is now "too low." A normal INSERT not specifying a value for the PK at this point produces "ERROR 1062 (23000): Duplicate entry 'x' for key 'PRIMARY'". Doing an IODKU instead with NULL as the PK value does an UPDATE on an existing row rather than doing an insert as expected. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The field PAGE_ROOT_AUTO_INC is supposed to contain the last assigned AUTO_INCREMENT value of the table, provided it is larger than the previous value of the field. It is supposed to be set also if an INSERT or UPDATE statement specifies a value for the AUTO_INCREMENT column. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Based on discussion with monty, I created https://github.com/MariaDB/server/pull/3022, currently in draft state. The first commit would automatically correct obviously wrong AUTO_INCREMENT values. Correcting values that are larger than the current maximum value of the AUTO_INCREMENT column in the table would require CHECK TABLE…FOR UPGRADE to be implemented for InnoDB. That is what the second commit aims and currently fails to do. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I don’t think that we need to implement CHECK TABLE … FOR UPGRADE in InnoDB to fix this. A few examples, for cases where the table had been created before MariaDB 10.2.10:
Any other cases can be fixed manually by executing ALTER TABLE…AUTO_INCREMENT=1. If the column is defined as BIGINT UNSIGNED AUTO_INCREMENT, then the worst damage that can happen is that we the counter will jump close to 2⁴⁸ (assuming that the nonzero values only come from ALTER TABLE…IMPORT TABLESPACE and are transaction identifiers. In this case, we will still have 2⁶⁴-2⁴⁸=1.8664e19 available values for the AUTO_INCREMENT column. Note: the full range 2⁶⁴ is not much bigger than that: 1.8667e19. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-25 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I filed MDEV-33308 for a CHECK TABLE anomaly that I found while writing a regression test for this. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Michael Widenius [ 2024-01-26 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I have updated 10.5-MDEV-33277 with changes to do proper reporting with CHECK TABLE, CHECK TABLE ... FOR UPGRADE and mariadb-upgrade | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Debarun Banerjee [ 2024-02-07 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
https://github.com/MariaDB/server/pull/3022 looks fine to me. |