[MDEV-32132] DROP INDEX followed by CREATE INDEX may corrupt data Created: 2023-09-08 Updated: 2023-12-01 Resolved: 2023-09-08 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.2.2, 10.2, 10.3, 10.4 |
| Fix Version/s: | 10.4.32 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Marko Mäkelä | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | corruption, crash, upstream |
| Issue Links: |
|
| Description |
|
In The scenario is as follows (adapting from
Notes:
The bug in index creation is that ibuf_set_bitmap_for_bulk_load() would fail to invoke ibuf_delete_recs() to remove any stale buffered entries for a page that is being reused for the secondary index that is being created. When This ticket is about porting the fix of ibuf_set_bitmap_for_bulk_load() to MariaDB Server 10.4. |
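The failure mode described above can be illustrated with a toy model. This is only a sketch: the class and function names below are hypothetical and are not InnoDB's real data structures, and the model assumes that a stale buffered entry is left behind for the freed page. It shows why skipping the purge step lets an entry for the old index end up in the page of the new index.

```python
# Toy model only: none of these names are InnoDB's real data structures.
from collections import defaultdict


class ToyChangeBuffer:
    """Buffers secondary-index changes per (tablespace id, page number)."""

    def __init__(self):
        self.pending = defaultdict(list)

    def buffer_insert(self, space_id, page_no, record):
        self.pending[(space_id, page_no)].append(record)

    def delete_recs(self, space_id, page_no):
        # Plays the role the ticket assigns to ibuf_delete_recs().
        self.pending.pop((space_id, page_no), None)

    def merge(self, space_id, page_no, page):
        # Apply buffered records when the page is next read.
        page.extend(self.pending.pop((space_id, page_no), []))


def reuse_page_for_new_index(ibuf, space_id, page_no, rows, purge_stale):
    """Bulk-load `rows` into a freed page, then simulate a later page read."""
    if purge_stale:
        ibuf.delete_recs(space_id, page_no)
    page = list(rows)
    ibuf.merge(space_id, page_no, page)
    return page


def demo(purge_stale):
    ibuf = ToyChangeBuffer()
    # Assumption: a change for the old index was buffered, and the index
    # was dropped and its page 7 freed without discarding that entry.
    ibuf.buffer_insert(space_id=1, page_no=7, record="old-index-entry")
    return reuse_page_for_new_index(ibuf, 1, 7, ["new-a", "new-b"], purge_stale)


print(demo(purge_stale=False))  # stale entry is merged into the new page
print(demo(purge_stale=True))   # page holds only the bulk-loaded rows
```

In this model the only difference between the two runs is whether the purge step is executed before the page is reused, which mirrors the missing ibuf_delete_recs() call named in the description.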
| Comments |
| Comment by Manuel Arostegui [ 2023-09-14 ] |
|
We believe we've been hit by this bug when upgrading to 10.4.31. We saw one crash on a specific table with:
We thought it could have been a one-time thing, so we recloned the host. Today we saw the same crash again, with the same table.
After spending quite some time with Marko today looking at some gdb traces, it looks like we've been hit by this bug on this table, which had a DROP+CREATE index a few years ago while the change buffer was enabled.
Once that table was optimized, replication started to flow again, but the host crashed with the same exception, this time on a different table, which also had a schema change a few years ago and which turned out to have corrupted indexes:
For now we are optimizing all the tables on this host. |
| Comment by Manuel Arostegui [ 2023-09-19 ] |
|
For the record, even after optimizing the broken tables (and after CHECK TABLE reported them as OK), the host kept crashing with the same issue (and on the same table). I've downgraded the host back to its previous version, 10.4.28, and recloned it from the same source host it had been cloned from when it crashed, and so far there have been no crashes. There must be some difference between 10.4.28 and 10.4.31, perhaps combined with that table's specific usage, data, or schema, that makes it keep crashing on 10.4.31 but not on 10.4.28. |
| Comment by Marko Mäkelä [ 2023-09-20 ] |
|
marostegui, there are two InnoDB global variables that will prevent OPTIMIZE TABLE from actually rebuilding the table. ALTER TABLE…FORCE is immune to those settings:
If neither parameter applies here, I do not know what could be the problem. Any table rebuild will assign a new tablespace ID and therefore disassociate any old change buffer entries from the new data file. I think that further root cause analysis of the corruption would be needed. Crashes on corruption were fixed in MariaDB Server 10.6 in |
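As a rough sketch of how the ALTER TABLE…FORCE rebuild mentioned above could be applied to many tables, the following Python helper generates one statement per table. The helper names and the schema and table names are hypothetical and only for illustration; the output would still need to be reviewed and run with a client of your choice.

```python
def quote_identifier(name):
    """Quote an identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def force_rebuild_statements(tables):
    """Return one ALTER TABLE ... FORCE statement per (schema, table) pair."""
    return [
        f"ALTER TABLE {quote_identifier(schema)}.{quote_identifier(table)} FORCE;"
        for schema, table in tables
    ]


# Hypothetical table names, for illustration only.
for statement in force_rebuild_statements([("app", "orders"), ("app", "users")]):
    print(statement)
```

Generating the statements as text, rather than executing them directly, keeps the sketch independent of any particular database driver.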
| Comment by Manuel Arostegui [ 2023-09-20 ] |
|
Thanks for following up, Marko. Neither of those variables is set, and we indeed do not have a FULLTEXT index on that table. We are in the process of migrating from 10.4 to 10.6, but this will take many more months, so we keep upgrading within the 10.4 series. Thanks again for your time. |