[MDEV-30009] InnoDB shutdown hangs when the change buffer is corrupted Created: 2022-11-14 Updated: 2023-09-08 Resolved: 2022-11-23 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.5, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11 |
| Fix Version/s: | 10.11.2, 10.5.19, 10.6.12, 10.7.8, 10.8.7, 10.9.5, 10.10.3 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Marko Mäkelä | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | regression-10.5 |
| Issue Links: |
|
| Description |
|
It looks like It turns out that as part of fixing
With that, we will observe a hang. The following includes the progress reporting of
On 10.5, this modified test will cause a crash when the purge of transaction history attempts to access a page that the test intentionally corrupted. I believe that the case that made me file |
| Comments |
| Comment by Thirunarayanan Balathandayuthapani [ 2022-11-15 ] | |
|
Patch is in bb-10.6- | |
| Comment by Marko Mäkelä [ 2022-11-16 ] | |
|
Thank you, this is a step in the right direction. I think that we also need to enable the logic in release builds. Furthermore, I think that we must ensure that the purge of transaction history will not add more entries to the change buffer during a slow shutdown. | |
| Comment by Marko Mäkelä [ 2022-11-23 ] | |
|
An rr replay trace that was produced with innodb_change_buffering_debug=1 shows that we must remove the change buffer records in the if (!bitmap_bits) code path of ibuf_merge_or_delete_for_page(), to avoid causing corruption. We got an rr replay trace in which changes had been buffered for a 3-field index, and the index was then dropped, re-created as a 2-field index, and stored in the same page. During ibuf_read_merge_pages(), we reset the bitmap bits but skipped the call to ibuf_delete_recs(). Finally, during a change buffer merge, we attempted to merge the stale 3-field records into the 2-field index page and hit an assertion failure in page_validate(), which misinterpreted the records of the new 2-field index page as 3-field ones:
This corruption occurred on an .ibd file. While ALTER TABLE…DROP/ADD INDEX was involved in this case, I believe that the corruption is possible even for DML, when lots of records are being deleted. In To avoid reintroducing the corruption | |
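The sequence described in the comment above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not InnoDB code: the class ToyChangeBuffer, its methods, the page number 7 and the record contents are all hypothetical, and the only thing modelled is the ordering of "reset the bitmap bit" versus "delete the buffered records" for a page that gets reused by an index with a different number of fields.

```python
from dataclasses import dataclass, field


@dataclass
class ToyChangeBuffer:
    """Hypothetical stand-in for the change buffer (not the InnoDB data structure)."""
    records: dict = field(default_factory=dict)  # page_no -> list of buffered records
    bitmap: dict = field(default_factory=dict)   # page_no -> True if changes are buffered

    def buffer_insert(self, page_no, rec):
        # Buffer a record for a page and flag the page in the bitmap.
        self.records.setdefault(page_no, []).append(rec)
        self.bitmap[page_no] = True

    def read_merge_page(self, page_no, delete_recs):
        # Models the read-merge step for a page whose index was dropped:
        # the bitmap bit is always reset; the buffered records are removed
        # only if delete_recs is true (the behaviour the comment asks for).
        self.bitmap[page_no] = False
        if delete_recs:
            self.records.pop(page_no, None)

    def merge(self, page_no, n_fields):
        # Apply all buffered records to the page and return those whose
        # field count does not match the index now stored in the page.
        if not self.bitmap.get(page_no):
            return []
        pending = self.records.pop(page_no, [])
        self.bitmap[page_no] = False
        return [rec for rec in pending if len(rec) != n_fields]


def scenario(delete_recs):
    ibuf = ToyChangeBuffer()
    ibuf.buffer_insert(7, ("a", "b", "pk"))  # change buffered for a 3-field index
    ibuf.read_merge_page(7, delete_recs)     # index dropped; bitmap bit reset
    ibuf.buffer_insert(7, ("a", "pk"))       # page reused by a 2-field index
    return ibuf.merge(7, n_fields=2)         # stale records that would be merged


print(scenario(delete_recs=False))  # the stale 3-field record survives
print(scenario(delete_recs=True))   # nothing stale is left to merge
```

In this model, skipping the deletion leaves the 3-field record in place, so it is picked up as soon as the bitmap bit is set again for the reused page. Deleting the records together with the bitmap reset keeps the two in sync.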
| Comment by Marko Mäkelä [ 2023-02-17 ] | |
|
This bug was featured in https://fosdem.org/2023/schedule/event/innodb_change_buffer/ |