[MDEV-24653] Assertion `block->page.id().page_no() == index->page' failed in innobase_instant_try Created: 2021-01-22 Updated: 2021-03-22 Resolved: 2021-01-25 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Data Definition - Alter Table, Storage Engine - InnoDB |
| Affects Version/s: | 10.3, 10.4, 10.5, 10.6 |
| Fix Version/s: | 10.3.28, 10.4.18, 10.5.9 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Roel Van de Paar | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | not-10.2, regression, rr-profile |
| Issue Links: |
|
| Description |
|
Possibly related to
Leads to:
Nothing relevant observed in ASAN/UBSAN. No obvious problems on optimized builds. |
| Comments |
| Comment by Roel Van de Paar [ 2021-01-22 ] |
|
Alternative testcase
|
| Comment by Roel Van de Paar [ 2021-01-22 ] |
|
rr trace available on rr box, based on alternative testcase
|
| Comment by Marko Mäkelä [ 2021-01-22 ] |
|
In the rr replay trace, I see that the table is empty except for the metadata record, but the leaf page is not the root page. The root page (number 3) contains a pointer to the empty child page (number 4). The assertion fails, because an empty table is assumed to consist of an empty leaf page. A similar false assumption was fixed as a part of |
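The tree state described in this comment can be illustrated with a toy model. The sketch below is not InnoDB code: `Page`, `Index`, `leftmost_leaf`, `table_is_empty`, and `old_assumption_holds` are hypothetical names invented for illustration. The only details taken from the report are the page numbers (root 3, child 4) and the presence of a metadata record.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    page_no: int
    level: int  # 0 means leaf; anything above 0 is a non-leaf page
    user_records: list = field(default_factory=list)
    children: list = field(default_factory=list)  # child page numbers
    has_metadata_record: bool = False


@dataclass
class Index:
    root: int   # root page number, playing the role of index->page
    pages: dict  # page number -> Page


def leftmost_leaf(index):
    """Descend from the root along the first child pointer to a leaf."""
    page = index.pages[index.root]
    while page.level > 0:
        page = index.pages[page.children[0]]
    return page


def leaves(index):
    """Collect every leaf page reachable from the root."""
    pending, found = [index.root], []
    while pending:
        page = index.pages[pending.pop()]
        if page.level == 0:
            found.append(page)
        else:
            pending.extend(page.children)
    return found


def table_is_empty(index):
    """Empty means no user records; the metadata record does not count."""
    return all(not leaf.user_records for leaf in leaves(index))


def old_assumption_holds(index):
    """The assumption behind the failing assertion: the leaf is the root."""
    return leftmost_leaf(index).page_no == index.root


# The shape the assertion expects: a single page that is both root and leaf.
flat = Index(root=3, pages={3: Page(3, 0, has_metadata_record=True)})

# The shape seen in the rr trace: root page 3 points to empty leaf page 4.
tall = Index(root=3, pages={
    3: Page(3, 1, children=[4]),
    4: Page(4, 0, has_metadata_record=True),
})
```

In this model both `flat` and `tall` are empty tables, but only `flat` satisfies the root-equals-leaf assumption, which is why emptiness alone cannot justify that check.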
| Comment by Roel Van de Paar [ 2021-01-23 ] |
|
I tested first against bb-10.6- |
| Comment by Roel Van de Paar [ 2021-01-23 ] |
|
Confirmed that the version used originally in this bug (9118fd360a3da0bba521caf2a35c424968235ac4) with the patch at the top of bb-10.6- |
| Comment by Roel Van de Paar [ 2021-01-23 ] |
|
Tested the same on 10.5 (rev 139c85aafd4e4938f95843d44a455265a49b572e, where the issue reproduces fine) with the same patch. The bug also disappears. The same tests were executed, so the patch works on 10.5 as well. |
| Comment by Roel Van de Paar [ 2021-01-23 ] |
|
Non-patched version 10.6.0 at rev 9118fd360a3da0bba521caf2a35c424968235ac4 (dbg, build 19-Jan-21) still shows the issue. It may be that issue reproducibility depends on the hardware used. |
| Comment by Roel Van de Paar [ 2021-01-23 ] |
|
Non-patched version 10.6.0 at 0d7380fdac1add2be3fdb57519ccd8ac9c8e12bc (head of trunk, build today) still shows the issue. |
| Comment by Roel Van de Paar [ 2021-01-23 ] |
|
TL;DR: the patch resolves the issue on both 10.6 and 10.5. It remains unclear why some non-patched mysqld builds do not reproduce the issue. It seemed to be related to version, hardware, or cmake options, but none of those was conclusive, and the cause is likely elsewhere. |
| Comment by Marko Mäkelä [ 2021-01-25 ] |
|
Roel, I created what I believe is a reproducible test case for this. The randomness that you experienced was probably related to the timing of the purge of history. I added an explicit wait for it.
Using this test case, I figured out why we end up with this situation. Because of innodb_limit_optimistic_insert_debug = 2, in the end we will only have the hidden ADD COLUMN metadata record in page 4. The right sibling of the page is deleted in purge:
At this point, we are missing the opportunity to raise the surviving sibling page up by one level, which would be the opposite of the btr_page_split_and_insert() operation. I think that my original proposed fix is the lowest-risk resolution for this bug. Optimizing the B-tree space management (in the above stack trace) would be rather risky and best done in a development branch, not in GA versions. |
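The missed opportunity described here, lifting the last surviving child into its parent once a sibling has been freed, can be sketched on a toy tree. This is only an illustration under stated assumptions, not InnoDB's B-tree code: the dictionary layout, `build_tree`, `free_empty_leaf`, and `lift_single_child` are hypothetical, and page number 5 for the right sibling is invented for the example (the report only names pages 3 and 4).

```python
def build_tree():
    """Root page 3 with two leaf children: page 4 and a right sibling."""
    return {
        3: {"level": 1, "children": [4, 5], "records": []},
        4: {"level": 0, "children": [], "records": ["metadata"]},
        5: {"level": 0, "children": [], "records": []},  # emptied by purge
    }


def free_empty_leaf(pages, parent_no, victim_no):
    """Drop an empty leaf and its pointer in the parent.

    After this step the parent may be left with a single child, which is
    the state described in the report.
    """
    pages[parent_no]["children"].remove(victim_no)
    del pages[victim_no]


def lift_single_child(pages, root_no):
    """Hypothetical shrink step, the reverse of a page split.

    While the root is a non-leaf page with exactly one child, move the
    child's contents into the root and free the child page.
    """
    root = pages[root_no]
    while root["level"] > 0 and len(root["children"]) == 1:
        child = pages.pop(root["children"][0])
        root.update(child)  # root takes over level, children, and records
    return root
```

Calling `free_empty_leaf(pages, 3, 5)` alone reproduces the shape from the rr trace (a level-1 root with one child). Following it with `lift_single_child(pages, 3)` leaves a single page that is both root and leaf, which is the shape the failing assertion expected.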
| Comment by Roel Van de Paar [ 2021-01-25 ] |
|
Thank you marko. |
| Comment by Marko Mäkelä [ 2021-01-25 ] |
|
I adjusted btr_pcur_store_position() as well. |
| Comment by Marko Mäkelä [ 2021-01-25 ] |
|
This fix is now merged up to the 10.6 branch. I filed MDEV-24673 for the failure to shrink the index tree. |