Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
10.5, 10.6, 10.10(EOL), 10.11, 11.1(EOL), 11.2(EOL), 11.3(EOL)
-
None
Description
While testing bb-10.6-thiru, InnoDB hangs during shutdown and keeps writing the
following messages to the error log file:
2023-10-10 6:11:53 0 [Note] Completing change buffer merge; 1 page reads initiated; 3 change buffer pages remain
2023-10-10 6:12:08 0 [Note] Completing change buffer merge; 1 page reads initiated; 3 change buffer pages remain
2023-10-10 6:12:23 0 [Note] Completing change buffer merge; 1 page reads initiated; 3 change buffer pages remain
2023-10-10 6:12:38 0 [Note] Completing change buffer merge; 1 page reads initiated; 3 change buffer pages remain
Analysis:
=========
During shutdown, InnoDB calls ibuf_merge_or_delete_for_page() for the problematic page (0, 739). But the relevant bit for that
page in the bitmap page is already set to IBUF_BITMAP_FREE, so we fail to remove the entry (0, 739) from the change buffer index.
I therefore looked into ibuf_delete_recs(), which is where the entries for the problematic page (0, 739) were deleted. At that
time, the InnoDB change buffer index has only a root page and leaf pages; there are no internal nodes.
  mtr_t mtr;
loop:
  btr_pcur_t pcur;
  pcur.btr_cur.page_cur.index= ibuf.index;
  ibuf_mtr_start(&mtr);
  if (btr_pcur_open(&tuple, PAGE_CUR_GE, BTR_MODIFY_LEAF, &pcur, &mtr))
    goto func_exit;
  if (!btr_pcur_is_on_user_rec(&pcur))
  {
    ut_ad(btr_pcur_is_after_last_on_page(&pcur));
    goto func_exit;
  }

  for (;;)
  {
    ut_ad(btr_pcur_is_on_user_rec(&pcur));
    const rec_t* ibuf_rec = btr_pcur_get_rec(&pcur);
    if (ibuf_rec_get_space(&mtr, ibuf_rec) != page_id.space()
        || ibuf_rec_get_page_no(&mtr, ibuf_rec) != page_id.page_no())
      break;
    /* Delete the record from ibuf */
    if (ibuf_delete_rec(page_id, &pcur, &tuple, &mtr))
    {
      /* Deletion was pessimistic and mtr was committed:
      we start from the beginning again */
      ut_ad(mtr.has_committed());
      goto loop;
    }

    if (btr_pcur_is_after_last_on_page(&pcur))
    {
      ibuf_mtr_commit(&mtr);
      btr_pcur_close(&pcur);
      goto loop;
    }
btr_cur_open() searches with a tuple that contains only the page_id (0, 739). The root page has the following node pointers:
(..(0, 563)(child 60), (0, 739)(child 63)..). Because the search mode for a non-leaf node is PAGE_CUR_L, the search leads to
child page 60 and deletes the record of (0, 739) there. Once we reach the end of the page, we open the change buffer index
again and end up on page 60 again, where we fail to find the record. As a result, ibuf_delete_recs() fails to delete the
entries completely: even page 62, which is next to child page 60, still has a record for (0, 739).
Since the change buffer index is in the 5.5+ format, the primary key for the index is
{space, 0, page_no, counter}. But we fail to use the counter field when searching with the tuple.
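The search behaviour described above can be reproduced with a toy model. The following is only a sketch, not InnoDB code: Key, Tree, cmp_prefix(), descend(), delete_recs() and example_tree() are hypothetical names, the page layout is invented for illustration, and the constant 0 field of the key is left out. It assumes that a tuple holding only (space, page_no) compares as equal to every record with that prefix, and that the non-leaf step picks the last node pointer strictly less than the tuple.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified change buffer record key. The real 5.5+ key is
// {space, 0, page_no, counter}; the constant 0 field is omitted here.
struct Key {
  uint32_t space;
  uint32_t page_no;
  uint32_t counter;
};

// Toy two-level tree: leaf[i] is reached through node_ptr[i], which holds
// the first key of that leaf at the time the tree was built.
struct Tree {
  std::vector<Key> node_ptr;
  std::vector<std::vector<Key>> leaf;
};

// Compare a record with a search tuple that has only (space, page_no).
// Records that differ only in the counter compare as equal.
int cmp_prefix(const Key &rec, uint32_t space, uint32_t page_no) {
  if (rec.space != space) return rec.space < space ? -1 : 1;
  if (rec.page_no != page_no) return rec.page_no < page_no ? -1 : 1;
  return 0;
}

// Non-leaf step in the spirit of PAGE_CUR_L: take the last node pointer
// that is strictly less than the tuple (the leftmost child otherwise).
std::size_t descend(const Tree &t, uint32_t space, uint32_t page_no) {
  std::size_t child = 0;
  for (std::size_t i = 1; i < t.node_ptr.size(); i++)
    if (cmp_prefix(t.node_ptr[i], space, page_no) < 0) child = i;
  return child;
}

// Model of the delete loop. Returns how many records for (space, page_no)
// are still in the tree afterwards.
std::size_t delete_recs(Tree &t, uint32_t space, uint32_t page_no) {
  for (;;) {
    std::vector<Key> &page = t.leaf[descend(t, space, page_no)];
    std::size_t pos = 0;  // leaf step, PAGE_CUR_GE: first record >= tuple
    while (pos < page.size() && cmp_prefix(page[pos], space, page_no) < 0)
      pos++;
    if (pos == page.size()) break;  // not on a user record: give up
    while (pos < page.size() && cmp_prefix(page[pos], space, page_no) == 0)
      page.erase(page.begin() + pos);
    if (pos < page.size()) break;  // next record is for another page: done
    // End of page reached: reopen the cursor from the root ("goto loop").
  }
  std::size_t left = 0;
  for (const auto &page : t.leaf)
    for (const Key &k : page)
      if (cmp_prefix(k, space, page_no) == 0) left++;
  return left;
}

// Hypothetical layout: records for (0, 739) span three leaf pages, which
// stand in for pages 60, 62 and 63 of the report.
Tree example_tree() {
  Tree t;
  t.leaf.push_back({Key{0, 563, 0}, Key{0, 563, 1}, Key{0, 739, 0}});
  t.leaf.push_back({Key{0, 739, 1}, Key{0, 739, 2}});
  t.leaf.push_back({Key{0, 739, 3}, Key{0, 800, 0}});
  for (const auto &page : t.leaf) t.node_ptr.push_back(page.front());
  return t;
}
```

In this model, delete_recs(t, 0, 739) on the tree from example_tree() removes only the single matching record on the first leaf. The second search descends to the same leaf, finds no user record there and stops, so the three records on the following two leaves are never visited.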
Thanks to vlad.lesin for helping me in analysing this issue.
Issue Links
- relates to MDEV-33699 write statements gets stuck during IO-bound insert benchmark with InnoDB (Open)