[MDEV-25979] Invalid undo page number written to DB_ROLL_PTR - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.3.5, 10.4.0, 10.5.0, 10.6.0
Fix Version/s: 10.3.31, 10.4.21, 10.5.12, 10.6.3
Component/s: Storage Engine - InnoDB
Labels:

Description

~~MDEV-15090~~ introduced a race condition in trx_undo_report_row_operation() that would be fixed by the following:

diff --git a/storage/innobase/trx/trx0rec.cc b/storage/innobase/trx/trx0rec.cc

index 9c7106facaf..e011b3f5d8e 100644

--- a/storage/innobase/trx/trx0rec.cc

+++ b/storage/innobase/trx/trx0rec.cc

@@ -2054,12 +2054,11 @@ trx_undo_report_row_operation(

 				goto err_exit;

-			mtr_commit(&mtr);

+			mtr.commit();

 		} else {

 			/* Success */

-			mtr_commit(&mtr);

 			undo->top_page_no = undo_block->page.id.page_no();

+			mtr.commit();

 			undo->top_offset  = offset;

 			undo->top_undo_no = trx->undo_no++;

 			undo->guess_block = undo_block;

This could cause various types of failure. In the trace that I analyzed most recently, the page had been evicted from the buffer pool by the LRU mechanism but not yet replaced with another page. Thus, the undo->top_page_no had been assigned to FIL_NULL, triggering an error message later when the DB_ROLL_PTR column in a B-tree record was dereferenced:

2021-06-17  7:10:52 17 [ERROR] [FATAL] InnoDB: Trying to read 4096 bytes at 17592186040320 outside the bounds of the file: .//undo001

Note: the reported byte offset is 4096*0xffffffff.

In another trace that I started analyzing earlier (with innodb_file_per_table=0 and no separate undo tablespaces), the error was something different. I suspect that the page had not only been evicted but the block descriptor started to point to a B-tree page. Thus, the DB_ROLL_PTR would wrongly point to a B-tree page instead of the undo log page that had just been written. Ultimately, an assertion failure would be triggered for attempting to interpret some bytes in the middle of a B-tree page as an undo log record.

I think that it should be relatively hard to hit this bug in practice. It would likely require using an extremely small buffer pool, so that page eviction or replacement could take place between those 2 statements.

Attachments

Issue Links

is caused by

MDEV-15090 Reduce the overhead of writing undo log records

Closed

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä created issue - 2021-06-21 15:00

Marko Mäkelä made changes - 2021-06-21 15:10

Field	Original Value	New Value
Link		This issue is caused by ~~MDEV-15090~~ [ ~~MDEV-15090~~ ]

Marko Mäkelä made changes - 2021-06-21 15:10

Fix Version/s		10.3 [ 22126 ]
Fix Version/s		10.4 [ 22408 ]
Fix Version/s		10.5 [ 23123 ]
Fix Version/s		10.6 [ 24028 ]
Affects Version/s		10.6.0 [ 24431 ]
Affects Version/s		10.5.0 [ 23709 ]
Affects Version/s		10.4.0 [ 23115 ]
Affects Version/s	10.4 [ 22408 ]
Assignee		Marko Mäkelä [ marko ]
Description		~~MDEV-15090~~ introduced a race condition in {{trx_undo_report_row_operation()}} that would be fixed by the following: {code:diff} diff --git a/storage/innobase/trx/trx0rec.cc b/storage/innobase/trx/trx0rec.cc index 9c7106facaf..e011b3f5d8e 100644 --- a/storage/innobase/trx/trx0rec.cc +++ b/storage/innobase/trx/trx0rec.cc @@ -2054,12 +2054,11 @@ trx_undo_report_row_operation( goto err_exit; } - mtr_commit(&mtr); + mtr.commit(); } else { /* Success / - mtr_commit(&mtr); - undo->top_page_no = undo_block->page.id.page_no(); + mtr.commit(); undo->top_offset = offset; undo->top_undo_no = trx->undo_no++; undo->guess_block = undo_block; {code} This could cause various types of failure. In the trace that I analyzed most recently, the page had been evicted from the buffer pool by the LRU mechanism but not yet replaced with another page. Thus, the {{undo->top_page_no}} had been assigned to {{FIL_NULL}}, triggering an error message later when the {{DB_ROLL_PTR}} column in a B-tree record was dereferenced: {noformat} 2021-06-17 7:10:52 17 [ERROR] [FATAL] InnoDB: Trying to read 4096 bytes at 17592186040320 outside the bounds of the file: .//undo001 {noformat} Note: the reported byte offset is 40960xffffffff. In another trace that I started analyzing earlier (with {{innodb_file_per_table=0}} and no separate undo tablespaces), the error was something different. I suspect that the page had not only been evicted but the block descriptor started to point to a B-tree page. Thus, the {{DB_ROLL_PTR}} would wrongly point to a B-tree page instead of the undo log page that had just been written. Ultimately, an assertion failure would be triggered for attempting to interpret some bytes in the middle of a B-tree page as an undo log record. I think that it should be relatively hard to hit this bug in practice. It would likely require using an extremely small buffer pool, so that page eviction or replacement could take place between those 2 statements.
Labels		corruption race regression-10.3 rr-profile-analyzed
Priority	Major [ 3 ]	Critical [ 2 ]

Marko Mäkelä added a comment - 2021-06-21 16:41

I confirmed last week’s mystery to be another case of this race condition:

ssh pluto

rr replay /data/Results/1623843754/TBR-1118/dev/shm/vardir/1623843754/161/1/rr/latest-trace

(rr) display/x $25.page.id_

4: /x $25.page.id_ = {m_id = 0x896}

(rr) cond 1 request.bpage.id_.m_id==0x896

(rr) c

Continuing.

2021-06-16 12:03:04 34 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction

[Switching to Thread 10382.16272]

Thread 17 hit Breakpoint 1, buf_page_write_complete (request=...) at /data/Server/bb-10.6-MDEV-25062F/storage/innobase/buf/buf0flu.cc:325

325	{

4: /x $25.page.id_ = {m_id = 0x896}

(rr) thr a 36 bt

Thread 36 (Thread 10382.30267):

…

#13 fil_space_t::x_unlock (this=0x6120000229c0) at /data/Server/bb-10.6-MDEV-25062F/storage/innobase/include/fil0fil.h:1033

#14 ReleaseLatches::operator() (slot=0x1bfa33a40c60, this=0x1bfa33a400b0) at /data/Server/bb-10.6-MDEV-25062F/storage/innobase/mtr/mtr0mtr.cc:260

#15 CIterate<ReleaseLatches>::operator() (block=<optimized out>, this=0x1bfa33a400b0) at /data/Server/bb-10.6-MDEV-25062F/storage/innobase/mtr/mtr0mtr.cc:61

#16 mtr_buf_t::for_each_block_in_reverse<CIterate<ReleaseLatches> > (this=this@entry=0x1bfa33a40c08, functor=...) at /data/Server/bb-10.6-MDEV-25062F/storage/innobase/include/dyn0buf.h:379

#17 0x000055bfa90dcfd2 in mtr_t::commit (this=this@entry=0x1bfa33a40bf0) at /data/Server/bb-10.6-MDEV-25062F/storage/innobase/mtr/mtr0mtr.cc:42

#18 0x000055bfa93eeb3b in trx_undo_report_row_operation (thr=thr@entry=0x6250007b0af8, index=index@entry=0x616002ed0208, clust_entry=clust_entry@entry=0x6220000d09a0, update=update@entry=0x0, cmpl_info=cmpl_info@entry=0, rec=rec@entry=0x3881559ca1b6 "\200", offsets=<optimized out>, roll_ptr=<optimized out>) at /data/Server/bb-10.6-MDEV-25062F/storage/innobase/trx/trx0rec.cc:2147

That is, the undo log page that this thread had just been written is being evicted from the buffer pool. Before we copied the page number from the page descriptor, that page descriptor had been reassigned to the B-tree page 0x884. Finally, we would fail like this, because there was no valid undo log record in the middle of the B-tree page:

mysqld: /data/Server/bb-10.6-MDEV-25062F/storage/innobase/trx/trx0rec.cc:503: byte* trx_undo_rec_get_pars(trx_undo_rec_t*, ulint*, ulint*, bool*, undo_no_t*, table_id_t*): Assertion `*type >= TRX_UNDO_RENAME_TABLE' failed.

Marko Mäkelä added a comment - 2021-06-21 16:41 I confirmed last week’s mystery to be another case of this race condition: ssh pluto rr replay /data/Results/1623843754/TBR-1118/dev/shm/vardir/1623843754/161/1/rr/latest-trace (rr) display/x $25.page.id_ 4: /x $25.page.id_ = {m_id = 0x896} (rr) cond 1 request.bpage.id_.m_id==0x896 (rr) c Continuing. 2021-06-16 12:03:04 34 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction [Switching to Thread 10382.16272] Thread 17 hit Breakpoint 1, buf_page_write_complete (request=...) at /data/Server/bb-10.6-MDEV-25062F/storage/innobase/buf/buf0flu.cc:325 325 { 4: /x $25.page.id_ = {m_id = 0x896} (rr) thr a 36 bt Thread 36 (Thread 10382.30267): … #13 fil_space_t::x_unlock (this=0x6120000229c0) at /data/Server/bb-10.6-MDEV-25062F/storage/innobase/include/fil0fil.h:1033 #14 ReleaseLatches::operator() (slot=0x1bfa33a40c60, this=0x1bfa33a400b0) at /data/Server/bb-10.6-MDEV-25062F/storage/innobase/mtr/mtr0mtr.cc:260 #15 CIterate<ReleaseLatches>::operator() (block=<optimized out>, this=0x1bfa33a400b0) at /data/Server/bb-10.6-MDEV-25062F/storage/innobase/mtr/mtr0mtr.cc:61 #16 mtr_buf_t::for_each_block_in_reverse<CIterate<ReleaseLatches> > (this=this@entry=0x1bfa33a40c08, functor=...) at /data/Server/bb-10.6-MDEV-25062F/storage/innobase/include/dyn0buf.h:379 #17 0x000055bfa90dcfd2 in mtr_t::commit (this=this@entry=0x1bfa33a40bf0) at /data/Server/bb-10.6-MDEV-25062F/storage/innobase/mtr/mtr0mtr.cc:42 #18 0x000055bfa93eeb3b in trx_undo_report_row_operation (thr=thr@entry=0x6250007b0af8, index=index@entry=0x616002ed0208, clust_entry=clust_entry@entry=0x6220000d09a0, update=update@entry=0x0, cmpl_info=cmpl_info@entry=0, rec=rec@entry=0x3881559ca1b6 "\200", offsets=<optimized out>, roll_ptr=<optimized out>) at /data/Server/bb-10.6-MDEV-25062F/storage/innobase/trx/trx0rec.cc:2147 That is, the undo log page that this thread had just been written is being evicted from the buffer pool. Before we copied the page number from the page descriptor, that page descriptor had been reassigned to the B-tree page 0x884. Finally, we would fail like this, because there was no valid undo log record in the middle of the B-tree page: mysqld: /data/Server/bb-10.6-MDEV-25062F/storage/innobase/trx/trx0rec.cc:503: byte* trx_undo_rec_get_pars(trx_undo_rec_t*, ulint*, ulint*, bool*, undo_no_t*, table_id_t*): Assertion `*type >= TRX_UNDO_RENAME_TABLE' failed.

Marko Mäkelä made changes - 2021-06-21 16:44

issue.field.resolutiondate

2021-06-21 16:44:19.0

2021-06-21 16:44:19.907

Marko Mäkelä made changes - 2021-06-21 16:44

Fix Version/s		10.6.3 [ 25904 ]
Fix Version/s		10.3.30 [ 25732 ]
Fix Version/s		10.4.20 [ 25733 ]
Fix Version/s		10.5.11 [ 25734 ]
Fix Version/s	10.3 [ 22126 ]
Fix Version/s	10.4 [ 22408 ]
Fix Version/s	10.5 [ 23123 ]
Fix Version/s	10.6 [ 24028 ]
Resolution		Fixed [ 1 ]
Status	Open [ 1 ]	Closed [ 6 ]

Marko Mäkelä made changes - 2021-06-22 07:46

Fix Version/s		10.3.31 [ 26028 ]
Fix Version/s		10.4.21 [ 26030 ]
Fix Version/s		10.5.12 [ 26025 ]
Fix Version/s	10.3.30 [ 25732 ]
Fix Version/s	10.4.20 [ 25733 ]
Fix Version/s	10.5.11 [ 25734 ]

Sergei Golubchik made changes - 2021-12-06 21:53

Workflow

MariaDB v3 [ 122887 ]

MariaDB v4 [ 159424 ]

People

Assignee:: Marko Mäkelä

Reporter:: Marko Mäkelä

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Dates

Created:: 2021-06-21 15:00

Updated:: 2023-12-06 20:06

Resolved:: 2021-06-21 16:44

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server