[MDEV-13228] Assertion `n < rec_offs_n_fields(offsets)' failed in rec_get_nth_field_offs upon crash recovery with compressed table - Jira

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.2.7
Fix Version/s: 10.2.7
Component/s: Storage Engine - InnoDB
Labels:
- regression

Description

Note: This might be just a bad assertion, as nothing obviously bad seems to happen on a non-debug build, but it needs to be checked at least.

--source include/have_innodb.inc
CREATE TABLE t1 (pk INT PRIMARY KEY, i INT) ENGINE=InnoDB ROW_FORMAT=COMPRESSED;
INSERT INTO t1 VALUES (1, 10);
UPDATE t1 SET pk = 2 WHERE pk = 1;
--let $shutdown_timeout= 0
--source include/restart_mysqld.inc

10.2 92f1837a27f4b78a3e6c74ca33c3052211069af5
2017-07-01 19:39:53 139914949132352 [Note] InnoDB: Highest supported file format is Barracuda.
2017-07-01 19:39:53 139914949132352 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1633883
2017-07-01 19:39:53 139914949132352 [Note] InnoDB: Starting final batch to recover 12 pages from redo log.
mysqld: /data/src/10.2/storage/innobase/include/rem0rec.ic:1078: ulint rec_get_nth_field_offs(const ulint, ulint, ulint): Assertion `n < rec_offs_n_fields(offsets)' failed.
170701 19:39:53 [ERROR] mysqld got signal 6 ;

#7 0x00007f407aa19ee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#8 0x0000556a9a6f2666 in rec_get_nth_field_offs (offsets=0x7f406effbfb0, n=2, len=0x7f406effbeb0) at /data/src/10.2/storage/innobase/include/rem0rec.ic:1078
#9 0x0000556a9a6fcb02 in page_zip_write_trx_id_and_roll_ptr (page_zip=0x7f4074143148, rec=0x7f407442007e "\200", offsets=0x7f406effbfb0, trx_id_col=1, trx_id=1286, roll_ptr=1125899928207632) at /data/src/10.2/storage/innobase/page/page0zip.cc:4185
#10 0x0000556a9a7b0f08 in row_upd_rec_sys_fields_in_recovery (rec=0x7f407442007e "\200", page_zip=0x7f4074143148, offsets=0x7f406effbfb0, pos=1, trx_id=1286, roll_ptr=1125899928207632) at /data/src/10.2/storage/innobase/row/row0upd.cc:467
#11 0x0000556a9a86243a in btr_cur_parse_del_mark_set_clust_rec (ptr=0x7f407418ceb5 "", end_ptr=0x7f407418ceb5 "", page=0x7f4074420000 "", page_zip=0x7f4074143148, index=0x7f4058001768) at /data/src/10.2/storage/innobase/btr/btr0cur.cc:4613
#12 0x0000556a9a6b60d4 in recv_parse_or_apply_log_rec_body (type=MLOG_COMP_REC_CLUST_DELETE_MARK, ptr=0x7f407418cea4 "", end_ptr=0x7f407418ceb5 "", space_id=4, page_no=3, apply=true, block=0x7f4074143120, mtr=0x7f406effc590) at /data/src/10.2/storage/innobase/log/log0recv.cc:1272
#13 0x0000556a9a6b7b36 in recv_recover_page (just_read_in=true, block=0x7f4074143120) at /data/src/10.2/storage/innobase/log/log0recv.cc:1814
#14 0x0000556a9a89656e in buf_page_io_complete (bpage=0x7f4074143120, evict=false) at /data/src/10.2/storage/innobase/buf/buf0buf.cc:6021
#15 0x0000556a9a922b58 in fil_aio_wait (segment=2) at /data/src/10.2/storage/innobase/fil/fil0fil.cc:5481
#16 0x0000556a9a7cc90b in io_handler_thread (arg=0x556a9be53b90 <n+16>) at /data/src/10.2/storage/innobase/srv/srv0start.cc:343
#17 0x00007f407c95e494 in start_thread (arg=0x7f406effd700) at pthread_create.c:333
#18 0x00007f407aad693f in clone () from /lib/x86_64-linux-gnu/libc.so.6

The problem appeared in 10.2 tree with this revision:

commit c436338d9d535aac7692d27ec1dc068e9ce6c9a9 53235cbb1f0d41654935563633a5c61a2caa9495

Author: Marko Mäkelä <marko.makela@mariadb.com>

Date:   Fri Jun 30 18:51:51 2017 +0300

    Assert that DB_TRX_ID must be set on delete-marked records

Attachments

Activity

Elena Stepanova added a comment - 2017-07-01 16:49

ATTN jplindst

Marko Mäkelä added a comment - 2017-07-03 09:10

Sorry, this was a simple off-by-one error. I think that it should affect debug builds only, because the out-of-bounds access occurred inside a debug assertion:

	ut_ad(field + DATA_TRX_ID_LEN
	      == rec_get_nth_field(rec, offsets, trx_id_col + 1, &len));

The fix is simple:

diff --git a/storage/innobase/btr/btr0cur.cc b/storage/innobase/btr/btr0cur.cc
index a672f451ea9..14b24bcd9fd 100644
--- a/storage/innobase/btr/btr0cur.cc
+++ b/storage/innobase/btr/btr0cur.cc
@@ -4601,7 +4601,7 @@ btr_cur_parse_del_mark_set_clust_rec(
 		btr_rec_set_deleted_flag(rec, page_zip, val);
 		ut_ad(pos <= MAX_REF_PARTS);
-		ulint offsets[REC_OFFS_HEADER_SIZE + MAX_REF_PARTS + 2];
+		ulint offsets[REC_OFFS_HEADER_SIZE + MAX_REF_PARTS + 3];
 		rec_offs_init(offsets);
 		mem_heap_t*	heap	= NULL;
@@ -4609,7 +4609,7 @@ btr_cur_parse_del_mark_set_clust_rec(
 			row_upd_rec_sys_fields_in_recovery(
 				rec, page_zip,
 				rec_get_offsets(rec, index, offsets,
-						pos + 1, &heap),
+						pos + 2, &heap),
 				pos, trx_id, roll_ptr);
 		} else {
 			/* In delete-marked records, DB_TRX_ID must

Jan Lindström (Inactive) added a comment - 2017-07-03 09:22

It would be good if you could add a comment explaining why +3 is needed (e.g. in rem/rem0rec.cc: ulint offsets_[REC_OFFS_HEADER_SIZE + MAX_REF_PARTS + 2]) and why pos + 2 is passed.

Marko Mäkelä added a comment - 2017-07-03 10:04

Sorry, I already pushed the patch. Initializing the first pos+2 elements of the offsets allows us to access the clustered index fields DB_TRX_ID (pos) and DB_ROLL_PTR (pos+1).
On second thought, the memory array size was sufficiently large for MAX_REF_PARTS+2 index fields.
