[MDEV-13228] Assertion `n < rec_offs_n_fields(offsets)' failed in rec_get_nth_field_offs upon crash recovery with compressed table
Created: 2017-07-01  Updated: 2017-07-03  Resolved: 2017-07-03

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.2.7
Fix Version/s: 10.2.7

Type: Bug
Priority: Critical
Reporter: Elena Stepanova
Assignee: Marko Mäkelä
Resolution: Fixed
Votes: 0
Labels: regression


 Description   

Note: This might be just a bad assertion, since nothing obviously bad seems to happen on a non-debug build, but it should at least be checked.

--source include/have_innodb.inc
 
CREATE TABLE t1 (pk INT PRIMARY KEY, i INT) ENGINE=InnoDB ROW_FORMAT=COMPRESSED;
INSERT INTO t1 VALUES  (1, 10);
UPDATE t1 SET pk = 2 WHERE pk = 1;
 
--let $shutdown_timeout= 0
--source include/restart_mysqld.inc

10.2 92f1837a27f4b78a3e6c74ca33c3052211069af5

2017-07-01 19:39:53 139914949132352 [Note] InnoDB: Highest supported file format is Barracuda.
2017-07-01 19:39:53 139914949132352 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1633883
2017-07-01 19:39:53 139914949132352 [Note] InnoDB: Starting final batch to recover 12 pages from redo log.
mysqld: /data/src/10.2/storage/innobase/include/rem0rec.ic:1078: ulint rec_get_nth_field_offs(const ulint*, ulint, ulint*): Assertion `n < rec_offs_n_fields(offsets)' failed.
170701 19:39:53 [ERROR] mysqld got signal 6 ;
 
#7  0x00007f407aa19ee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#8  0x0000556a9a6f2666 in rec_get_nth_field_offs (offsets=0x7f406effbfb0, n=2, len=0x7f406effbeb0) at /data/src/10.2/storage/innobase/include/rem0rec.ic:1078
#9  0x0000556a9a6fcb02 in page_zip_write_trx_id_and_roll_ptr (page_zip=0x7f4074143148, rec=0x7f407442007e "\200", offsets=0x7f406effbfb0, trx_id_col=1, trx_id=1286, roll_ptr=1125899928207632) at /data/src/10.2/storage/innobase/page/page0zip.cc:4185
#10 0x0000556a9a7b0f08 in row_upd_rec_sys_fields_in_recovery (rec=0x7f407442007e "\200", page_zip=0x7f4074143148, offsets=0x7f406effbfb0, pos=1, trx_id=1286, roll_ptr=1125899928207632) at /data/src/10.2/storage/innobase/row/row0upd.cc:467
#11 0x0000556a9a86243a in btr_cur_parse_del_mark_set_clust_rec (ptr=0x7f407418ceb5 "", end_ptr=0x7f407418ceb5 "", page=0x7f4074420000 "", page_zip=0x7f4074143148, index=0x7f4058001768) at /data/src/10.2/storage/innobase/btr/btr0cur.cc:4613
#12 0x0000556a9a6b60d4 in recv_parse_or_apply_log_rec_body (type=MLOG_COMP_REC_CLUST_DELETE_MARK, ptr=0x7f407418cea4 "", end_ptr=0x7f407418ceb5 "", space_id=4, page_no=3, apply=true, block=0x7f4074143120, mtr=0x7f406effc590) at /data/src/10.2/storage/innobase/log/log0recv.cc:1272
#13 0x0000556a9a6b7b36 in recv_recover_page (just_read_in=true, block=0x7f4074143120) at /data/src/10.2/storage/innobase/log/log0recv.cc:1814
#14 0x0000556a9a89656e in buf_page_io_complete (bpage=0x7f4074143120, evict=false) at /data/src/10.2/storage/innobase/buf/buf0buf.cc:6021
#15 0x0000556a9a922b58 in fil_aio_wait (segment=2) at /data/src/10.2/storage/innobase/fil/fil0fil.cc:5481
#16 0x0000556a9a7cc90b in io_handler_thread (arg=0x556a9be53b90 <n+16>) at /data/src/10.2/storage/innobase/srv/srv0start.cc:343
#17 0x00007f407c95e494 in start_thread (arg=0x7f406effd700) at pthread_create.c:333
#18 0x00007f407aad693f in clone () from /lib/x86_64-linux-gnu/libc.so.6

The problem appeared in the 10.2 tree with this revision:

commit c436338d9d535aac7692d27ec1dc068e9ce6c9a9 53235cbb1f0d41654935563633a5c61a2caa9495
Author: Marko Mäkelä <marko.makela@mariadb.com>
Date:   Fri Jun 30 18:51:51 2017 +0300
 
    Assert that DB_TRX_ID must be set on delete-marked records



 Comments   
Comment by Elena Stepanova [ 2017-07-01 ]

ATTN jplindst

Comment by Marko Mäkelä [ 2017-07-03 ]

Sorry, this was a simple off-by-one error. I think that it should affect debug builds only, because the out-of-bounds access occurred inside a debug assertion:

	ut_ad(field + DATA_TRX_ID_LEN
	      == rec_get_nth_field(rec, offsets, trx_id_col + 1, &len));
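
To illustrate why a release build would not be affected: a debug-only assertion expands to nothing in non-debug builds, so the expression inside it (here, the out-of-bounds field lookup) is never evaluated. The following is a minimal sketch, not InnoDB code. The macro names ASSERT_DEBUG_BUILD and ASSERT_RELEASE_BUILD and the function checked_lookup are hypothetical stand-ins, defined side by side so that a single program can compare the two build flavours.

```cpp
#include <cstdlib>

// Hypothetical stand-ins for a debug-only assertion macro such as ut_ad().
// A real build defines only one of these; both exist here for comparison.
#define ASSERT_DEBUG_BUILD(expr) \
    do { if (!(expr)) std::abort(); } while (0)
#define ASSERT_RELEASE_BUILD(expr) ((void) 0)

static int lookups = 0;

// Hypothetical lookup; counts how many times it is actually evaluated.
static bool checked_lookup() {
    ++lookups;
    return true;
}

// Debug flavour: the asserted expression is evaluated once.
int evaluations_in_debug() {
    lookups = 0;
    ASSERT_DEBUG_BUILD(checked_lookup());
    return lookups;
}

// Release flavour: the asserted expression is discarded by the preprocessor.
int evaluations_in_release() {
    lookups = 0;
    ASSERT_RELEASE_BUILD(checked_lookup());
    return lookups;
}
```

In this sketch evaluations_in_debug() returns 1 and evaluations_in_release() returns 0, which is why a faulty lookup that lives only inside such an assertion cannot fire in a non-debug build.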

The fix is simple:

diff --git a/storage/innobase/btr/btr0cur.cc b/storage/innobase/btr/btr0cur.cc
index a672f451ea9..14b24bcd9fd 100644
--- a/storage/innobase/btr/btr0cur.cc
+++ b/storage/innobase/btr/btr0cur.cc
@@ -4601,7 +4601,7 @@ btr_cur_parse_del_mark_set_clust_rec(
 		btr_rec_set_deleted_flag(rec, page_zip, val);
 		ut_ad(pos <= MAX_REF_PARTS);
 
-		ulint offsets[REC_OFFS_HEADER_SIZE + MAX_REF_PARTS + 2];
+		ulint offsets[REC_OFFS_HEADER_SIZE + MAX_REF_PARTS + 3];
 		rec_offs_init(offsets);
 		mem_heap_t*	heap	= NULL;
 
@@ -4609,7 +4609,7 @@ btr_cur_parse_del_mark_set_clust_rec(
 			row_upd_rec_sys_fields_in_recovery(
 				rec, page_zip,
 				rec_get_offsets(rec, index, offsets,
-						pos + 1, &heap),
+						pos + 2, &heap),
 				pos, trx_id, roll_ptr);
 		} else {
 			/* In delete-marked records, DB_TRX_ID must

Comment by Jan Lindström (Inactive) [ 2017-07-03 ]

It would be good if you could add a comment explaining why +3 is needed (e.g. rem/rem0rec.cc has ulint offsets_[REC_OFFS_HEADER_SIZE + MAX_REF_PARTS + 2]) and why pos + 2 is needed.

Comment by Marko Mäkelä [ 2017-07-03 ]

Sorry, I already pushed the patch. Initializing the first pos+2 elements of the offsets allows us to access the clustered index fields DB_TRX_ID (pos) and DB_ROLL_PTR (pos+1).
On second thought, the array was already large enough for MAX_REF_PARTS+2 index fields.
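
The pos+1 versus pos+2 difference can be sketched with a simplified model of an offsets array. This is only an illustration under stated assumptions, not the InnoDB implementation: the Offsets struct, get_offsets and can_locate_roll_ptr are hypothetical names, and the field lengths (4-byte INT columns, 6-byte DB_TRX_ID, 7-byte DB_ROLL_PTR) are chosen to mirror the test table from the description.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Simplified, hypothetical model of a record-offsets array: it only knows
// the end offsets of the first n_fields() fields it was asked to compute.
struct Offsets {
    std::vector<std::size_t> end;  // end[i] = end offset of field i

    std::size_t n_fields() const { return end.size(); }

    // Mirrors the spirit of the failing assertion: n < n_fields.
    std::size_t field_start(std::size_t n) const {
        if (n >= n_fields()) {
            throw std::out_of_range("n < n_fields violated");
        }
        return n == 0 ? 0 : end[n - 1];
    }
};

// Hypothetical helper: compute offsets for the first n_fields fields only.
Offsets get_offsets(const std::vector<std::size_t>& lengths,
                    std::size_t n_fields) {
    Offsets o;
    std::size_t pos = 0;
    for (std::size_t i = 0; i < n_fields && i < lengths.size(); ++i) {
        pos += lengths[i];
        o.end.push_back(pos);
    }
    return o;
}

// Returns true if the field following DB_TRX_ID can be located when only
// the first n_initialized fields have offsets.
bool can_locate_roll_ptr(std::size_t n_initialized) {
    // Assumed layout: pk INT, DB_TRX_ID, DB_ROLL_PTR, i INT
    const std::vector<std::size_t> lengths = {4, 6, 7, 4};
    const std::size_t trx_id_col = 1;  // "pos" in the patch
    const Offsets o = get_offsets(lengths, n_initialized);
    try {
        o.field_start(trx_id_col + 1);  // the lookup made by the assertion
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With trx_id_col = 1, as in the backtrace, can_locate_roll_ptr(1 + 1) is false because field 2 lies outside the two initialized fields, while can_locate_roll_ptr(1 + 2) is true.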
