[MDEV-18377] Assertion `!recv_sys->mlog_checkpoint_lsn || *space == TRX_SYS_SPACE || srv_is_undo_tablespace(*space) || it != recv_spaces.end()' failed in recv_parse_log_rec upon prepare from a full backup Created: 2019-01-25  Updated: 2019-02-05  Resolved: 2019-02-04

Status: Closed
Project: MariaDB Server
Component/s: mariabackup, Storage Engine - InnoDB
Affects Version/s: N/A
Fix Version/s: 10.4.3, 10.2.22, 10.3.13

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Thirunarayanan Balathandayuthapani
Resolution: Fixed Votes: 0
Labels: regression
Environment:

bb-10.2-MDEV-18194 de7408ec8adf78d7e5b6fd6a30feaa8b2f0dc350


Issue Links:
Blocks
blocks MDEV-12103 Reduce the time of looking for MLOG_C... Closed

 Description   

bb-10.2-MDEV-18194 de7408ec8adf78d7e5b6fd6a30feaa8b2f0dc350

mariabackup: /home/travis/src/storage/innobase/log/log0recv.cc:2412: ulint recv_parse_log_rec(mlog_id_t*, byte*, byte*, ulint*, ulint*, bool, byte**): Assertion `!recv_sys->mlog_checkpoint_lsn || *space == TRX_SYS_SPACE || srv_is_undo_tablespace(*space) || it != recv_spaces.end()' failed.
190125  6:03:42 [ERROR] mysqld got signal 6 ;
 
/lib/x86_64-linux-gnu/libc.so.6(+0x2dc82)[0x7fbc05815c82]
log/log0recv.cc:2412(recv_parse_log_rec(mlog_id_t*, unsigned char*, unsigned char*, unsigned long*, unsigned long*, bool, unsigned char**))[0x55e38a2dc8cb]
log/log0recv.cc:2696(recv_parse_log_recs(unsigned long, store_t, bool))[0x55e38a2dd515]
log/log0recv.cc:3100(recv_scan_log_recs(unsigned long, store_t*, unsigned char const*, unsigned long, unsigned long, unsigned long, unsigned long*, unsigned long*))[0x55e38a2de3db]
log/log0recv.cc:3195(recv_group_scan_log_recs(log_group_t*, unsigned long, unsigned long*, bool))[0x55e38a2de8db]
log/log0recv.cc:3519(recv_recovery_from_checkpoint_start(unsigned long))[0x55e38a2dfbd1]
srv/srv0start.cc:2192(innobase_start_or_create_for_mysql())[0x55e38a3ef14a]
mariabackup/xtrabackup.cc:1995(innodb_init())[0x55e389bb1827]
mariabackup/xtrabackup.cc:5595(xtrabackup_prepare_func(char**))[0x55e389bbbc4b]
mariabackup/xtrabackup.cc:6434(main_low(char**))[0x55e389bbdf45]
mariabackup/xtrabackup.cc:6227(main)[0x55e389bbd757]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fbc05808830]
/home/travis/server/bin/mariabackup(_start+0x29)[0x55e389bad069]

Backup data dirs and logs can be found here:
ftp://ftp.askmonty.org/public/mdev18377-dataset1.tar.gz
ftp://ftp.askmonty.org/public/mdev18377-dataset2.tar.gz

The first dataset is from the simpler full backup scenario, but it uses encryption, which is a bit of a complication. Check and update encryption options (paths) as necessary.

The second dataset is from the incremental backup scenario, even though it fails on the first prepare, performed on the full backup, before applying any incremental backups. It doesn't use encryption.


Notes on the origin of the report

Please note it's not necessarily reproducible the same way.

https://travis-ci.org/elenst/travis-tests/jobs/483980185
2574 16 3
Server: bb-10.2-MDEV-18194 de7408ec8adf78d7e5b6fd6a30feaa8b2f0dc350
Tests: elenst-dev 56fe47273ca5ef1a2672c5a105dd73bbb5d66766
Toolbox: 572674defcfef484b9188dde092511efe2db3800

https://travis-ci.org/elenst/travis-tests/jobs/483980187
2574 18 4
Server: bb-10.2-MDEV-18194 de7408ec8adf78d7e5b6fd6a30feaa8b2f0dc350
Tests: elenst-dev 56fe47273ca5ef1a2672c5a105dd73bbb5d66766
Toolbox: 572674defcfef484b9188dde092511efe2db3800



 Comments   
Comment by Thirunarayanan Balathandayuthapani [ 2019-01-25 ]

Basically assert was added to ensure that MLOG_FILE_NAME records was processed for the tablespace before parsing FSP_SIZE redo log records.
If there are multiple MLOG_CHECKPOINT records in the range between the latest checkpoint and the MLOG_CHECKPOINT of that latest checkpoint.
First scan only scans the last checkpoint MLOG_FILE_NAME records. It leaves out remaining MLOG_FILE_NAME records. It caused by adding
the end_lsn in checkpoint header. ( MDEV-12103). Basically we need to revert MDEV-12103 to ensure
all MLOG_FILE_NAME records are processed.

Even normal recovery could affect by the assert if we have multiple checkpoint happened between latest checkpoint and MLOG_CHECKPOINT of that
latest checkpoint.

Comment by Thirunarayanan Balathandayuthapani [ 2019-01-28 ]

Somehow i felt the assert is too strict. So wrote a test case to confirm it.

--source include/have_innodb.inc
--source include/have_debug.inc
 
CREATE TABLE t1 (f1 int not null)engine=innodb;
CREATE TABLE t2 (f1 int not null)engine=innodb;
insert into t2 values(1), (2), (3);
insert into t2 select * from t2;
insert into t2 select * from t2;
insert into t2 select * from t2;
select count(*) from t2;
 
insert into t1 values(1), (2), (3);
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
select count(*) from t1;
insert into t1 select * from t2;
insert into t1 values(1), (2), (3), (4), (5), (6), (7), (8), (9);
set global innodb_log_checkpoint_now = true;
--source ../include/no_checkpoint_start.inc
--echo # Below insert triggers the increase of the fsp size
insert into t1 values(1);
drop table t1;
--connection default
--let CLEANUP_IF_CHECKPOINT=DROP TABLE t1;
--source ../include/no_checkpoint_end.inc
--source include/start_mysqld.inc
--error 1146
select count(*) from t1;
drop table t2;

For the above test case, here is ib_log output:

recv_group_scan_log_recs: ib_log: scan 1708026: log rec MLOG_UNDO_INSERT len 16 page 0:334
recv_group_scan_log_recs: ib_log: Before setting recv size 1707260 space id 4
recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 6 page 4:0
recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_1BYTE len 7 page 4:0
recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 6 page 4:0
recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_INIT_FILE_PAGE2 len 3 page 4:6
recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 6 page 4:2
recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_COMP_PAGE_CREATE len 3 page 4:6
recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_2BYTES len 6 page 4:6
recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_8BYTES len 10 page 4:6
recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_COMP_REC_INSERT len 24 page 4:3
recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 6 page 4:5
recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 6 page 4:6
recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 10 page 4:6
recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_COMP_REC_INSERT len 44 page 4:6
recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_FILE_NAME len 19 page 4:0
recv_group_scan_log_recs: ib_log: scan 1708058: multi-log end total_len 157 n=15
.............
..............
..............
recv_group_scan_log_recs: ib_log: scan 1709495: multi-log end total_len 17 n=3
recv_group_scan_log_recs: ib_log: scan 1709512: log rec MLOG_UNDO_INSERT len 36 page 0:335
recv_group_scan_log_recs: ib_log: scan 1709548: log rec MLOG_REC_CLUST_DELETE_MARK len 21 page 0:307
recv_group_scan_log_recs: ib_log: scan 1709585: log rec MLOG_UNDO_INSERT len 36 page 0:335
recv_group_scan_log_recs: ib_log: scan 1709621: log rec MLOG_REC_CLUST_DELETE_MARK len 21 page 0:308
recv_group_scan_log_recs: ib_log: scan 1709642: log rec MLOG_FILE_DELETE len 19 page 4:0
.......
........
recv_group_scan_log_recs: ib_log: scan 1709753 completed
innobase_init: ib_log: checkpoint 6 at 1707251 written
innobase_init: ib_log: MLOG_CHECKPOINT(1707251) written at 1709762

For this test case, assert was failing. Because we need to set FSP_SIZE only during apply of multiple mini-transaction not during parsing of
multiple mtr records.

The below patch solves the issue:

diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
index 01801dc..9108760 100644
--- a/storage/innobase/log/log0recv.cc
+++ b/storage/innobase/log/log0recv.cc
@@ -2403,7 +2403,8 @@ recv_parse_log_rec(
       }
 
       if (*page_no == 0 && *type == MLOG_4BYTES
-           && mach_read_from_2(old_ptr) == FSP_HEADER_OFFSET + FSP_SIZE) {
+           && mach_read_from_2(old_ptr) == FSP_HEADER_OFFSET + FSP_SIZE
+           && apply) {
               old_ptr += 2;
 
               ulint size = mach_parse_compressed(&old_ptr, end_ptr);

marko approved the patch. So pushed the patch along with mdev-18194 fix to test it.

Comment by Thirunarayanan Balathandayuthapani [ 2019-02-04 ]

Pushed as a part of mdev-18194

Generated at Thu Feb 08 08:43:38 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.