Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-18377

Assertion `!recv_sys->mlog_checkpoint_lsn || *space == TRX_SYS_SPACE || srv_is_undo_tablespace(*space) || it != recv_spaces.end()' failed in recv_parse_log_rec upon prepare from a full backup

Details

    Description

      bb-10.2-MDEV-18194 de7408ec8adf78d7e5b6fd6a30feaa8b2f0dc350

      mariabackup: /home/travis/src/storage/innobase/log/log0recv.cc:2412: ulint recv_parse_log_rec(mlog_id_t*, byte*, byte*, ulint*, ulint*, bool, byte**): Assertion `!recv_sys->mlog_checkpoint_lsn || *space == TRX_SYS_SPACE || srv_is_undo_tablespace(*space) || it != recv_spaces.end()' failed.
      190125  6:03:42 [ERROR] mysqld got signal 6 ;
       
      /lib/x86_64-linux-gnu/libc.so.6(+0x2dc82)[0x7fbc05815c82]
      log/log0recv.cc:2412(recv_parse_log_rec(mlog_id_t*, unsigned char*, unsigned char*, unsigned long*, unsigned long*, bool, unsigned char**))[0x55e38a2dc8cb]
      log/log0recv.cc:2696(recv_parse_log_recs(unsigned long, store_t, bool))[0x55e38a2dd515]
      log/log0recv.cc:3100(recv_scan_log_recs(unsigned long, store_t*, unsigned char const*, unsigned long, unsigned long, unsigned long, unsigned long*, unsigned long*))[0x55e38a2de3db]
      log/log0recv.cc:3195(recv_group_scan_log_recs(log_group_t*, unsigned long, unsigned long*, bool))[0x55e38a2de8db]
      log/log0recv.cc:3519(recv_recovery_from_checkpoint_start(unsigned long))[0x55e38a2dfbd1]
      srv/srv0start.cc:2192(innobase_start_or_create_for_mysql())[0x55e38a3ef14a]
      mariabackup/xtrabackup.cc:1995(innodb_init())[0x55e389bb1827]
      mariabackup/xtrabackup.cc:5595(xtrabackup_prepare_func(char**))[0x55e389bbbc4b]
      mariabackup/xtrabackup.cc:6434(main_low(char**))[0x55e389bbdf45]
      mariabackup/xtrabackup.cc:6227(main)[0x55e389bbd757]
      /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fbc05808830]
      /home/travis/server/bin/mariabackup(_start+0x29)[0x55e389bad069]
      

      Backup data dirs and logs can be found here:
      ftp://ftp.askmonty.org/public/mdev18377-dataset1.tar.gz
      ftp://ftp.askmonty.org/public/mdev18377-dataset2.tar.gz

      The first dataset is from the simpler full backup scenario, but it uses encryption, which is a bit of a complication. Check and update encryption options (paths) as necessary.

      The second dataset is from the incremental backup scenario, even though it fails on the first prepare, performed on the full backup, before applying any incremental backups. It doesn't use encryption.


      Notes on the origin of the report

      Please note it's not necessarily reproducible the same way.

      https://travis-ci.org/elenst/travis-tests/jobs/483980185
      2574 16 3
      Server: bb-10.2-MDEV-18194 de7408ec8adf78d7e5b6fd6a30feaa8b2f0dc350
      Tests: elenst-dev 56fe47273ca5ef1a2672c5a105dd73bbb5d66766
      Toolbox: 572674defcfef484b9188dde092511efe2db3800

      https://travis-ci.org/elenst/travis-tests/jobs/483980187
      2574 18 4
      Server: bb-10.2-MDEV-18194 de7408ec8adf78d7e5b6fd6a30feaa8b2f0dc350
      Tests: elenst-dev 56fe47273ca5ef1a2672c5a105dd73bbb5d66766
      Toolbox: 572674defcfef484b9188dde092511efe2db3800

      Attachments

        Issue Links

          Activity

            Basically assert was added to ensure that MLOG_FILE_NAME records was processed for the tablespace before parsing FSP_SIZE redo log records.
            If there are multiple MLOG_CHECKPOINT records in the range between the latest checkpoint and the MLOG_CHECKPOINT of that latest checkpoint.
            First scan only scans the last checkpoint MLOG_FILE_NAME records. It leaves out remaining MLOG_FILE_NAME records. It caused by adding
            the end_lsn in checkpoint header. ( MDEV-12103). Basically we need to revert MDEV-12103 to ensure
            all MLOG_FILE_NAME records are processed.

            Even normal recovery could affect by the assert if we have multiple checkpoint happened between latest checkpoint and MLOG_CHECKPOINT of that
            latest checkpoint.

            thiru Thirunarayanan Balathandayuthapani added a comment - Basically assert was added to ensure that MLOG_FILE_NAME records was processed for the tablespace before parsing FSP_SIZE redo log records. If there are multiple MLOG_CHECKPOINT records in the range between the latest checkpoint and the MLOG_CHECKPOINT of that latest checkpoint. First scan only scans the last checkpoint MLOG_FILE_NAME records. It leaves out remaining MLOG_FILE_NAME records. It caused by adding the end_lsn in checkpoint header. ( MDEV-12103 ). Basically we need to revert MDEV-12103 to ensure all MLOG_FILE_NAME records are processed. Even normal recovery could affect by the assert if we have multiple checkpoint happened between latest checkpoint and MLOG_CHECKPOINT of that latest checkpoint.

            Somehow i felt the assert is too strict. So wrote a test case to confirm it.

            --source include/have_innodb.inc
            --source include/have_debug.inc
             
            CREATE TABLE t1 (f1 int not null)engine=innodb;
            CREATE TABLE t2 (f1 int not null)engine=innodb;
            insert into t2 values(1), (2), (3);
            insert into t2 select * from t2;
            insert into t2 select * from t2;
            insert into t2 select * from t2;
            select count(*) from t2;
             
            insert into t1 values(1), (2), (3);
            insert into t1 select * from t1;
            insert into t1 select * from t1;
            insert into t1 select * from t1;
            insert into t1 select * from t1;
            insert into t1 select * from t1;
            insert into t1 select * from t1;
            insert into t1 select * from t1;
            insert into t1 select * from t1;
            select count(*) from t1;
            insert into t1 select * from t2;
            insert into t1 values(1), (2), (3), (4), (5), (6), (7), (8), (9);
            set global innodb_log_checkpoint_now = true;
            --source ../include/no_checkpoint_start.inc
            --echo # Below insert triggers the increase of the fsp size
            insert into t1 values(1);
            drop table t1;
            --connection default
            --let CLEANUP_IF_CHECKPOINT=DROP TABLE t1;
            --source ../include/no_checkpoint_end.inc
            --source include/start_mysqld.inc
            --error 1146
            select count(*) from t1;
            drop table t2;
            

            For the above test case, here is ib_log output:

            recv_group_scan_log_recs: ib_log: scan 1708026: log rec MLOG_UNDO_INSERT len 16 page 0:334
            recv_group_scan_log_recs: ib_log: Before setting recv size 1707260 space id 4
            recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 6 page 4:0
            recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_1BYTE len 7 page 4:0
            recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 6 page 4:0
            recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_INIT_FILE_PAGE2 len 3 page 4:6
            recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 6 page 4:2
            recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_COMP_PAGE_CREATE len 3 page 4:6
            recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_2BYTES len 6 page 4:6
            recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_8BYTES len 10 page 4:6
            recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_COMP_REC_INSERT len 24 page 4:3
            recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 6 page 4:5
            recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 6 page 4:6
            recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 10 page 4:6
            recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_COMP_REC_INSERT len 44 page 4:6
            recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_FILE_NAME len 19 page 4:0
            recv_group_scan_log_recs: ib_log: scan 1708058: multi-log end total_len 157 n=15
            .............
            ..............
            ..............
            recv_group_scan_log_recs: ib_log: scan 1709495: multi-log end total_len 17 n=3
            recv_group_scan_log_recs: ib_log: scan 1709512: log rec MLOG_UNDO_INSERT len 36 page 0:335
            recv_group_scan_log_recs: ib_log: scan 1709548: log rec MLOG_REC_CLUST_DELETE_MARK len 21 page 0:307
            recv_group_scan_log_recs: ib_log: scan 1709585: log rec MLOG_UNDO_INSERT len 36 page 0:335
            recv_group_scan_log_recs: ib_log: scan 1709621: log rec MLOG_REC_CLUST_DELETE_MARK len 21 page 0:308
            recv_group_scan_log_recs: ib_log: scan 1709642: log rec MLOG_FILE_DELETE len 19 page 4:0
            .......
            ........
            recv_group_scan_log_recs: ib_log: scan 1709753 completed
            innobase_init: ib_log: checkpoint 6 at 1707251 written
            innobase_init: ib_log: MLOG_CHECKPOINT(1707251) written at 1709762
            

            For this test case, assert was failing. Because we need to set FSP_SIZE only during apply of multiple mini-transaction not during parsing of
            multiple mtr records.

            The below patch solves the issue:

            diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
            index 01801dc..9108760 100644
            --- a/storage/innobase/log/log0recv.cc
            +++ b/storage/innobase/log/log0recv.cc
            @@ -2403,7 +2403,8 @@ recv_parse_log_rec(
                   }
             
                   if (*page_no == 0 && *type == MLOG_4BYTES
            -           && mach_read_from_2(old_ptr) == FSP_HEADER_OFFSET + FSP_SIZE) {
            +           && mach_read_from_2(old_ptr) == FSP_HEADER_OFFSET + FSP_SIZE
            +           && apply) {
                           old_ptr += 2;
             
                           ulint size = mach_parse_compressed(&old_ptr, end_ptr);
            

            marko approved the patch. So pushed the patch along with mdev-18194 fix to test it.

            thiru Thirunarayanan Balathandayuthapani added a comment - Somehow i felt the assert is too strict. So wrote a test case to confirm it. --source include/have_innodb.inc --source include/have_debug.inc   CREATE TABLE t1 (f1 int not null)engine=innodb; CREATE TABLE t2 (f1 int not null)engine=innodb; insert into t2 values(1), (2), (3); insert into t2 select * from t2; insert into t2 select * from t2; insert into t2 select * from t2; select count(*) from t2;   insert into t1 values(1), (2), (3); insert into t1 select * from t1; insert into t1 select * from t1; insert into t1 select * from t1; insert into t1 select * from t1; insert into t1 select * from t1; insert into t1 select * from t1; insert into t1 select * from t1; insert into t1 select * from t1; select count(*) from t1; insert into t1 select * from t2; insert into t1 values(1), (2), (3), (4), (5), (6), (7), (8), (9); set global innodb_log_checkpoint_now = true; --source ../include/no_checkpoint_start.inc --echo # Below insert triggers the increase of the fsp size insert into t1 values(1); drop table t1; --connection default --let CLEANUP_IF_CHECKPOINT=DROP TABLE t1; --source ../include/no_checkpoint_end.inc --source include/start_mysqld.inc --error 1146 select count(*) from t1; drop table t2; For the above test case, here is ib_log output: recv_group_scan_log_recs: ib_log: scan 1708026: log rec MLOG_UNDO_INSERT len 16 page 0:334 recv_group_scan_log_recs: ib_log: Before setting recv size 1707260 space id 4 recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 6 page 4:0 recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_1BYTE len 7 page 4:0 recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 6 page 4:0 recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_INIT_FILE_PAGE2 len 3 page 4:6 recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 6 page 4:2 recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_COMP_PAGE_CREATE len 3 page 4:6 recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_2BYTES len 6 page 4:6 recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_8BYTES len 10 page 4:6 recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_COMP_REC_INSERT len 24 page 4:3 recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 6 page 4:5 recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 6 page 4:6 recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_4BYTES len 10 page 4:6 recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_COMP_REC_INSERT len 44 page 4:6 recv_group_scan_log_recs: ib_log: scan 1708058: multi-log rec MLOG_FILE_NAME len 19 page 4:0 recv_group_scan_log_recs: ib_log: scan 1708058: multi-log end total_len 157 n=15 ............. .............. .............. recv_group_scan_log_recs: ib_log: scan 1709495: multi-log end total_len 17 n=3 recv_group_scan_log_recs: ib_log: scan 1709512: log rec MLOG_UNDO_INSERT len 36 page 0:335 recv_group_scan_log_recs: ib_log: scan 1709548: log rec MLOG_REC_CLUST_DELETE_MARK len 21 page 0:307 recv_group_scan_log_recs: ib_log: scan 1709585: log rec MLOG_UNDO_INSERT len 36 page 0:335 recv_group_scan_log_recs: ib_log: scan 1709621: log rec MLOG_REC_CLUST_DELETE_MARK len 21 page 0:308 recv_group_scan_log_recs: ib_log: scan 1709642: log rec MLOG_FILE_DELETE len 19 page 4:0 ....... ........ recv_group_scan_log_recs: ib_log: scan 1709753 completed innobase_init: ib_log: checkpoint 6 at 1707251 written innobase_init: ib_log: MLOG_CHECKPOINT(1707251) written at 1709762 For this test case, assert was failing. Because we need to set FSP_SIZE only during apply of multiple mini-transaction not during parsing of multiple mtr records. The below patch solves the issue: diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc index 01801dc..9108760 100644 --- a/storage/innobase/log/log0recv.cc +++ b/storage/innobase/log/log0recv.cc @@ -2403,7 +2403,8 @@ recv_parse_log_rec( }   if (*page_no == 0 && *type == MLOG_4BYTES - && mach_read_from_2(old_ptr) == FSP_HEADER_OFFSET + FSP_SIZE) { + && mach_read_from_2(old_ptr) == FSP_HEADER_OFFSET + FSP_SIZE + && apply) { old_ptr += 2;   ulint size = mach_parse_compressed(&old_ptr, end_ptr); marko approved the patch. So pushed the patch along with mdev-18194 fix to test it.

            Pushed as a part of mdev-18194

            thiru Thirunarayanan Balathandayuthapani added a comment - Pushed as a part of mdev-18194

            People

              thiru Thirunarayanan Balathandayuthapani
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.