MariaDB Server
MDEV-22169

Recovery fails after failing to insert into mlog_init

Details

    Description

      Workflow of the RQG test
       
1. Create a DB server and start it.
2. Three sessions start to run some SQL stream against this server.
     RQG code: INSERT IGNORE INTO _table ( _field_int ) VALUES ( _smallint_unsigned ) ;
     _table --> one of the tables created in SCHEMA test at the beginning of the RQG run
     _field_int --> a randomly chosen ("diced") column of type int
3. At some point in time, unrelated to the state of the sessions from step 2 (waiting for a result, sending a statement, ...), the server process gets killed (SIGKILL, not SIGTERM; see the sketch after the log excerpt below).
4. Make a copy of the data, logs, etc. of the killed server.
5. Try to restart the killed server.
       
Step 5 fails with:
      ...
      2020-04-06 12:44:07 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=29339963
      2020-04-06 12:44:22 0 [Note] InnoDB: Read redo log up to LSN=24178688
      2020-04-06 12:44:25 0 [Note] InnoDB: Starting a batch to recover 1429 pages from redo log.
      2020-04-06 12:44:37 0 [Note] InnoDB: To recover: 500 pages from log
      2020-04-06 12:44:38 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=7, page number=110]
      2020-04-06 12:44:38 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
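
For illustration, the hard kill in step 3 boils down to the following minimal sketch (the pid is hypothetical; in the real run the test harness determines the server pid itself):

  /* Minimal sketch of step 3: SIGKILL gives mysqld no chance to flush
     buffers or shut down cleanly, so the next start must run crash
     recovery. The pid here is hypothetical. */
  #include <signal.h>
  #include <sys/types.h>

  int main()
  {
    pid_t mysqld_pid= 12345;          /* hypothetical server pid */
    return kill(mysqld_pid, SIGKILL); /* same effect as: kill -9 12345 */
  }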
       
      RQG
      git clone https://github.com/mleich1/rqg --branch experimental RQG_mleich1
      origin/experimental d417497b476e263428422b4640cf63b8e9d16afe 2020-03-30T17:17:54+02:00
       
MariaDB
origin/10.5 778a174e5e9cad846c47933fcf4727532487cab2 2020-04-06T10:47:11+03:00
built with debug
       
      perl rqg.pl \                    
      --duration=300 \
      --queries=10000000 \
      --threads=3 \
      --no_mask \
      --seed=random \
      --gendata=conf/mariadb/oltp.zz \ 
      --engine=InnoDB \
      --rpl_mode=none \
      --mysqld=--innodb-lock-wait-timeout=50 \
      --mysqld=--log-output=none \
      --mysqld=--loose_innodb_lock_schedule_algorithm=fcfs \
      --mysqld=--slave_net_timeout=60 \
      --mysqld=--loose-idle_write_transaction_timeout=0 \
      --mysqld=--innodb_stats_persistent=off \
      --mysqld=--interactive_timeout=28800 \
      --mysqld=--loose-idle_readonly_transaction_timeout=0 \
      --mysqld=--net_write_timeout=60 \
      --mysqld=--lock-wait-timeout=86400 \
      --mysqld=--log_bin_trust_function_creators=1 \
      --mysqld=--loose-idle_transaction_timeout=0 \
      --mysqld=--connect_timeout=60 \
      --mysqld=--loose_innodb_use_native_aio=1 \
      --mysqld=--log-bin \
      --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
      --mysqld=--wait_timeout=28800 \
      --mysqld=--net_read_timeout=30 \ 
      --mysqld=--innodb_page_size=8K \ 
      --mysqld=--innodb-buffer-pool-size=32M \
      --mysqld=--loose-table_lock_wait_timeout=50 \
      --reporters=CrashRecovery1,Deadlock1,ErrorLog,None,ServerDead \
      --validators=None \
      --grammar=rqg.yy \
      --workdir=<local settings> \
      --vardir=<local settings> \
      --mtr-build-thread=<local settings> \
      --basedir1=<local settings> \
      --script_debug=_nix_
       
The uploaded RQG protocol "rqg.log" shows exactly that RQG run.
      
      

      Attachments

        1. rqg.log
          82 kB
          Matthias Leich
        2. rqg.yy
          0.1 kB
          Matthias Leich


          Activity

mleich Matthias Leich added a comment:

Remnants, with rr trace, of the RQG run described above, on "sdp":
/home/mleich/RQG_RR/storage/1586176150/000361.tgz


marko Marko Mäkelä added a comment:

I painstakingly debugged the rr replay of the server that was killed before the recovery, and everything is fine there, as far as I can tell.

The problem occurs at crash recovery. My computer recovered the saved dataset just fine (only complaining about an internal XA transaction that was recovered in the XA PREPARE state), even after I reduced innodb_buffer_pool_size to 32M.

We have an rr replay of the failed recovery:

            cd /home/mleich/RQG_RR/storage/1586176150/tmp
_RR_TRACE_DIR=/dev/shm/vardir/1586176150/141/rr_trace rr replay
            

We switch to STORE_NO mode, and then we simply throw away the FREE_PAGE record that we successfully parsed for the problematic page:

# breakpoint 1: where the switch to STORE_NO mode happens (see "when 1586" below)
break log0recv.cc:2710
# breakpoint 2: where the parsed record for a page is handled
break log0recv.cc:1946
# restrict breakpoint 2 to [page id: space=7, page number=110];
# m_id packs the space id into the upper and the page number into the
# lower 32 bits: 0x7_0000006e
cond 2 id.m_id==0x70000006e
break sql_print_error
continue
# "when" prints the current rr event number at each stop
when
continue
when

There will be no attempt to re-read the redo log record. You should see that at rr event 1586 (the output of the "when" command) we switch to STORE_NO mode, and at event 1593 we throw away the FREE_PAGE record.

            The call to mlog_init.add() in recv_sys_t::add() will be skipped, because store==STORE_NO.
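
Roughly, the control flow described above looks like this (a simplified sketch; apart from STORE_NO and the reference to mlog_init.add(), all names are stand-ins, not the actual InnoDB signatures):

  /* Simplified sketch of the flow described above. */
  enum store_t { STORE_YES, STORE_IF_EXISTS, STORE_NO };

  struct page_id_t { unsigned space, page_no; };

  void buffer_record(const page_id_t &id);  /* stand-in: queue record for apply */
  void mlog_init_add(const page_id_t &id);  /* stand-in for mlog_init.add() */

  void on_parsed_record(store_t store, const page_id_t &id, bool init_page)
  {
    if (store == STORE_NO)
      return;            /* record thrown away: mlog_init never learns that
                            the page will be fully (re)initialized */
    buffer_record(id);
    if (init_page)
      mlog_init_add(id); /* the call that is skipped in the failing run */
  }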

            As far as I can tell, there could be a similar problem with page initialization records already in 10.2. Before MDEV-15528, the impact should be less severe, only causing unnecessary reads of pages that would be completely initialized by redo log records, which MDEV-12699 intended to avoid (and MDEV-20688, MDEV-21572 fixed in some special cases).
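
For comparison, the read-avoidance that MDEV-12699 introduced, and that depends on those mlog_init entries, can be sketched as follows (again with made-up names):

  struct page_id_t { unsigned space, page_no; };

  bool mlog_init_lookup(const page_id_t &id); /* made-up name for the lookup */

  bool must_read_page_from_file(const page_id_t &id)
  {
    /* With an mlog_init entry, recovery can start from a zero-filled page
       and apply redo; without one (e.g. because the add() was skipped
       under STORE_NO), it must read the old page image from the data
       file. In 10.5 that stale image makes applying INSERT_HEAP_DYNAMIC
       fail as "corruption"; before MDEV-15528 it should only have cost
       an unnecessary page read. */
    return !mlog_init_lookup(id);
  }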


            People

              marko Marko Mäkelä
              mleich Matthias Leich
Votes: 0
Watchers: 2

