Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-29999

innodb_undo_log_truncate=ON is not crash safe

    XMLWordPrintable

Details

    Description

      Undo log truncation is still buggy, even after MDEV-26450 was fixed.

      while ./mtr --rr=-h --parallel=auto --mysqld=--innodb-purge-threads=1 innodb.undo_truncate_recover,4k,strict_full_crc32{,,,,,,,,,,}; do :; done
      

      10.6 67f4ba240e5fa5f95ab4f1688bdb18f61e7d4d7f

      innodb.undo_truncate_recover '2,4k,innodb,strict_full_crc32' w4 [ fail ]  Found warnings/errors in server log file!
              Test ended at 2022-11-11 09:51:35
      line
      2022-11-11  9:51:34 0 [ERROR] InnoDB: OPT_PAGE_CHECKSUM mismatch on [page id: space=2, page number=74]
      2022-11-11  9:51:34 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
      2022-11-11  9:51:34 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=2, page number=74]; set innodb_force_recovery to ignore
      2022-11-11  9:51:34 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1505] with error Data structure corruption
      2022-11-11  9:51:34 0 [ERROR] InnoDB: Failed to read page 74 from file './/undo002': Page read from tablespace is corrupted.
      2022-11-11  9:51:34 0 [ERROR] Plugin 'InnoDB' init function returned error.
      2022-11-11  9:51:34 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
      

      Crash recovery will read an older version of the page. The checkpoint LSN was equal to the commit LSN of mtr.commit_shrink(space) in trx_purge_truncate_history(). So, the error is only on the recovery side.

      Recovery invoked recv_sys_t::trim() on page 640, but in fact all pages in the undo log tablespace will be reinitialized, and hence all redo log records for the tablespace before the truncation LSN need to be discarded.

      This failure has also occurred on our CI systems a couple of times:

      10.10 d371e35257c45895318c4efbed20e5bfdcc7cee9

      innodb.undo_truncate_recover '2,4k,innodb,strict_full_crc32' w4 [ fail ]  Found warnings/errors in server log file!
              Test ended at 2022-06-18 12:14:25
      line
      2022-06-18 12:14:25 0 [ERROR] InnoDB: OPT_PAGE_CHECKSUM mismatch on [page id: space=2, page number=74]
      2022-06-18 12:14:25 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
      2022-06-18 12:14:25 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1243] with error Data structure corruption
      2022-06-18 12:14:25 0 [ERROR] InnoDB: Failed to read page 74 from file './/undo002': Page read from tablespace is corrupted.
      2022-06-18 12:14:25 0 [ERROR] Plugin 'InnoDB' init function returned error.
      2022-06-18 12:14:25 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
      ^ Found warnings in /buildbot/amd64-ubuntu-2004-msan/build/mysql-test/var/4/log/mysqld.1.err
      ok
       
       - saving '/buildbot/amd64-ubuntu-2004-msan/build/mysql-test/var/4/log/innodb.undo_truncate_recover-2,4k,innodb,strict_full_crc32/' to '/buildbot/amd64-ubuntu-2004-msan/build/mysql-test/var/log/innodb.undo_truncate_recover-2,4k,innodb,strict_full_crc32/'
      

      We can thank MDEV-18976 and https://rr-project.org for finding this bug. Before it, the failure looked like this:

      10.6 660cfe4782380bbb10ab844014b82f4d5c2fee50

      innodb.undo_truncate_recover '2,4k,innodb,strict_crc32' w2 [ fail ]
              Test ended at 2021-12-12 18:54:02
       
      CURRENT_TEST: innodb.undo_truncate_recover
       
       
      Server [mysqld.1 - pid: 191888, winpid: 191888, exit: 256] failed during test run
      Server log from this test:
      ----------SERVER LOG START-----------
      $ /buildbot/amd64-ubuntu-1804-msan/build/sql/mariadbd --defaults-group-suffix=.1 --defaults-file=/buildbot/amd64-ubuntu-1804-msan/build/mysql-test/var/2/my.cnf --log-output=file --innodb --innodb-cmpmem --innodb-cmp-per-index --innodb-trx --innodb-locks --innodb-lock-waits --innodb-metrics --innodb-buffer-pool-stats --innodb-buffer-page --innodb-buffer-page-lru --innodb-sys-columns --innodb-sys-fields --innodb-sys-foreign --innodb-sys-foreign-cols --innodb-sys-indexes --innodb-sys-tables --innodb-sys-virtual --innodb-page-size=4K --skip-innodb-read-only-compressed --innodb-checksum-algorithm=strict_crc32 --innodb-undo-tablespaces=2 --innodb-purge-threads=1 --core-file --loose-debug-sync-timeout=300
      2021-12-12 18:53:30 0 [Note] /buildbot/amd64-ubuntu-1804-msan/build/sql/mariadbd (server 10.6.6-MariaDB-debug-log) starting as process 190177 ...
      2021-12-12 18:54:01 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=20546035,20621018
      2021-12-12 18:54:01 0 [Note] InnoDB: Opened 2 undo tablespaces
      2021-12-12 18:54:01 0 [Warning] InnoDB: innodb_undo_tablespaces=0 disables dedicated undo log tablespaces
      2021-12-12 18:54:01 0 [Note] InnoDB: Starting final batch to recover 1816 pages from redo log.
      mariadbd: /buildbot/amd64-ubuntu-1804-msan/build/storage/innobase/log/log0recv.cc:275: log_phys_t::apply_status log_phys_t::apply(const buf_block_t &, uint16_t &) const: Assertion `mach_read_from_4(frame + 4U) == block.page.id().page_no()' failed.
      

      Attachments

        Issue Links

          Activity

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              1 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.