MariaDB Server / MDEV-31362

recv_sys_t::apply(bool): Assertion `!last_batch || recovered_lsn == scanned_lsn' failed

Details

    Description

      mleich provided a copy of a data directory, together with rr replay traces, for which crash recovery fails with an assertion failure.

      Unfortunately, the data directory uses encryption and includes a 96 MiB ib_logfile0 that wrapped around once. Because encrypted data does not compress well, a compressed copy of the data directory would be too large to attach here.

      The rr replay trace from before the crash is of limited use, because rr replay reports a replay divergence near the end of the trace. With the 10.6 revision right before the fix of MDEV-29911, recovery appears to succeed:

      10.6 1fe830b56a2bd9b12b643d7b39417255215ae5da

      2023-05-29 16:45:43 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=142804091,143054310
      2023-05-29 16:45:43 0 [Note] InnoDB: Starting final batch to recover 174 pages from redo log.
      2023-05-29 16:45:43 0 [Note] InnoDB: Last binlog file './mysql-bin.000001', position 2691230
      2023-05-29 16:45:43 0 [Note] InnoDB: Removing encryption and resizing redo log from 100663296 to 4294967296 bytes; LSN=151202977
      

      If I attempt recovery with the fix of MDEV-29911 applied, I hit a debug assertion instead:

      10.6 f2c17cc9d9bcd634887846d3064bcb71243f9cc0

      2023-05-29 16:47:48 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=142804091,143054310
      mariadbd: /mariadb/10.6/storage/innobase/log/log0recv.cc:3629: void recv_sys_t::apply(bool): Assertion `!last_batch || recovered_lsn == scanned_lsn' failed.
      

      At the time of the assertion failure, we have recv_sys.recovered_lsn==151202728 and recv_sys.scanned_lsn==151209984. The largest observed *contiguous_lsn in recv_scan_log_recs() is 150516736.

      In the non-crashing run, recovery proceeded only up to LSN 151202977. This suggests that the "successful" run without the fix of MDEV-29911 may be incorrect.
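
      The LSN values quoted above can be compared with a few lines of arithmetic. The following is only an illustrative sketch, not MariaDB code: the function name final_batch_invariant_holds is hypothetical and merely restates the asserted condition, the LSNs are copied from this report, and the 512-byte block size is the pre-10.8 log block size mentioned in this report.

```python
# Illustrative arithmetic only; not part of MariaDB.
# All LSN values are copied from the report above.
LOG_BLOCK_SIZE = 512  # pre-10.8 redo log block size, as stated in the report

recovered_lsn = 151202728           # recv_sys.recovered_lsn at the assertion
scanned_lsn = 151209984             # recv_sys.scanned_lsn at the assertion
largest_contiguous_lsn = 150516736  # largest *contiguous_lsn observed
lsn_without_fix = 151202977         # LSN reached by the non-crashing run


def final_batch_invariant_holds(last_batch, recovered, scanned):
    """Hypothetical restatement of: !last_batch || recovered_lsn == scanned_lsn."""
    return (not last_batch) or recovered == scanned


# Bytes that were scanned but not parsed when the final batch was applied.
unparsed_bytes = scanned_lsn - recovered_lsn

# How far the non-crashing run got relative to the two values above.
beyond_recovered = lsn_without_fix - recovered_lsn
short_of_scanned = scanned_lsn - lsn_without_fix

print("invariant holds in final batch:",
      final_batch_invariant_holds(True, recovered_lsn, scanned_lsn))
print("scanned - recovered:", unparsed_bytes)
print("non-crashing run beyond recovered_lsn:", beyond_recovered)
print("non-crashing run short of scanned_lsn:", short_of_scanned)
print("scanned_lsn offset within block:", scanned_lsn % LOG_BLOCK_SIZE)
print("contiguous_lsn offset within block:",
      largest_contiguous_lsn % LOG_BLOCK_SIZE)
print("recovered_lsn offset within block:", recovered_lsn % LOG_BLOCK_SIZE)
```

      In this sketch, scanned_lsn and the largest contiguous_lsn both fall on a 512-byte boundary, while recovered_lsn and the LSN reached by the non-crashing run do not. The sketch only restates the numbers in this report; it does not explain why the parser stopped early.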

      MariaDB Server 10.8 and later versions could be unaffected by this exact bug, because the separate log block and log record parsers were unified when MDEV-14425 replaced the 512-byte log blocks with mini-transaction-sized log blocks. Other recovery bugs are still possible; MDEV-31353 is a recent example.

            People

              marko Marko Mäkelä