Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-31362

recv_sys_t::apply(bool): Assertion `!last_batch || recovered_lsn == scanned_lsn' failed

Details

    Description

      mleich provided a copy of a data directory as well as rr replay traces that leads to a recovery failure with an assertion failure.

      Unfortunately, the data directory enables encryption and includes a 96 MiB ib_logfile0 that wrapped around once. Because encrypted data does not compress well, a compressed copy of the data directory would be too large to attach here.

      The rr replay trace from before the crash is of limited use, because rr replay would report a replay divergence near the end of the trace. Right before the 10.6 version of MDEV-29911, recovery will appear to succeed:

      10.6 1fe830b56a2bd9b12b643d7b39417255215ae5da

      2023-05-29 16:45:43 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=142804091,143054310
      2023-05-29 16:45:43 0 [Note] InnoDB: Starting final batch to recover 174 pages from redo log.
      2023-05-29 16:45:43 0 [Note] InnoDB: Last binlog file './mysql-bin.000001', position 2691230
      2023-05-29 16:45:43 0 [Note] InnoDB: Removing encryption and resizing redo log from 100663296 to 4294967296 bytes; LSN=151202977
      

      If I attempt recovery with the fix of MDEV-29911, I will hit a debug assertion instead:

      10.6 f2c17cc9d9bcd634887846d3064bcb71243f9cc0

      2023-05-29 16:47:48 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=142804091,143054310
      mariadbd: /mariadb/10.6/storage/innobase/log/log0recv.cc:3629: void recv_sys_t::apply(bool): Assertion `!last_batch || recovered_lsn == scanned_lsn' failed.
      

      During the time of the assertion failure, we have recv_sys.recovered_lsn==151202728 and recv_sys.scanned_lsn==151209984. The largest observed *contiguous_lsn in recv_scan_log_recs() is 150516736.

      In the non-crashing run, recovery only proceeded up to 151202977. This would seem to suggest that the "successful" run without MDEV-29911 may be incorrect.

      MariaDB Server 10.8 and later versions could be unaffected by this exact bug, because the separate log block and log record parsers were unified when MDEV-14425 replaced the 512-byte log blocks with mini-transaction-sized log blocks. Other recovery bugs are possible; in MDEV-31353 there is a recent example.

      Attachments

        Issue Links

          Activity

            I conducted some analysis of this. Here are the the first 16 bytes of each 512-byte blocks of the redo log, up to the logical end:

            od -Ax -j 0x3028000 -N 0x1200 -t x1 /dev/shm/data/ib_logfile0 |grep '[02468ace]00 '
            

            3028000 00 04 81 9d da dd c7 3b 5e ae 5f ba 89 0e 25 bc
            3028200 00 04 81 9e 8b 28 29 5c 9a 51 0a a3 89 0f 7c bb
            3028400 00 04 81 9f 56 28 fa bd d7 11 eb 41 a7 ae 03 b0
            3028600 00 04 81 a0 bd 5c b8 64 7e e3 9b e5 0c 0d b1 50
            3028800 00 04 81 a1 a6 1b 13 bf 3a 1c 4b 2a a1 94 b8 04
            3028a00 00 04 81 a2 97 88 bd 93 08 29 9b 4d 8e e8 e5 93
            3028c00 00 04 81 a3 bb 74 21 d6 a8 b1 82 cd 81 e9 8d 1a
            3028e00 00 04 81 a4 82 db 86 a5 34 28 ef 74 d2 15 99 cb
            3029000 00 01 81 a9 9e f4 ed 68 cb 62 bc 42 3a f9 5d eb
            

            At the file offset 0x3029000 we can find an earlier log block from the first "round" when the 96MiB ib_logfile0 had not yet wrapped around.

            The last log_crypt() call before the assertion failure is on the block that corresponds to the file offset 0x3028e00:

            10.6 ce547cfc0564bf54e73bcc5171a2212d6650eb00

            (rr) finish
            Run till exit from #0  log_crypt (buf=buf@entry=0x7feb15468200 "", 
                lsn=151209472, size=size@entry=512, op=op@entry=LOG_DECRYPT)
                at /mariadb/10.6/storage/innobase/log/log0crypt.cc:134
            log_t::file::read_log_seg (this=0x557b4b594440 <log_sys+512>, 
                start_lsn=0x7ffe5fbaa690, end_lsn=151258112)
                at /mariadb/10.6/storage/innobase/log/log0recv.cc:1615
            1615			if (is_encrypted()
            Value returned is $4 = true
            (rr) p/x *(char*)0x7feb15468200@16
            $5 = {0x0, 0x4, 0x81, 0xa4, 0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7, 0x20, 0x20, 0x20, 0x20}
            

            The two bytes right after the log block number are the payload length: 0x200=512 bytes. At the time of the assertion failure, recv_sys.scanned_lsn is 151209984, which is exactly 151209472+512. The recv_sys.recovered_lsn is less than that, only 151202728, because the last parsed mini-transaction was incomplete.

            In the log resizing message that I posted yesterday, we can see LSN=151202977. The difference of 249 bytes could be explained by some FILE_MODIFY records and a FILE_CHECKPOINT record that must have been written to the old log as part of a log checkpoint before the log could be safely resized.

            In other words, the debug assertion is bogus. On a closer read of the 10.6 version of MDEV-29911, I see that this assertion was added as part of the commit, with no corresponding earlier assertion. In the 10.9 version of MDEV-29911, the logic is simpler, because there only is a log record or mini-transaction parser and no separate log block.

            I think that the simplest fix is to remove the debug assertion from 10.6.

            marko Marko Mäkelä added a comment - I conducted some analysis of this. Here are the the first 16 bytes of each 512-byte blocks of the redo log, up to the logical end: od -Ax -j 0x3028000 -N 0x1200 -t x1 /dev/shm/data/ib_logfile0 |grep '[02468ace]00 ' 3028000 00 04 81 9d da dd c7 3b 5e ae 5f ba 89 0e 25 bc 3028200 00 04 81 9e 8b 28 29 5c 9a 51 0a a3 89 0f 7c bb 3028400 00 04 81 9f 56 28 fa bd d7 11 eb 41 a7 ae 03 b0 3028600 00 04 81 a0 bd 5c b8 64 7e e3 9b e5 0c 0d b1 50 3028800 00 04 81 a1 a6 1b 13 bf 3a 1c 4b 2a a1 94 b8 04 3028a00 00 04 81 a2 97 88 bd 93 08 29 9b 4d 8e e8 e5 93 3028c00 00 04 81 a3 bb 74 21 d6 a8 b1 82 cd 81 e9 8d 1a 3028e00 00 04 81 a4 82 db 86 a5 34 28 ef 74 d2 15 99 cb 3029000 00 01 81 a9 9e f4 ed 68 cb 62 bc 42 3a f9 5d eb At the file offset 0x3029000 we can find an earlier log block from the first "round" when the 96MiB ib_logfile0 had not yet wrapped around. The last log_crypt() call before the assertion failure is on the block that corresponds to the file offset 0x3028e00: 10.6 ce547cfc0564bf54e73bcc5171a2212d6650eb00 (rr) finish Run till exit from #0 log_crypt (buf=buf@entry=0x7feb15468200 "", lsn=151209472, size=size@entry=512, op=op@entry=LOG_DECRYPT) at /mariadb/10.6/storage/innobase/log/log0crypt.cc:134 log_t::file::read_log_seg (this=0x557b4b594440 <log_sys+512>, start_lsn=0x7ffe5fbaa690, end_lsn=151258112) at /mariadb/10.6/storage/innobase/log/log0recv.cc:1615 1615 if (is_encrypted() Value returned is $4 = true (rr) p/x *(char*)0x7feb15468200@16 $5 = {0x0, 0x4, 0x81, 0xa4, 0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7, 0x20, 0x20, 0x20, 0x20} The two bytes right after the log block number are the payload length: 0x200=512 bytes. At the time of the assertion failure, recv_sys.scanned_lsn is 151209984, which is exactly 151209472+512. The recv_sys.recovered_lsn is less than that, only 151202728, because the last parsed mini-transaction was incomplete. In the log resizing message that I posted yesterday, we can see LSN=151202977. The difference of 249 bytes could be explained by some FILE_MODIFY records and a FILE_CHECKPOINT record that must have been written to the old log as part of a log checkpoint before the log could be safely resized. In other words, the debug assertion is bogus. On a closer read of the 10.6 version of MDEV-29911 , I see that this assertion was added as part of the commit, with no corresponding earlier assertion. In the 10.9 version of MDEV-29911 , the logic is simpler, because there only is a log record or mini-transaction parser and no separate log block. I think that the simplest fix is to remove the debug assertion from 10.6.

            Relevant for triggering this bogus debug assertion should be that the server was killed while it was writing multiple redo log blocks. We can observe that the last written block ends at a 4096-byte boundary of the file. I assume that most of the time when the server is killed, the last written redo log block would also be recovered as the last block.

            marko Marko Mäkelä added a comment - Relevant for triggering this bogus debug assertion should be that the server was killed while it was writing multiple redo log blocks. We can observe that the last written block ends at a 4096-byte boundary of the file. I assume that most of the time when the server is killed, the last written redo log block would also be recovered as the last block.

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.