Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-35618

Bogus assertion failure 'recv_sys.scanned_lsn < max_lsn + 32 * 512U' during recovery

Details

    Description

      mleich reproduced the following assertion failure while testing MDEV-35494:

      10.6-MDEV-35049 877d75a8f49ef4170fc701b091d4303ab788d440

      mariadbd: /data/Server/10.6-MDEV-35049/storage/innobase/buf/buf0dblwr.cc:366: void buf_dblwr_t::recover(): Assertion `recv_sys.scanned_lsn < max_lsn + 32 * 512U' failed.
      

      The assertion fails, because recv_sys.scanned_lsn is 0x4109 bytes ahead of recv_sys.recovered_lsn. The limit would be 0x4000 bytes, or the maximum value of innodb_log_write_ahead_size.

      At the time the server was killed before this recovery attempt, it was in the middle of a log write during the commit of a native ALTER TABLE operation:

      10.6-MDEV-35049 877d75a8f49ef4170fc701b091d4303ab788d440

      #0  __libc_pwrite64 (fd=0x7, buf=0x40f46152b000, count=0x64a00, offset=0x26df600) at ../sysdeps/unix/sysv/linux/pwrite64.c:24
      #1  0x000064f2fbbd72c0 in SyncFileIO::execute (this=this@entry=0x2e6104922bf0, request=@0x2e6104922bd0: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::WRITE_SYNC})
          at /data/Server/10.6-MDEV-35049/storage/innobase/os/os0file.cc:691
      #2  0x000064f2fbbd8387 in os_file_io (in_type=@0x64f2fc2e81a0: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::WRITE_SYNC}, file=file@entry=0x7, buf=buf@entry=0x40f46152b000, n=n@entry=0x64a00, 
          offset=offset@entry=0x26df600, err=err@entry=0x2e6104922e4c) at /data/Server/10.6-MDEV-35049/storage/innobase/os/os0file.cc:2787
      #3  0x000064f2fbbd8514 in os_file_pwrite (type=@0x64f2fc2e81a0: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::WRITE_SYNC}, file=file@entry=0x7, buf=buf@entry=0x40f46152b000 "\200\r6\371\002", 
          n=0x64a00, offset=0x26df600, err=err@entry=0x2e6104922e4c) at /data/Server/10.6-MDEV-35049/storage/innobase/os/os0file.cc:2862
      #4  0x000064f2fbbd876a in os_file_write_func (type=@0x64f2fc2e81a0: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::WRITE_SYNC}, name=<optimized out>, file=0x7, buf=0x40f46152b000, 
          offset=<optimized out>, n=<optimized out>) at /data/Server/10.6-MDEV-35049/storage/innobase/os/os0file.cc:2891
      #5  0x000064f2fbba4c42 in file_os_io::write (this=<optimized out>, path=<optimized out>, offset=<optimized out>, buf=<optimized out>) at /data/Server/10.6-MDEV-35049/storage/innobase/log/log0log.cc:289
      #6  0x000064f2fbba4eb4 in log_file_t::write (this=this@entry=0x64f2fd248360 <log_sys+544>, offset=offset@entry=0x26df600, buf={data_ = 0x40f46152b000 "\200\r6\371\002", size_ = 0x64a00})
          at /data/Server/10.6-MDEV-35049/storage/innobase/log/log0log.cc:352
      #7  0x000064f2fbba5f0b in log_t::file::write (this=this@entry=0x64f2fd248340 <log_sys+512>, offset=offset@entry=0x26df600, buf={data_ = 0x40f46152b000 "\200\r6\371\002", size_ = 0x64a00})
          at /data/Server/10.6-MDEV-35049/storage/innobase/log/log0log.cc:403
      #8  0x000064f2fbba62e0 in log_write_buf (buf=buf@entry=0x40f46152b000 "\200\r6\371\002", len=len@entry=0x64a00, pad_len=pad_len@entry=0x200, start_lsn=0x1a6df000, new_data_offset=new_data_offset@entry=0xd6)
          at /data/Server/10.6-MDEV-35049/storage/innobase/log/log0log.cc:521
      #9  0x000064f2fbba64bc in log_write (rotate_key=rotate_key@entry=0x0) at /data/Server/10.6-MDEV-35049/storage/innobase/log/log0log.cc:662
      #10 0x000064f2fbba6b42 in log_write_and_flush () at /data/Server/10.6-MDEV-35049/storage/innobase/log/log0log.cc:782
      ...
      

      Of the 0x64a00-byte write request, only 0x14a00 had reached the ib_logfile0. The log_sys.lsn at the time of the SIGKILL is 0x1a743662-0x1a6ef8f6 = 0x53d6c bytes ahead of the recv_sys.recovered_lsn.

      The failing debug assertion in the 10.6 branch is too strict. The tolerance needs to be RECV_PARSING_BUF_SIZE, that is, the maximum size of a mini-transaction, or 2 megabytes.

      Note: This bug only affects the 10.6 branch. In later maintained branches, MDEV-14425 removed the log block parser, and the two recv_sys.recovered_lsn and recv_sys.scanned_lsn were merged to one recv_sys.lsn.

      Attachments

        Issue Links

          Activity

            Would the following patch (adjusting this assertion expression) pass the test scenario?

            diff --git a/storage/innobase/buf/buf0dblwr.cc b/storage/innobase/buf/buf0dblwr.cc
            index 31036bf978d..35da1780102 100644
            --- a/storage/innobase/buf/buf0dblwr.cc
            +++ b/storage/innobase/buf/buf0dblwr.cc
            @@ -363,7 +363,7 @@ void buf_dblwr_t::recover()
               innodb_log_write_ahead_size. After MDEV-14425 eliminated
               OS_FILE_LOG_BLOCK_SIZE, these two LSN must be equal. */
               ut_ad(recv_sys.scanned_lsn >= max_lsn);
            -  ut_ad(recv_sys.scanned_lsn < max_lsn + 32 * OS_FILE_LOG_BLOCK_SIZE);
            +  ut_ad(recv_sys.scanned_lsn < max_lsn + RECV_PARSING_BUF_SIZE);
             
               uint32_t page_no_dblwr= 0;
               byte *read_buf= static_cast<byte*>(aligned_malloc(3 * srv_page_size,
            

            marko Marko Mäkelä added a comment - Would the following patch (adjusting this assertion expression) pass the test scenario? diff --git a/storage/innobase/buf/buf0dblwr.cc b/storage/innobase/buf/buf0dblwr.cc index 31036bf978d..35da1780102 100644 --- a/storage/innobase/buf/buf0dblwr.cc +++ b/storage/innobase/buf/buf0dblwr.cc @@ -363,7 +363,7 @@ void buf_dblwr_t::recover() innodb_log_write_ahead_size. After MDEV-14425 eliminated OS_FILE_LOG_BLOCK_SIZE, these two LSN must be equal. */ ut_ad(recv_sys.scanned_lsn >= max_lsn); - ut_ad(recv_sys.scanned_lsn < max_lsn + 32 * OS_FILE_LOG_BLOCK_SIZE); + ut_ad(recv_sys.scanned_lsn < max_lsn + RECV_PARSING_BUF_SIZE); uint32_t page_no_dblwr= 0; byte *read_buf= static_cast<byte*>(aligned_malloc(3 * srv_page_size,

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.