Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-35618

Bogus assertion failure 'recv_sys.scanned_lsn < max_lsn + 32 * 512U' during recovery

    XMLWordPrintable

Details

    Description

      mleich reproduced the following assertion failure while testing MDEV-35494:

      10.6-MDEV-35049 877d75a8f49ef4170fc701b091d4303ab788d440

      mariadbd: /data/Server/10.6-MDEV-35049/storage/innobase/buf/buf0dblwr.cc:366: void buf_dblwr_t::recover(): Assertion `recv_sys.scanned_lsn < max_lsn + 32 * 512U' failed.
      

      The assertion fails, because recv_sys.scanned_lsn is 0x4109 bytes ahead of recv_sys.recovered_lsn. The limit would be 0x4000 bytes, or the maximum value of innodb_log_write_ahead_size.

      At the time the server was killed before this recovery attempt, it was in the middle of a log write during the commit of a native ALTER TABLE operation:

      10.6-MDEV-35049 877d75a8f49ef4170fc701b091d4303ab788d440

      #0  __libc_pwrite64 (fd=0x7, buf=0x40f46152b000, count=0x64a00, offset=0x26df600) at ../sysdeps/unix/sysv/linux/pwrite64.c:24
      #1  0x000064f2fbbd72c0 in SyncFileIO::execute (this=this@entry=0x2e6104922bf0, request=@0x2e6104922bd0: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::WRITE_SYNC})
          at /data/Server/10.6-MDEV-35049/storage/innobase/os/os0file.cc:691
      #2  0x000064f2fbbd8387 in os_file_io (in_type=@0x64f2fc2e81a0: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::WRITE_SYNC}, file=file@entry=0x7, buf=buf@entry=0x40f46152b000, n=n@entry=0x64a00, 
          offset=offset@entry=0x26df600, err=err@entry=0x2e6104922e4c) at /data/Server/10.6-MDEV-35049/storage/innobase/os/os0file.cc:2787
      #3  0x000064f2fbbd8514 in os_file_pwrite (type=@0x64f2fc2e81a0: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::WRITE_SYNC}, file=file@entry=0x7, buf=buf@entry=0x40f46152b000 "\200\r6\371\002", 
          n=0x64a00, offset=0x26df600, err=err@entry=0x2e6104922e4c) at /data/Server/10.6-MDEV-35049/storage/innobase/os/os0file.cc:2862
      #4  0x000064f2fbbd876a in os_file_write_func (type=@0x64f2fc2e81a0: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::WRITE_SYNC}, name=<optimized out>, file=0x7, buf=0x40f46152b000, 
          offset=<optimized out>, n=<optimized out>) at /data/Server/10.6-MDEV-35049/storage/innobase/os/os0file.cc:2891
      #5  0x000064f2fbba4c42 in file_os_io::write (this=<optimized out>, path=<optimized out>, offset=<optimized out>, buf=<optimized out>) at /data/Server/10.6-MDEV-35049/storage/innobase/log/log0log.cc:289
      #6  0x000064f2fbba4eb4 in log_file_t::write (this=this@entry=0x64f2fd248360 <log_sys+544>, offset=offset@entry=0x26df600, buf={data_ = 0x40f46152b000 "\200\r6\371\002", size_ = 0x64a00})
          at /data/Server/10.6-MDEV-35049/storage/innobase/log/log0log.cc:352
      #7  0x000064f2fbba5f0b in log_t::file::write (this=this@entry=0x64f2fd248340 <log_sys+512>, offset=offset@entry=0x26df600, buf={data_ = 0x40f46152b000 "\200\r6\371\002", size_ = 0x64a00})
          at /data/Server/10.6-MDEV-35049/storage/innobase/log/log0log.cc:403
      #8  0x000064f2fbba62e0 in log_write_buf (buf=buf@entry=0x40f46152b000 "\200\r6\371\002", len=len@entry=0x64a00, pad_len=pad_len@entry=0x200, start_lsn=0x1a6df000, new_data_offset=new_data_offset@entry=0xd6)
          at /data/Server/10.6-MDEV-35049/storage/innobase/log/log0log.cc:521
      #9  0x000064f2fbba64bc in log_write (rotate_key=rotate_key@entry=0x0) at /data/Server/10.6-MDEV-35049/storage/innobase/log/log0log.cc:662
      #10 0x000064f2fbba6b42 in log_write_and_flush () at /data/Server/10.6-MDEV-35049/storage/innobase/log/log0log.cc:782
      ...
      

      Of the 0x64a00-byte write request, only 0x14a00 had reached the ib_logfile0. The log_sys.lsn at the time of the SIGKILL is 0x1a743662-0x1a6ef8f6 = 0x53d6c bytes ahead of the recv_sys.recovered_lsn.

      The failing debug assertion in the 10.6 branch is too strict. The tolerance needs to be RECV_PARSING_BUF_SIZE, that is, the maximum size of a mini-transaction, or 2 megabytes.

      Note: This bug only affects the 10.6 branch. In later maintained branches, MDEV-14425 removed the log block parser, and the two recv_sys.recovered_lsn and recv_sys.scanned_lsn were merged to one recv_sys.lsn.

      Attachments

        Issue Links

          Activity

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.