MariaDB Server / MDEV-37553

Assertion failure lsn - get_flushed_lsn(std::memory_order_relaxed) < capacity()


Details

    • Not for Release Notes
    • An overly aggressive debug assertion was relaxed.

    Description

      After merging MDEV-36024 from 10.11 to 11.4, the test encryption.innochecksum,8k started failing in at least two environments as follows:

      2025-09-03  7:53:43 4 [ERROR] InnoDB: Crash recovery is broken due to insufficient innodb_log_file_size; last checkpoint LSN=64055, current LSN=9490181.
      mariadbd: /mariadb/11.4/storage/innobase/mtr/mtr0mtr.cc:914: void log_t::append_prepare_wait(bool, bool): Assertion `lsn - get_flushed_lsn(std::memory_order_relaxed) < capacity()' failed.
      

      The test provokes the error by means of debug instrumentation. The assertion, which was introduced in MDEV-21923, fails to take the error condition (log_sys.overwrite_warned > 0) into account; in other words, it is overly strict. With the following patch, the test completes fine:

      diff --git a/storage/innobase/mtr/mtr0mtr.cc b/storage/innobase/mtr/mtr0mtr.cc
      index a5b70a4684b..7c195a39ab6 100644
      --- a/storage/innobase/mtr/mtr0mtr.cc
      +++ b/storage/innobase/mtr/mtr0mtr.cc
      @@ -911,7 +911,8 @@ ATTRIBUTE_COLD void log_t::append_prepare_wait(bool late, bool ex) noexcept
           const bool is_pmem{is_mmap()};
           if (is_pmem)
           {
      -      ut_ad(lsn - get_flushed_lsn(std::memory_order_relaxed) < capacity());
      +      ut_ad(lsn - get_flushed_lsn(std::memory_order_relaxed) < capacity() ||
      +            overwrite_warned);
             persist(lsn);
           }
       #endif
      

      I suspect that the cleanup of mtr_buf_t in MDEV-36024 changed the timing of the context switches in such a way that this condition became more reachable.
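
      The effect of the patch above can be illustrated with a small stand-alone model. This is only a sketch: toy_log and its members are hypothetical stand-ins, not the real log_t, and overwrite_warned is modelled as a value that is nonzero in the error state, following the log_sys.overwrite_warned > 0 wording above. The numbers in the usage note are illustrative inputs, not the actual flushed LSN or capacity of the failing run.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the fields of log_t that the assertion reads.
// This is NOT the real InnoDB type; it only models the arithmetic.
struct toy_log
{
  uint64_t flushed_lsn;      // last LSN assumed to have been persisted
  uint64_t capacity;         // assumed usable size of the circular log
  uint64_t overwrite_warned; // nonzero once the overwrite error was reported

  // Condition as asserted before the patch.
  bool strict_ok(uint64_t lsn) const noexcept
  { return lsn - flushed_lsn < capacity; }

  // Condition as asserted after the patch: the error state is tolerated.
  bool relaxed_ok(uint64_t lsn) const noexcept
  { return strict_ok(lsn) || overwrite_warned != 0; }

  // Debug-only check, analogous in spirit to ut_ad().
  void debug_check(uint64_t lsn) const noexcept
  { assert(relaxed_ok(lsn)); }
};
```

      With an assumed capacity of 8388608 and an assumed flushed LSN of 64055, strict_ok(9490181) is false; once overwrite_warned is nonzero, relaxed_ok(9490181) is true and debug_check() no longer aborts.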

      I reviewed the code and concluded that there should be no risk of writing outside the bounds of the memory-mapped buffer log_sys.buf. The debug assertion is related to log_overwrite_warning() and is specific to builds configured with cmake -DWITH_INNODB_PMEM=ON.

      Between the calls to log_overwrite_warning() and log_t::write_checkpoint(), the contents of the memory-mapped log_sys.buf (and of ib_logfile0) are basically unrecoverable garbage, so it does not matter which write was last persisted. The assertion normally tries to ensure that pmem_persist() is called frequently enough, guarding against the unlikely case that the entire log file wraps around without any call to pmem_persist(). The debug instrumentation in the test prevents log checkpoints, and there are no transaction commits either that would trigger a call to log_t::persist().
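
      The wrap-around scenario that the assertion guards against can be sketched as a toy simulation. Everything here is an assumption made for illustration: toy_pmem_log, simulate() and the parameter names are hypothetical, and the model ignores checkpoints, latching and the real record format. It only shows that appending without any intervening persist eventually lets the distance between the current LSN and the persisted LSN reach the log capacity.

```cpp
#include <cstdint>

// Hypothetical model of a circular, memory-mapped log.
struct toy_pmem_log
{
  uint64_t capacity;    // assumed size of the circular log in bytes
  uint64_t lsn;         // current end of the log
  uint64_t flushed_lsn; // end of the log as of the last persist

  void append(uint64_t bytes) noexcept { lsn += bytes; }
  void persist() noexcept { flushed_lsn = lsn; }

  // True when a full log's worth of data is unpersisted, that is,
  // when the strict form of the assertion would fail.
  bool wrapped_unpersisted() const noexcept
  { return lsn - flushed_lsn >= capacity; }
};

// Append n records of the given size, persisting after every
// persist_every-th record (0 means never persist).
// Returns true if the log ever wrapped around without a persist.
inline bool simulate(uint64_t capacity, unsigned n, uint64_t bytes,
                     unsigned persist_every) noexcept
{
  toy_pmem_log log{capacity, 0, 0};
  for (unsigned i = 1; i <= n; i++)
  {
    log.append(bytes);
    if (log.wrapped_unpersisted())
      return true;
    if (persist_every && i % persist_every == 0)
      log.persist();
  }
  return false;
}
```

      In this model, simulate(1000, 20, 100, 0) reports a wrap-around because nothing ever persists, which mirrors the test where neither checkpoints nor commits occur, while simulate(1000, 20, 100, 5) does not.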

            People

              marko Marko Mäkelä