MariaDB Server / MDEV-31442

Assertion ‘n & PENDING’ failed in fil_space_t::set_needs_flush()

Details

    Description

      This debug assertion failure was observed while testing MDEV-14795, but it could affect older releases as well.

      The scenario is as follows:

      1. A write of a page of a tablespace has been initiated.
      2. Something suspends further I/O on the tablespace by calling fil_space_t::set_stopping(). This could happen as part of executing DROP TABLE or any statement that rebuilds a table (TRUNCATE TABLE, OPTIMIZE TABLE, ALTER TABLE), or possibly as part of undo tablespace truncation (innodb_undo_log_truncate=ON).
      3. The buf_flush_page_cleaner thread encounters some more dirty pages for the table in buf_pool.flush_list, but notices that all further writes to the tablespace need to be discarded. It will release a tablespace reference that it did not hold in the first place.
      4. write_io_callback calls IORequest::write_complete() on the previously submitted write, which will hit the debug assertion failure, catching the corruption of the reference count and flags.
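
      The four steps above can be sketched with a toy model. This is not the InnoDB code: ToySpace, its members and the two scenario functions are hypothetical, and the bit layout (three flag bits above a 29-bit reference count, with the third flag named NEEDS_FSYNC) is an assumption inferred from the numbers quoted in this report. Instead of asserting, release() returns whether a reference was actually held, so that the failure can be observed.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumed layout: 3 flag bits on top, 29-bit reference count below.
constexpr uint32_t STOPPING = 1U << 31;
constexpr uint32_t CLOSING = 1U << 30;
constexpr uint32_t NEEDS_FSYNC = 1U << 29;  // assumed name of the third flag
constexpr uint32_t PENDING = ~(STOPPING | CLOSING | NEEDS_FSYNC);

// Hypothetical, simplified stand-in for a tablespace object.
struct ToySpace {
  std::atomic<uint32_t> n_pending{0};

  // Take a reference unless a flag in `avoid` is set.
  // Returns the value observed before the attempt.
  uint32_t acquire_low(uint32_t avoid) {
    uint32_t n = n_pending.load(std::memory_order_relaxed);
    while (!(n & avoid) &&
           !n_pending.compare_exchange_weak(n, n + 1,
                                            std::memory_order_acquire,
                                            std::memory_order_relaxed)) {
    }
    return n;
  }

  void set_stopping() { n_pending.fetch_or(STOPPING); }

  // Drop a reference. Returns false if the count was already 0,
  // which is the condition a debug assertion on `n & PENDING` would catch.
  bool release() {
    uint32_t n = n_pending.fetch_sub(1, std::memory_order_release);
    return (n & PENDING) != 0;
  }
};

// Unguarded pattern: release() runs even though no reference was taken.
bool unbalanced_release_scenario() {
  ToySpace s;
  s.acquire_low(STOPPING);  // step 1: the submitted write holds a reference
  s.set_stopping();         // step 2: further I/O is suspended
  s.acquire_low(STOPPING);  // step 3: refused, the count stays at 1
  s.release();              // step 3: unbalanced release, the count drops to 0
  return s.release();       // step 4: write completion finds no reference
}

// Guarded pattern: release() only if the acquisition succeeded.
bool guarded_release_scenario() {
  ToySpace s;
  s.acquire_low(STOPPING);
  s.set_stopping();
  if (!(s.acquire_low(STOPPING | CLOSING) & (STOPPING | CLOSING)))
    s.release();
  return s.release();       // the write's own reference is still there
}
```

      In this model, unbalanced_release_scenario() returns false (the second release finds a zero count), while guarded_release_scenario() returns true.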

      The following patch fixes this so that the tablespace reference will be incremented even though the flags are set.

      diff --git a/storage/innobase/include/fil0fil.h b/storage/innobase/include/fil0fil.h
      index 10365d167b7..10c084bbbf1 100644
      --- a/storage/innobase/include/fil0fil.h
      +++ b/storage/innobase/include/fil0fil.h
      @@ -1526,9 +1526,11 @@ template<bool have_reference> inline void fil_space_t::flush()
           flush_low();
         else
         {
      -    if (!(acquire_low() & (STOPPING | CLOSING)))
      +    if (!(acquire_low(STOPPING | CLOSING) & (STOPPING | CLOSING)))
      +    {
             flush_low();
      -    release();
      +      release();
      +    }
         }
       }
       
      

      In non-debug builds, the impact should be that the reference count wraps around (from 0 to 536,870,911, that is, 2^29 − 1) and the borrow "decrements" the 3 flag bits that share the same word.
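
      The arithmetic can be checked with a small sketch. It assumes a 32-bit word in which the three flags occupy the top bits and the reference count the low 29 bits; this layout and the name NEEDS_FSYNC are assumptions that are consistent with the value 536,870,911 quoted above, and flags_of(), count_of() and check_wraparound() are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: 3 flag bits on top, 29-bit reference count below.
constexpr uint32_t STOPPING = 1U << 31;
constexpr uint32_t CLOSING = 1U << 30;
constexpr uint32_t NEEDS_FSYNC = 1U << 29;  // assumed name of the third flag
constexpr uint32_t PENDING = ~(STOPPING | CLOSING | NEEDS_FSYNC);

// Hypothetical helpers that split the packed word into its two fields.
constexpr uint32_t flags_of(uint32_t n) { return n >> 29; }
constexpr uint32_t count_of(uint32_t n) { return n & PENDING; }

// STOPPING is set and no references are held: flags 0b100, count 0.
constexpr uint32_t before_release = STOPPING;
// An unbalanced release subtracts 1 from the whole word.
constexpr uint32_t after_release = before_release - 1;

// The count wraps to 2^29 - 1 and borrows from the flags field.
static_assert(count_of(before_release) == 0, "count starts at 0");
static_assert(count_of(after_release) == 536870911, "count wraps around");
static_assert(flags_of(before_release) == 4, "only STOPPING is set");
static_assert(flags_of(after_release) == 3, "flags field is decremented");

// Runtime version of the same checks.
inline void check_wraparound() {
  assert((after_release & STOPPING) == 0);    // STOPPING was lost
  assert((after_release & CLOSING) != 0);     // CLOSING appeared
  assert((after_release & NEEDS_FSYNC) != 0); // NEEDS_FSYNC appeared
}
```

      Under these assumptions, a single unbalanced release on a stopping tablespace clears STOPPING and sets the other two flags, which would fit the expectation of a later hang.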

      I believe that this should lead to InnoDB hanging at some point, and possibly a missed fdatasync() or fsync() call (which should not matter much, because the file is typically going to be deleted anyway).

        Activity

          marko Marko Mäkelä added a comment - Thank you. I suggested this patch. Once it has been tested, please push it to 10.5.

          vlad.lesin Vladislav Lesin added a comment - The fix 199f0d6ccc1900af3efc32b5892261872fc1fea0 looks good to me.

          People

            thiru Thirunarayanan Balathandayuthapani