MariaDB Server / MDEV-28043

Race condition between mtr_t::commit() and checkpoint

      Description

      When MDEV-27774 replaced log_sys.mutex with log_sys.latch, it introduced a race condition in mtr_t::do_write():

          if (!ex)
          {
            log_sys.latch.rd_unlock();
            log_sys.latch.wr_lock(SRW_LOCK_CALL);
            if (UNIV_LIKELY(!m_user_space->max_lsn))
              name_write();
            std::pair<lsn_t,mtr_t::page_flush_ahead> p{finish_write(len, true)};
            log_sys.latch.wr_unlock();
            log_sys.latch.rd_lock(SRW_LOCK_CALL);
            return p;
          }
      

      It is not safe to release the exclusive log_sys.latch between finish_write() and ReleaseBlocks. Because we have no portable operation that would downgrade the latch from exclusive to shared, we must retain that exclusive latch until the end of the critical section in mtr_t::commit().

      I debugged an rr replay trace of this:

      ssh pluto
      rr replay /data/results/1647008467/TBR-1420/dev/shm/rqg/1647008467/53/1/rr/latest-trace
      

      continue
      watch -l log_sys.last_checkpoint_lsn.m._M_i
      watch -l buf_pool.flush_list.count
      reverse-continue
      reverse-continue
      reverse-continue
      thread apply 24 backtrace
      

      Working backwards from the end of the trace, we have Thread 3 hitting an assertion failure:

      mysqld: /data/Server/bb-10.9-MDEV-26603-async-redo-writeB/storage/innobase/buf/buf0flu.cc:1877: bool log_checkpoint_low(lsn_t, lsn_t): Assertion `oldest_lsn > log_sys.last_checkpoint_lsn' failed.
      

      Before that, Thread 24 inserted an unexpectedly old block into buf_pool.flush_list, and before that, Thread 3 advanced the checkpoint LSN to a value that was too new for that block.
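
The interleaving described above can be reproduced in a small single-threaded model. This is only a sketch under stated assumptions, not InnoDB code: ToyLog, try_checkpoint() and commit() are hypothetical names, the latch is reduced to a boolean flag, and the checkpoint rule (advance to the oldest dirty page, or to the end of the log if nothing is dirty) is a simplification. The retain_latch parameter selects between releasing the exclusive latch after the log write and holding it until the dirty page has been added to the flush list.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

using lsn_t = std::uint64_t;

// Hypothetical toy model; none of these names are InnoDB's.
struct ToyLog {
  bool x_latched = false;           // stands in for the exclusive log_sys.latch
  lsn_t lsn = 100;                  // current end of the log
  lsn_t last_checkpoint_lsn = 50;   // latest checkpoint
  std::multiset<lsn_t> flush_list;  // oldest-modification LSN of each dirty page
};

// In this model a checkpoint needs the exclusive latch. With no dirty
// pages it may advance all the way to the current end of the log.
// Returns false if the latch is busy (a real thread would wait instead).
bool try_checkpoint(ToyLog &log) {
  if (log.x_latched)
    return false;
  const lsn_t oldest =
      log.flush_list.empty() ? log.lsn : *log.flush_list.begin();
  if (oldest > log.last_checkpoint_lsn)
    log.last_checkpoint_lsn = oldest;
  return true;
}

// Commit a mini-transaction that writes len bytes of log and dirties one page.
// retain_latch == false models giving up the exclusive latch between the log
// write and the flush-list insert; true models holding it to the end.
// Returns true if the inserted page is not older than the checkpoint.
bool commit(ToyLog &log, lsn_t len, bool retain_latch) {
  log.x_latched = true;
  const lsn_t start_lsn = log.lsn;  // reserve an LSN range for the log write
  log.lsn += len;
  if (!retain_latch)
    log.x_latched = false;          // the unsafe window opens here
  try_checkpoint(log);              // "another thread" attempts a checkpoint
  log.flush_list.insert(start_lsn); // the dirty page becomes visible only now
  log.x_latched = false;
  return start_lsn >= log.last_checkpoint_lsn;
}
```

With a default ToyLog, commit(log, 10, false) lets the checkpoint advance to 110 before the page with LSN 100 is inserted, which corresponds to the state that later trips the assertion. commit(log, 10, true) makes the checkpoint attempt fail, so the checkpoint stays at 50 until the page is on the flush list.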

              People

              Assignee:
              marko Marko Mäkelä
              Reporter:
              marko Marko Mäkelä