Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-20707

Missing memory barrier in parallel replication error handler in wait_for_prior_commit()

Details

    Description

      The 'wakeup' and 'wait code' in parallel replication looks as shown below.

      // Wakeup code in wait_for_commit::wakeup():
      mysql_mutex_lock(&LOCK_wait_commit);
      waitee= NULL;
      this->wakeup_error= wakeup_error;

      // Wait code in wait_for_prior_commit():
      if (waitee)
      return wait_for_prior_commit2(thd);
      else
      {
      if (wakeup_error)
      my_error(ER_PRIOR_COMMIT_FAILED, MYF(0));

      So the waiter code runs a "fast path" without locks. It is ok if we race on
      the assignment of NULL to wait_for_commit::waitee variable, because then the
      waiter will take the slow path and do proper locking.

      But it looks like there is a race as follows:

      1. wakeup() sets waitee= NULL
      2. wait_for_prior_commit() sees waitee==NULL and wakeup_error==0, and
      incorrectly returns without error.
      3. wakeup() too late sets wait_for_commit::wakeup_error.

      It is not enough of course to swap the assignments in wakeup(). A
      write-write memory barrier is needed between them in wakeup(), and a
      corresponding read-read barrier is needed in wait_for_prior_commit().

      With proper barriers, the waiter cannot see the write of waitee=NULL without
      also seeing the write to wakeup_error. So it will either return with
      non-zero wakeup_error or take the slow path with proper locking. Both of
      which are fine.

      Attachments

        Activity

          sujatha.sivakumar Sujatha Sivakumar (Inactive) added a comment - Hello Andrei, Please review the contributed changes for MDEV-20707 . Patch: https://github.com/MariaDB/server/commit/4eb6ea77e2051bf50b68567f82862f5726fd4bd7 BuildBot Testing: http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.4-sujatha Thank you.
          Elkin Andrei Elkin added a comment -

          Thanks for working on it!

          Elkin Andrei Elkin added a comment - Thanks for working on it!

          Fix is implemented in 10.4.11.

          sujatha.sivakumar Sujatha Sivakumar (Inactive) added a comment - Fix is implemented in 10.4.11.

          People

            sujatha.sivakumar Sujatha Sivakumar (Inactive)
            sujatha.sivakumar Sujatha Sivakumar (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.