Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.4(EOL)
Description
The 'wakeup' and 'wait code' in parallel replication looks as shown below.
// Wakeup code in wait_for_commit::wakeup():
mysql_mutex_lock(&LOCK_wait_commit);
waitee= NULL;
this->wakeup_error= wakeup_error;
// Wait code in wait_for_prior_commit():
if (waitee)
return wait_for_prior_commit2(thd);
else
{
if (wakeup_error)
my_error(ER_PRIOR_COMMIT_FAILED, MYF(0));
So the waiter code runs a "fast path" without locks. It is ok if we race on
the assignment of NULL to wait_for_commit::waitee variable, because then the
waiter will take the slow path and do proper locking.
But it looks like there is a race as follows:
1. wakeup() sets waitee= NULL
2. wait_for_prior_commit() sees waitee==NULL and wakeup_error==0, and
incorrectly returns without error.
3. wakeup() too late sets wait_for_commit::wakeup_error.
It is not enough of course to swap the assignments in wakeup(). A
write-write memory barrier is needed between them in wakeup(), and a
corresponding read-read barrier is needed in wait_for_prior_commit().
With proper barriers, the waiter cannot see the write of waitee=NULL without
also seeing the write to wakeup_error. So it will either return with
non-zero wakeup_error or take the slow path with proper locking. Both of
which are fine.