[MDEV-7882] Excessive transaction retry in parallel replication

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 10.0.17, 10.1.3
Fix Version/s: 10.0.18, 10.1.4
Component/s: Replication
Labels:
- parallelslave
- replication

Description

This problem was discovered as part of MDEV-7847. However, these are two
logically distinct problems (slave threads hanging vs. excessive transaction
retry), so I am filing a separate bug to keep them apart.

If conflicting transactions T1 and T2 are run in parallel, then we may need to
deadlock kill T2 if it is holding a row lock that T1 needs. However, there is
no guarantee that T1 will get the lock when T2 is rolled back. If we are
unlucky, T2 may have time to re-take the lock, requiring another deadlock
kill.

In fact, in the scenario that uncovered MDEV-7847, as well as in testing
while working on that bug, we easily saw T2 end up retrying 10 times in
cases where many conflicting transactions were executed in parallel. This
typically results in replication stopping with an error (10 is the default
maximum number of retries allowed, set by slave_transaction_retries).

In 10.1 "optimistic" mode, this problem is actually taken care of. After the
first deadlock kill of T2, it will execute wait_for_prior_commit() before
making a retry. This ensures that any earlier transactions that might conflict
will be allowed to get the locks and complete before the retry of T2, thus
avoiding the need for multiple retries.

So in "conservative" mode (and in 10.0), we should just do the same wait
before the retry of T2. In conservative mode, conflicts are very rare, so
there is no performance reason not to do it, and it avoids this potential
problem with excessive retries.
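
The difference between retrying immediately and waiting for prior commits first can be sketched with a small toy model. This is not the server's code: `replay_t2`, its parameters, and the list of race outcomes are hypothetical names invented for illustration, and the only things taken from the report are the retry limit of 10 and the idea of waiting for earlier transactions before the retry (what `wait_for_prior_commit()` does in the server).

```python
MAX_RETRIES = 10  # default retry limit mentioned in the report


def replay_t2(t2_wins_race, wait_before_retry, max_retries=MAX_RETRIES):
    """Toy model of T2 being deadlock killed in favour of T1.

    t2_wins_race: iterable of booleans, one per retry; True means T2
        re-takes the row lock before T1 gets it (the "unlucky" case).
        If the iterable runs out, T1 is assumed to win the race.
    wait_before_retry: if True, T2 waits for T1 to commit before retrying.

    Returns (retries_used, succeeded).
    """
    races = iter(t2_wins_race)
    retries = 0
    while True:
        # T2 holds a row lock that T1 needs, so T2 is killed and rolled back.
        if retries == max_retries:
            return retries, False  # retry budget exhausted, replication stops
        retries += 1
        if wait_before_retry:
            # T1 takes the lock and commits before T2 runs again,
            # so the retry cannot conflict with T1 any more.
            return retries, True
        if not next(races, False):
            # T1 won the race for the lock, so T2's retry completes after T1.
            return retries, True
        # Otherwise T2 re-took the lock first and must be killed again.


if __name__ == "__main__":
    always_unlucky = [True] * 20
    print(replay_t2(always_unlucky, wait_before_retry=False))
    print(replay_t2(always_unlucky, wait_before_retry=True))
```

In this model the worst-case schedule exhausts all 10 retries when T2 retries immediately, while waiting first bounds T2 at a single retry regardless of scheduling. The real server has more than two transactions and real lock timing, so the model only illustrates why the wait removes the race.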

Activity

Kristian Nielsen added a comment - 2015-03-30 15:25

http://lists.askmonty.org/pipermail/commits/2015-March/007682.html

Kristian Nielsen added a comment - 2015-03-30 17:50

Pushed to 10.0.18 and 10.1.4

People

Assignee: Kristian Nielsen

Reporter: Kristian Nielsen

Votes: 0

Watchers: 0

Dates

Created: 2015-03-30 13:25

Updated: 2015-03-30 17:50

Resolved: 2015-03-30 17:50

MariaDB Server