Details
-
Task
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
Description
This is an idea to improve the parallelism and thus speed of (aggressive)
parallel replication. A patch implementing the idea is attached.
In the current code, when a transaction conflicts and gets deadlock killed,
it will wait for all prior transactions to complete before retrying. The
idea is to avoid such transactions repeatedly causing conflicts and
incurring multiple rollbacks; in current code, at most one rollback and
retry per transaction will happen in normal circumstances.
But this can be overly pessimistic. For example, suppose in a stream of
transactions T1, T2, ... we have conflicts from T4, T9, T16, and T19. Then
the transactions will (most likely) end up committing in these groups and
throughput will be limited by the serial execution of the 4 conflicting
transactions:
- T1 T2 T3
- T4 T5 T6 T7 T8
- T9 T10 T11 T12 T13 T14 T15
- T16 T17 T18
- T19
Suppose T4 conflicts with T1. If we retry T4 as soon as T1 commits, it can
potentially run concurrently with T2 and T3, allowing more parallelism. The
cost is that T4 may then later conflict with T2 or T3, requiring another
rollback and retry.
As another example, suppose T9 and T16 conflict with T3 and T2. Then all
of T4, T9, and T16 can potentially retry in parallel, which is not possible in
current code.
[An even more aggressive approach would be to not even wait for T1 to commit
first before retrying T4. We know that T4 will not be able to complete until
after T1 (as they have a lock conflict), but we might be able to run the
initial part of T4 before it reaches the conflicting lock and has to wait
for T1. This however requires some mechanism to guarantee that T1 will get
the conflicting lock before T4 can race ahead and get the lock (re-)granted;
otherwise the two threads could live-lock with each other. This more
aggressive approach is not considered in this patch.]
This approach becomes more interesting as the number of parallel threads
grows and with increasing amounts of conflicts. If transaction T200
conflicts with T100, waiting for T199 to commit before retrying T200 can
take significantly longer than waiting for T100. On modern servers with
large number of cores, this can potentially greatly speed up parallel
replication when the wait dependencies between transactions become the
bottleneck, rather than the amount of CPU time available for event
execution.
I have attached a RFC patch that implements this approach. I tested it with
an artificial (mtr-based) test with large (>20%) amount of conflicts, and
saw a modest but significant speedup.
But to implement this idea properly, it should really be systematically
benchmarked, using several different workloads that are intelligently varied
to examine the interesting phase space in terms of parallelism, hardware
capabilities, and conflict percentage.
Attachments
Issue Links
- is part of
-
MDEV-37582 Reduce Slave Lag
-
- Open
-