[MDEV-33551] Semi-sync Wait Point AFTER_COMMIT Slow on Workloads with Heavy Concurrency - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 10.5, 10.6, 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL), 11.3(EOL)
Fix Version/s: 10.11.8, 10.6.18, 11.0.6, 11.1.5, 11.2.4
Component/s: Replication
Labels:
None

Description

Semi-sync Wait Point AFTER_COMMIT Slow on Workloads with Heavy Concurrency

When using semi-sync replication with rpl_semi_sync_master_wait_point=AFTER_COMMIT, the performance of the primary can significantly reduce to about 10% of AFTER_SYNC's performance for workloads with many concurrent users executing transactions. See the attached graph, where the "Async" bar reports the performance of a primary using asynchronous replication, the "Semi-sync AFTER_COMMIT" and "Semi-sync AFTER_SYNC" reports the performance of the primary using the respective rpl_semi_sync_master_wait_point variable, and "Semi-sync GROUP_ACK" is a prototype of the work outlined by MDEV-33491.

It can be seen that the performance of the AFTER_COMMIT mode of semi-sync is many times worse than that of the AFTER_SYNC mode. This is because all connections on the primary share the same cond_wait variable/mutex pair, so any time an ACK is recieved from a replica, all waiting connections are awoken to check if the ACK was for itself, which is done in a mutual exclusion zone. See the attached Semi-sync Group Ack Proposal PDF for more details.

Instead, each connection should use its own condition variable (e.g. THD::COND_wakeup_ready), and the ACK receiver thread should only signal connections which have been ACKed for wakeup.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

Semi-sync Group Ack Proposal.pdf
1.11 MB
2024-02-27 13:33
semisync_patched.png
13 kB
2024-02-27 13:23
semisync_mdev33551_patched.png
36 kB
2024-03-21 18:34
semisync_mdev_33551.png
13 kB
2024-02-28 18:11

Issue Links

causes

MDEV-34122 Assertion `entry' failed in Active_tranx::assert_thd_is_waiter

Closed

MDEV-36359 Crash after disabling semi sync + start a master switchover process

Stalled

duplicates

MDEV-34462 SemiSync replication underperforming and stalling throughput

Closed

relates to

MDEV-33491 Semi-sync Replication Group ACK

Stalled

Activity

Ascending order - Click to sort in descending order

Michael Widenius added a comment - 2024-02-27 15:19

The solution Kristian and I come up with is:

Let the group-leader thread handle also the AFTER_COMMIT, in which case there is only one thread waiting for the ACK.
Instead of requesting an ACK for each commit, only request an ACK for the whole group.
Why not going with the this solution? It will be much faster and use less bandwidth (as there is only one ACK per group).

Michael Widenius added a comment - 2024-02-27 15:19 The solution Kristian and I come up with is: Let the group-leader thread handle also the AFTER_COMMIT, in which case there is only one thread waiting for the ACK. Instead of requesting an ACK for each commit, only request an ACK for the whole group. Why not going with the this solution? It will be much faster and use less bandwidth (as there is only one ACK per group).

Brandon Nesterenko added a comment - 2024-02-28 18:11 - edited

Hi knielsen!

This is ready for review: PR 3089.

The following graph shows the speedup with this patch (the MDEV_33551_AFTER_COMMIT bar):

Brandon Nesterenko added a comment - 2024-02-28 18:11 - edited Hi knielsen ! This is ready for review: PR 3089 . The following graph shows the speedup with this patch (the MDEV_33551_AFTER_COMMIT bar):

Brandon Nesterenko added a comment - 2024-02-28 18:16

monty the semi-sync group ACK work is tracked by MDEV-33491 (with its description updated per our last discussion)

Brandon Nesterenko added a comment - 2024-02-28 18:16 monty the semi-sync group ACK work is tracked by MDEV-33491 (with its description updated per our last discussion)

Brandon Nesterenko added a comment - 2024-03-08 21:40

knielsen, I've addressed your review comments from the mailing list, and force-pushed to PR 3089 with the latest updates. Can you give it another review?

Brandon Nesterenko added a comment - 2024-03-08 21:40 knielsen , I've addressed your review comments from the mailing list, and force-pushed to PR 3089 with the latest updates. Can you give it another review?

Brandon Nesterenko added a comment - 2024-03-21 18:33 - edited

Pushed into 10.6 as 75c7c6dc3.

Merge conflict observed in 10.11 with a fix in branch bb-10.11-MDEV-33551-mergefix.

Looks like another merge conflict may exist in sql/sql_class.h in 11.X with chunk:

   bool check_slave_ignored_db_with_error(const Lex_ident_db &db) const;

/*

    Indicates if this thread is suspended due to awaiting an ACK from a

    replica. True if suspended, false otherwise.

    Note that this variable is protected by Repl_semi_sync_master::LOCK_binlog

*/

  bool is_awaiting_semisync_ack;

This patch removed the is_awaiting_semisync_ack variable, so to resolve, just remove that variable.

Otherwise no merge issues should be present.

Final performance figures as follows:

In summary, this patch changes the way that semi-sync connections wait for ACKs as follows. Previously, when an ACK would arrive from a replica, all user connection threads currently awaiting an ACK for its transaction would be awoken, but each thread could only check if the ACK was for it one at a time. With the patch, instead of waking up all threads awaiting ACKs, only those which are relevant to the received ACK are awoken.

As an example, say there are 500 transactions on the primary server awaiting an ACK. Previously, when an ACK would come in for the next transaction in the binlog, all 500 threads would be signalled to wake up and check if that ACK was for it, yet each thread could only check one at a time (not in parallel), and only one of the 500 could move on (the other 499 would go back to waiting for their respective ACK). However, with the patch, now only the thread associated with that ACK (or potentially any earlier binlogged transactions) are awoken.

Brandon Nesterenko added a comment - 2024-03-21 18:33 - edited Pushed into 10.6 as 75c7c6dc3 . Merge conflict observed in 10.11 with a fix in branch bb-10.11-MDEV-33551-mergefix . Looks like another merge conflict may exist in sql/sql_class.h in 11.X with chunk: bool check_slave_ignored_db_with_error(const Lex_ident_db &db) const; /* Indicates if this thread is suspended due to awaiting an ACK from a replica. True if suspended, false otherwise. Note that this variable is protected by Repl_semi_sync_master::LOCK_binlog */ bool is_awaiting_semisync_ack; This patch removed the is_awaiting_semisync_ack variable, so to resolve, just remove that variable. Otherwise no merge issues should be present. Final performance figures as follows: In summary, this patch changes the way that semi-sync connections wait for ACKs as follows. Previously, when an ACK would arrive from a replica, all user connection threads currently awaiting an ACK for its transaction would be awoken, but each thread could only check if the ACK was for it one at a time. With the patch, instead of waking up all threads awaiting ACKs, only those which are relevant to the received ACK are awoken. As an example, say there are 500 transactions on the primary server awaiting an ACK. Previously, when an ACK would come in for the next transaction in the binlog, all 500 threads would be signalled to wake up and check if that ACK was for it, yet each thread could only check one at a time (not in parallel), and only one of the 500 could move on (the other 499 would go back to waiting for their respective ACK). However, with the patch, now only the thread associated with that ACK (or potentially any earlier binlogged transactions) are awoken.

MariaDB Server

Semi-sync Wait Point AFTER_COMMIT Slow on Workloads with Heavy Concurrency

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration