[MDEV-33551] Semi-sync Wait Point AFTER_COMMIT Slow on Workloads with Heavy Concurrency - Jira

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 10.5, 10.6, 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL), 11.3(EOL)
Fix Version/s: 10.11.8, 10.6.18, 11.0.6, 11.1.5, 11.2.4
Component/s: Replication
Labels: None

Description

Semi-sync Wait Point AFTER_COMMIT Slow on Workloads with Heavy Concurrency

When using semi-sync replication with rpl_semi_sync_master_wait_point=AFTER_COMMIT, the performance of the primary can drop to about 10% of its performance with AFTER_SYNC for workloads with many concurrent users executing transactions. See the attached graph, where the "Async" bar reports the performance of a primary using asynchronous replication, the "Semi-sync AFTER_COMMIT" and "Semi-sync AFTER_SYNC" bars report the performance of the primary using the respective rpl_semi_sync_master_wait_point value, and "Semi-sync GROUP_ACK" is a prototype of the work outlined in MDEV-33491.

It can be seen that the AFTER_COMMIT mode of semi-sync performs many times worse than the AFTER_SYNC mode. This is because all connections on the primary share the same condition variable/mutex pair, so whenever an ACK is received from a replica, all waiting connections are woken to check whether the ACK was for them, and that check is done inside a mutual exclusion zone. See the attached Semi-sync Group Ack Proposal PDF for more details.
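To get a rough sense of why the shared condition variable scales poorly, the following back-of-envelope model counts wakeups. It is only a sketch, not taken from the server code: the function names are hypothetical, and it assumes each ACK acknowledges exactly one transaction, in order, while N connections are waiting.

```cpp
#include <cassert>
#include <cstdint>

// Wakeups when every ACK broadcasts to all connections still waiting.
// Simplifying assumption: each ACK acknowledges exactly one transaction.
inline std::uint64_t wakeups_shared_condvar(std::uint64_t waiters) {
    std::uint64_t total = 0;
    for (std::uint64_t remaining = waiters; remaining > 0; --remaining)
        total += remaining;  // this ACK wakes everyone still waiting
    return total;
}

// Wakeups when each ACK signals only the connection it acknowledges.
inline std::uint64_t wakeups_per_connection(std::uint64_t waiters) {
    return waiters;
}

// Usage: with 100 waiters the broadcast scheme performs 5050 wakeups,
// the targeted scheme 100.
inline void example() {
    assert(wakeups_shared_condvar(100) == 5050);
    assert(wakeups_per_connection(100) == 100);
}
```

Under this model the broadcast scheme grows quadratically with the number of waiters, and because each woken connection re-checks its state while holding the shared mutex, those wakeups are serialized rather than run in parallel.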

Instead, each connection should use its own condition variable (e.g. THD::COND_wakeup_ready), and the ACK receiver thread should only signal connections which have been ACKed for wakeup.
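The proposal can be illustrated with a minimal standalone sketch using the C++ standard library. It is not the actual patch: `Waiter`, `AckRegistry`, and the use of plain integer binlog positions are hypothetical stand-ins for the THD and semi-sync structures in the server.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for one connection waiting on a semi-sync ACK.
struct Waiter {
    std::condition_variable cond;  // private to this connection
    bool acked = false;
};

// Hypothetical registry of waiting connections, keyed by binlog position.
class AckRegistry {
public:
    // Called by a committing connection; blocks until `pos` has been ACKed.
    void wait_for_ack(std::uint64_t pos) {
        std::unique_lock<std::mutex> lk(mu_);
        if (pos <= last_acked_) return;  // the ACK already arrived
        Waiter w;
        waiters_.emplace(pos, &w);
        w.cond.wait(lk, [&w] { return w.acked; });
        assert(w.acked);
    }

    // Called by the ACK receiver thread; signals only the connections at or
    // below `acked_pos` and returns how many were signalled.
    std::size_t on_ack(std::uint64_t acked_pos) {
        std::lock_guard<std::mutex> lk(mu_);
        if (acked_pos > last_acked_) last_acked_ = acked_pos;
        auto end = waiters_.upper_bound(acked_pos);
        std::size_t signalled = 0;
        for (auto it = waiters_.begin(); it != end; ++it, ++signalled) {
            it->second->acked = true;
            it->second->cond.notify_one();
        }
        waiters_.erase(waiters_.begin(), end);
        return signalled;
    }

    // Number of connections currently blocked in wait_for_ack().
    std::size_t waiting() {
        std::lock_guard<std::mutex> lk(mu_);
        return waiters_.size();
    }

private:
    std::mutex mu_;
    std::uint64_t last_acked_ = 0;
    std::multimap<std::uint64_t, Waiter*> waiters_;
};

// Usage sketch: three connections wait on positions 1..3; an ACK for
// position 2 signals two of them and leaves the third asleep.
inline std::size_t signalled_by_ack_for_position_2() {
    AckRegistry reg;
    std::vector<std::thread> conns;
    for (std::uint64_t pos = 1; pos <= 3; ++pos)
        conns.emplace_back([&reg, pos] { reg.wait_for_ack(pos); });
    while (reg.waiting() != 3) std::this_thread::yield();  // all registered
    std::size_t first = reg.on_ack(2);
    reg.on_ack(3);  // release the last waiter so the threads can be joined
    for (auto& t : conns) t.join();
    return first;
}
```

In this sketch a single mutex still protects the registry, but a connection whose transaction has not been acknowledged is never woken, so it does not contend for that mutex on every ACK.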

Attachments

semisync_mdev_33551.png	13 kB	2024-02-28 18:11
semisync_mdev33551_patched.png	36 kB	2024-03-21 18:34
semisync_patched.png	13 kB	2024-02-27 13:23
Semi-sync Group Ack Proposal.pdf	1.11 MB	2024-02-27 13:33

Issue Links

causes

MDEV-34122 Assertion `entry' failed in Active_tranx::assert_thd_is_waiter	Closed
MDEV-36359 Crash after disabling semi sync + start a master switchover process	Stalled

duplicates

MDEV-34462 SemiSync replication underperforming and stalling throughput	Closed

relates to

MDEV-33491 Semi-sync Replication Group ACK	Stalled

Activity

Brandon Nesterenko created issue - 2024-02-27 13:23

Brandon Nesterenko made changes - 2024-02-27 13:23

Field	Original Value	New Value
Link		This issue relates to MDEV-33491 [ MDEV-33491 ]

Brandon Nesterenko made changes - 2024-02-27 13:23

Attachment		semisync_patched.png [ 73209 ]

Brandon Nesterenko made changes - 2024-02-27 13:25

Status	Open [ 1 ]	In Progress [ 3 ]

Brandon Nesterenko made changes - 2024-02-27 13:33

Attachment		Semi-sync Group Ack Proposal.pdf [ 73210 ]

Brandon Nesterenko made changes - 2024-02-27 13:34

Description		Appended "See the attached Semi-sync Group Ack Proposal PDF for more details." to the second paragraph; the text is otherwise unchanged.

Brandon Nesterenko made changes - 2024-02-28 18:11

Assignee	Brandon Nesterenko [ JIRAUSER48702 ]	Kristian Nielsen [ knielsen ]
Status	In Progress [ 3 ]	In Review [ 10002 ]

Brandon Nesterenko made changes - 2024-02-28 18:11

Attachment		semisync_mdev_33551.png [ 73216 ]

Brandon Nesterenko made changes - 2024-03-21 18:26

Assignee	Kristian Nielsen [ knielsen ]	Brandon Nesterenko [ JIRAUSER48702 ]

Brandon Nesterenko made changes - 2024-03-21 18:33

Fix Version/s		10.6.18 [ 29627 ]
Fix Version/s	10.6 [ 24028 ]
Fix Version/s	10.11 [ 27614 ]
Fix Version/s	11.0 [ 28320 ]
Fix Version/s	11.1 [ 28549 ]
Fix Version/s	11.3 [ 28565 ]
Fix Version/s	11.2 [ 28603 ]
Resolution		Fixed [ 1 ]
Status	In Review [ 10002 ]	Closed [ 6 ]

JiraAutomate made changes - 2024-03-21 18:33

Fix Version/s		10.11.8 [ 29630 ]
Fix Version/s		11.0.6 [ 29628 ]
Fix Version/s		11.1.5 [ 29629 ]
Fix Version/s		11.2.4 [ 29631 ]

Brandon Nesterenko made changes - 2024-03-21 18:34

Attachment		semisync_mdev33551_patched.png [ 73298 ]

Elena Stepanova made changes - 2024-05-08 20:26

Link		This issue causes MDEV-34122 [ MDEV-34122 ]

Jira Automation (IT) made changes - 2024-07-04 00:27

Zendesk Related Tickets		164270

Kristian Nielsen made changes - 2024-08-13 07:20

Link		This issue duplicates MDEV-34462 [ MDEV-34462 ]

Kristian Nielsen made changes - 2025-03-24 14:34

Link		This issue causes MDEV-36359 [ MDEV-36359 ]

Ralf Gebhardt made changes - 1 week ago

Labels

crash

Ralf Gebhardt made changes - 1 week ago

Labels

crash

People

Assignee: Brandon Nesterenko

Reporter: Brandon Nesterenko

Votes: 0

Watchers: 5

Dates

Created: 2024-02-27 13:23

Updated: 6 days ago 23:08

Resolved: 2024-03-21 18:33
