Semi-sync Wait Point AFTER_COMMIT Slow on Workloads with Heavy Concurrency
When using semi-sync replication with rpl_semi_sync_master_wait_point=AFTER_COMMIT, the performance of the primary can drop to about 10% of what it achieves with AFTER_SYNC for workloads with many concurrent users executing transactions. See the attached graph: the "Async" bar shows the performance of a primary using asynchronous replication, the "Semi-sync AFTER_COMMIT" and "Semi-sync AFTER_SYNC" bars show the performance of the primary with the respective rpl_semi_sync_master_wait_point setting, and the "Semi-sync GROUP_ACK" bar shows a prototype of the work outlined in MDEV-33491.
The graph shows that the AFTER_COMMIT mode of semi-sync performs many times worse than the AFTER_SYNC mode. This is because all connections on the primary share the same condition variable/mutex pair. Any time an ACK is received from a replica, every waiting connection is woken to check whether the ACK was for its own transaction, and that check is done inside a critical section protected by the shared mutex. See the attached Semi-sync Group Ack Proposal PDF for more details.
Instead, each connection should use its own condition variable (e.g. THD::COND_wakeup_ready), and the ACK receiver thread should signal only those connections whose transactions have been ACKed.
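
To illustrate the proposed wakeup scheme, here is a minimal standalone sketch in standard C++. It is not MariaDB server code: Waiter, AckRegistry, wait_for_ack, on_ack, pending and run_demo are hypothetical names invented for this example, a binlog coordinate is simplified to a single integer, and timeouts and replica disconnects are left out. It only shows the idea of one condition variable per waiting connection, with the ACK receiver signalling just the waiters covered by the ACK.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a connection waiting on a semi-sync ACK.
struct Waiter {
    std::condition_variable cond_wakeup_ready;  // one per connection
    bool acked = false;                         // guarded by AckRegistry's mutex
};

// Hypothetical registry of waiting connections, ordered by binlog position.
class AckRegistry {
public:
    // Called by a committing connection: block until `pos` has been ACKed.
    void wait_for_ack(std::uint64_t pos) {
        Waiter self;
        std::unique_lock<std::mutex> lock(mutex_);
        if (pos <= last_acked_) return;  // already covered by an earlier ACK
        waiters_.emplace(pos, &self);
        // Only this connection sleeps on this condition variable.
        self.cond_wakeup_ready.wait(lock, [&] { return self.acked; });
    }

    // Called by the ACK receiver thread. Signals only the connections whose
    // position is <= acked_pos and returns how many were signalled.
    std::size_t on_ack(std::uint64_t acked_pos) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (acked_pos > last_acked_) last_acked_ = acked_pos;
        std::size_t woken = 0;
        auto end = waiters_.upper_bound(acked_pos);
        for (auto it = waiters_.begin(); it != end; ++it) {
            it->second->acked = true;
            it->second->cond_wakeup_ready.notify_one();
            ++woken;
        }
        waiters_.erase(waiters_.begin(), end);  // later waiters stay asleep
        return woken;
    }

    // Number of connections still waiting for an ACK.
    std::size_t pending() {
        std::lock_guard<std::mutex> lock(mutex_);
        return waiters_.size();
    }

private:
    std::mutex mutex_;
    std::multimap<std::uint64_t, Waiter*> waiters_;
    std::uint64_t last_acked_ = 0;
};

// Two connections wait at positions 10 and 20. Returns how many connections
// the first ACK (for position 10) signalled.
std::size_t run_demo() {
    AckRegistry reg;
    std::thread early([&] { reg.wait_for_ack(10); });
    std::thread late([&] { reg.wait_for_ack(20); });
    while (reg.pending() < 2) std::this_thread::yield();  // both registered

    std::size_t first = reg.on_ack(10);  // signals `early` only
    early.join();
    reg.on_ack(20);                      // now signals `late`
    late.join();
    assert(reg.pending() == 0);
    return first;
}
```

In this sketch an ACK for position 10 touches only the waiter registered at position 10; the connection waiting on position 20 is never woken to re-check under the mutex. The registry mutex is still shared, but it is held only briefly to register a waiter or to walk the ACKed range, rather than being re-acquired by every waiting connection on every ACK.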