[MDEV-32792] Second Semi-sync Replica Can Hang at Connect Time
Created: 2023-11-13  Updated: 2023-11-13

Status: Open
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0, 11.1, 11.2
Fix Version/s: 10.4, 10.5, 10.6, 10.11, 11.0, 11.1

Type: Bug
Priority: Major
Reporter: Brandon Nesterenko
Assignee: Unassigned
Resolution: Unresolved
Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-32551 "Read semi-sync reply magic number er... Closed

Description

When a second semi-sync replica is added to a primary, the new replica hangs at connect time (without receiving any events) until the primary receives an ACK from the first (existing) replica. The hang can last indefinitely if no new transactions occur on the primary.

This is because the Ack_receiver thread holds m_mutex while it calls listener.listen_on_sockets(). When the new dump connection is initializing, it tries to lock m_mutex in Ack_receiver::add_slave() and hangs, because the Ack_receiver thread still holds the mutex.
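
The contention described above can be illustrated with a small toy model. This is only a sketch in Python, not MariaDB code: the class AckReceiverSketch, the two Event objects, and the demo() helper are hypothetical names introduced for illustration, and a threading.Event wait stands in for the blocking listener.listen_on_sockets() call. The timeout on add_slave() exists only so the demonstration terminates; per the report, the real lock attempt has no such bound.

```python
import threading


class AckReceiverSketch:
    """Toy model of the locking pattern in the report (hypothetical, not server code)."""

    def __init__(self):
        self.m_mutex = threading.Lock()
        self.ack_arrived = threading.Event()  # stand-in for an ACK arriving on a socket
        self.listening = threading.Event()    # set once the receiver is inside the "listen"
        self.slaves = []

    def run_once(self):
        # Pattern from the report: the mutex is held across the blocking listen call.
        with self.m_mutex:
            self.listening.set()
            self.ack_arrived.wait()  # stand-in for listener.listen_on_sockets()

    def add_slave(self, name, timeout):
        # The timeout is an assumption made so the demo can finish.
        if not self.m_mutex.acquire(timeout=timeout):
            return False
        try:
            self.slaves.append(name)
            return True
        finally:
            self.m_mutex.release()


def demo():
    receiver = AckReceiverSketch()
    worker = threading.Thread(target=receiver.run_once)
    worker.start()
    receiver.listening.wait()  # receiver now holds m_mutex and is "listening"

    # The second replica cannot register while no ACK has arrived.
    blocked = not receiver.add_slave("replica2", timeout=0.2)

    receiver.ack_arrived.set()  # the first replica's ACK ends the listen call
    worker.join()               # the receiver has released m_mutex

    added_after_ack = receiver.add_slave("replica2", timeout=0.2)
    return blocked, added_after_ack
```

In this model, demo() reports that the registration attempt fails while the receiver is blocked and succeeds once the simulated ACK arrives, which matches the connect-time behavior described in this report.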


Generated at Thu Feb 08 10:34:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.