[MDEV-28114] Semi-sync Master ACK Receiver Thread Can Error on COM_QUIT Created: 2022-03-17  Updated: 2023-10-23  Resolved: 2022-04-22

Status: Closed
Project: MariaDB Server
Component/s: Replication
Fix Version/s: 10.4.25, 10.5.16, 10.6.8, 10.7.4, 10.8.3

Type: Task Priority: Major
Reporter: Brandon Nesterenko Assignee: Brandon Nesterenko
Resolution: Fixed Votes: 0
Labels: None

Attachments: PNG File semisync_shutdown_crash.png    
Issue Links:
Relates
relates to MDEV-11853 semisync thread can be killed after s... Closed
relates to MDEV-32551 "Read semi-sync reply magic number er... Closed

 Description   

A semi-sync master can sometimes error when it is issued SHUTDOWN WAIT FOR ALL SLAVES while a semi-sync slave is stopping its IO thread. If the slave's repl_semisync_slave::slave_stop() executes as or after the master stops listening for connections, the slave's semi-sync connection can stay active. The slave then calls mysql_close() on that connection, which issues COM_QUIT on an active semi-sync connection. The master's ACK receiver thread reads this packet and fails with "[ERROR] Read semi-sync reply magic number error". See the attached image for a visualization of the issue.
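To illustrate why a COM_QUIT packet trips the magic number check described above, here is a minimal sketch in Python. It is not the server code and not the actual fix. It assumes the semi-sync reply layout is a 1-byte magic number (0xEF), an 8-byte little-endian binlog position, and then the binlog file name, and that COM_QUIT is the single command byte 0x01. The function name classify_ack_payload and its return values are hypothetical.

```python
import struct

# Assumed values: 0xEF as the semi-sync reply magic number and 0x01 as the
# COM_QUIT command byte of the client protocol.
SEMISYNC_REPLY_MAGIC = 0xEF
COM_QUIT = 0x01


def classify_ack_payload(payload: bytes):
    """Hypothetical helper: classify one packet payload seen by an ACK receiver."""
    if not payload:
        return ("error", "empty packet")
    first = payload[0]
    if first == SEMISYNC_REPLY_MAGIC:
        # Assumed layout: magic (1 byte), binlog position (8 bytes), file name.
        if len(payload) < 9:
            return ("error", "reply too short")
        (log_pos,) = struct.unpack_from("<Q", payload, 1)
        log_file = payload[9:].decode("ascii", errors="replace")
        return ("ack", (log_file, log_pos))
    if len(payload) == 1 and first == COM_QUIT:
        # A slave closing its connection with mysql_close() sends this packet.
        return ("quit", None)
    # Anything else does not start with the magic number.
    return ("error", "Read semi-sync reply magic number error")
```

A receiver that only checks the first byte against the magic number would land in the last branch for a COM_QUIT packet, which matches the error reported here. Treating COM_QUIT as a separate, non-error case is shown only as one possible way to tell the two situations apart.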



 Comments   
Comment by Brandon Nesterenko [ 2022-04-22 ]

Fixed as a part of MDEV-11853 work.

Comment by Andrei Elkin [ 2023-01-17 ]

The bug is actually fixed starting from 10.4.25. It is a continuation of the MDEV-18450 and MDEV-11853 series of fixes.

Generated at Thu Feb 08 09:58:10 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.