[MDEV-17346] parallel slave start and stop races to workers disappeared Created: 2018-10-02  Updated: 2020-08-25  Resolved: 2018-10-08

Status: Closed
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.1
Fix Version/s: 10.1.37

Type: Bug Priority: Major
Reporter: Andrei Elkin Assignee: Andrei Elkin
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-9573 'Stop slave' hangs on replication slave Closed

 Description   

The bug appears as a slave SQL thread hanging in
rpl_parallel_thread_pool::get_thread() while there are no slave worker
threads left to wake it.

The hang occurs because, during activation of the parallel slave worker
pool, the SQL thread being started could read the worker pool size
concurrently with pool deactivation. The SQL thread read the size without
the protection needed against this race.

One way to fix it is to have the SQL thread, at pool activation, first
grab the same lock that a potential deactivator takes, and only then
access the pool size.
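
The locking pattern described above can be sketched roughly as follows. This
is only an illustration under stated assumptions, not the pushed patch:
WorkerPool, activate(), deactivate() and count_empty_activations() are
hypothetical names, and std::mutex stands in for whatever lock the server
uses. The point it shows is that the activating thread checks the pool size
under the same lock the deactivator holds, so it never acts on a stale size.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a worker pool; this is NOT MariaDB's
// rpl_parallel_thread_pool, only a model of the locking discipline.
class WorkerPool {
public:
  // Deactivator side: drops all workers while holding the pool lock.
  void deactivate() {
    std::lock_guard<std::mutex> guard(lock_);
    workers_ = 0;
  }

  // Activation side: takes the same lock BEFORE reading the size, so the
  // size cannot change between the check and the decision based on it.
  std::size_t activate(std::size_t wanted) {
    assert(wanted > 0);
    std::lock_guard<std::mutex> guard(lock_);
    if (workers_ == 0)
      workers_ = wanted;  // (re)create workers if the pool is empty
    return workers_;
  }

  std::size_t size() {
    std::lock_guard<std::mutex> guard(lock_);
    return workers_;
  }

private:
  std::mutex lock_;
  std::size_t workers_ = 0;
};

// Runs a "stop" thread against a "start" thread and counts how many
// activations came back with an empty pool.
std::size_t count_empty_activations(int rounds) {
  WorkerPool pool;
  std::size_t empty_seen = 0;  // written only by the starter thread

  std::thread stopper([&] {
    for (int i = 0; i < rounds; ++i)
      pool.deactivate();
  });
  std::thread starter([&] {
    for (int i = 0; i < rounds; ++i)
      if (pool.activate(4) == 0)
        ++empty_seen;
  });

  stopper.join();
  starter.join();
  return empty_seen;
}
```

In this sketch an activation never returns with zero workers, because the
size check and the pool refill happen in one critical section. If the size
were read without the lock, a deactivation could slip in between the read
and the use, which is the kind of window the description refers to.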



 Comments   
Comment by Kristian Nielsen [ 2018-10-05 ]

Review done, patch is ok to push:

https://lists.launchpad.net/maria-developers/msg11447.html

Comment by Andrei Elkin [ 2018-10-08 ]

Pushed as f517d8c7425.

Generated at Thu Feb 08 08:35:46 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.