[MCOL-834] PrimProc thread leak if ExeMgr dies Created: 2017-07-26  Updated: 2017-07-31  Resolved: 2017-07-31

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: None
Fix Version/s: 1.0.10, 1.1.0

Type: Bug Priority: Major
Reporter: Andrew Hutchings (Inactive) Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 1
Labels: None

Sprint: 2017-15

 Description   

If ExeMgr dies while it is processing queries, PrimProc can be left with a number of orphaned BPP threads.

How to reproduce this:

Step 1: Record the current PrimProc thread count:

sudo cat /proc/`pidof PrimProc`/status | grep Threads
 
Threads:	117

Step 2: Run a bunch of concurrent queries
Step 3: Whilst the queries are running:

sudo kill -9 `pidof ExeMgr`

Step 4: Wait until CPU usage for PrimProc settles down to 0, then re-run the thread count command from step 1. If the bug is present, the count will be higher than the first figure (probably around 137). If it is fixed, the count will be similar to the first figure.

Each time ExeMgr is killed, the number of leaked threads increases, which also means RAM is being leaked.
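
The before/after comparison in the steps above could be scripted roughly as follows. This is only a sketch: the helper names thread_count and leaked are hypothetical, and the default tolerance of 5 threads is an assumption rather than a figure from this ticket (the steps only say the fixed count should be "similar" to the first one).

```shell
#!/bin/sh
# Hypothetical helper: print the thread count of the process with the given
# PID, read from the "Threads:" line of /proc/<pid>/status (Linux only).
thread_count() {
    while read -r key value; do
        if [ "$key" = "Threads:" ]; then
            echo "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# Hypothetical helper: succeed if the thread count grew by more than a
# tolerance between two measurements.
# Usage: leaked BEFORE AFTER [TOLERANCE]
# The default tolerance of 5 is an assumption, not a value from the ticket.
leaked() {
    [ $(( $2 - $1 )) -gt "${3:-5}" ]
}
```

For example, before=$(thread_count "$(pidof PrimProc)") could be taken at step 1 and after=$(thread_count "$(pidof PrimProc)") at step 4, followed by leaked "$before" "$after" && echo "possible thread leak". Depending on permissions, the script may need to run under sudo, as the original command does.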



 Comments   
Comment by Andrew Hutchings (Inactive) [ 2017-07-26 ]

Fix tracks the BPP objects for connection threads and frees them if the connection thread dies.

Comment by Andrew Hutchings (Inactive) [ 2017-07-27 ]

The fix has a race condition that we have been able to trigger in about 1 in 5 runs.

Comment by Andrew Hutchings (Inactive) [ 2017-07-27 ]

2 pull requests for the regressions introduced.

Comment by Daniel Lee (Inactive) [ 2017-07-31 ]

Build verified: 1.0.10-1

Comment by Daniel Lee (Inactive) [ 2017-07-31 ]

Build verified: GitHub source 1.1.0

[root@b1pm1 mariadb-columnstore-server]# git show
commit 0831642d6edbf06d1fe44513f6606346d4737673
Merge: e475edf b8532f6
Author: david hill <david.hill@mariadb.com>
Date: Thu Jul 27 14:41:57 2017 -0500

[root@b1pm1 mariadb-columnstore-engine]# git show
commit 1ccb9676a4335af9bc8f6b15dc02e8d6658a8bb0
Merge: 470082d cc1cbaa
Author: David.Hall <david.hall@mariadb.com>
Date: Mon Jul 31 09:07:32 2017 -0500
