[MCOL-1062] High concurrency can lock up PrimProc Created: 2017-11-30  Updated: 2018-01-22  Resolved: 2018-01-22

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: None
Fix Version/s: 1.1.3

Type: Bug Priority: Major
Reporter: Andrew Hutchings (Inactive) Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Attachments: File create_t1.sql     File t.tbl.gz    
Issue Links:
Relates
relates to MCOL-1128 exemgr becomes non responsive Closed
Sprint: 2018-01, 2018-02

 Description   

On my 32-vcore machine, running 64 parallel queries has a high probability of completely locking up PrimProc.

My theory is that the threads are requesting access to the thread pool for BPP jobs while the thread pool is already maxed out, causing a wait deadlock.
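
The theory above can be illustrated with a small model. This is only a sketch and not the PrimProc thread pool: FixedPool and countStalledQueries are hypothetical names, each "query" submits a single inner job (standing in for a BPP job) to the same pool and waits for it, and a timeout is added purely so the demonstration terminates. Without the timeout, a saturated pool would wait forever.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical fixed-size FIFO pool; a stand-in, not the ColumnStore pool.
class FixedPool {
public:
    explicit FixedPool(int workers) {
        assert(workers > 0);
        for (int i = 0; i < workers; ++i)
            threads_.emplace_back([this] { run(); });
    }
    ~FixedPool() {
        {
            std::lock_guard<std::mutex> lk(m_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& t : threads_) t.join();  // workers drain the queue first
    }
    void submit(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lk(m_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left to run
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    std::vector<std::thread> threads_;
    bool stopping_ = false;
};

// Runs `queries` outer jobs on `workers` threads. Each outer job submits one
// inner job to the same pool and waits for it, giving up after `timeout`.
// Returns how many outer jobs gave up because no worker was free.
int countStalledQueries(int workers, int queries,
                        std::chrono::milliseconds timeout) {
    FixedPool pool(workers);
    std::atomic<int> stalled{0};
    std::promise<void> gate;  // opened once every outer job is queued
    std::shared_future<void> open = gate.get_future().share();
    std::vector<std::future<void>> done;
    for (int q = 0; q < queries; ++q) {
        auto outerDone = std::make_shared<std::promise<void>>();
        done.push_back(outerDone->get_future());
        pool.submit([&pool, &stalled, outerDone, open, timeout] {
            open.wait();
            auto innerDone = std::make_shared<std::promise<void>>();
            std::future<void> f = innerDone->get_future();
            pool.submit([innerDone] { innerDone->set_value(); });
            if (f.wait_for(timeout) != std::future_status::ready) ++stalled;
            outerDone->set_value();
        });
    }
    gate.set_value();
    for (auto& d : done) d.wait();
    return stalled.load();
}
```

With as many outer jobs as workers, every worker is blocked waiting for an inner job that has no thread left to run on, so at least one outer job gives up. With spare workers, the inner jobs run promptly and nothing stalls.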

Test is using:

mysqlslap -c64 --number-of-queries=640 --query="select dt, count(*) from test.t1 group by dt" --user=root --socket=/usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock

Test schema and data will be attached shortly.



 Comments   
Comment by David Hall (Inactive) [ 2018-01-12 ]

Theoretically, no more than 20 (the default) queries should be running simultaneously. This is enforced by statementsRunningCount, an instance of class ActiveStatementCounter (ExeMgr main.cpp). The maximum number of parallel queries is controlled in ExeMgr by:

int getEmExecQueueSize() const
{ return getIntVal(fExeMgrStr, "ExecQueueSize", defaultEMExecQueueSize); }

A value of 0 implies unlimited.
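
As a rough illustration of this kind of limit, the sketch below shows a counter that admits at most a fixed number of statements and treats a limit of 0 as unlimited. StatementLimiter is a hypothetical class written for this note; it is not the ActiveStatementCounter source and may differ from it in detail.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical admission counter; a limit of 0 means "no limit".
class StatementLimiter {
public:
    explicit StatementLimiter(int limit) : limit_(limit) { assert(limit >= 0); }

    // Blocks until a slot is free, then claims it.
    void acquire() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return hasRoom(); });
        ++running_;
    }

    // Claims a slot only if one is free right now.
    bool tryAcquire() {
        std::lock_guard<std::mutex> lk(m_);
        if (!hasRoom()) return false;
        ++running_;
        return true;
    }

    // Gives a slot back and wakes one waiting statement, if any.
    void release() {
        {
            std::lock_guard<std::mutex> lk(m_);
            --running_;
        }
        cv_.notify_one();
    }

private:
    bool hasRoom() const { return limit_ == 0 || running_ < limit_; }

    std::mutex m_;
    std::condition_variable cv_;
    int limit_;
    int running_ = 0;
};
```

A statement would call acquire() before it starts and release() when it finishes, so with a limit of 20 the 21st concurrent statement waits until an earlier one completes.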

Unfortunately, until the fix for MCOL-1128, getEmExecQueueSize() used getUintVal(), which rejects a value of 0 for some unknown design reason.
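
The difference matters because a configured 0 can never reach the caller through the unsigned getter. The sketch below models that effect with two hypothetical helpers, lookupInt and lookupUint, over a plain map. It assumes that "rejects" means "falls back to the default"; the real getIntVal() and getUintVal() may behave differently in detail.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for one section of the configuration file.
using Section = std::map<std::string, std::string>;

// Signed lookup: any stored value, including 0, is returned as is.
int lookupInt(const Section& cfg, const std::string& key, int defVal) {
    auto it = cfg.find(key);
    if (it == cfg.end() || it->second.empty()) return defVal;
    return std::stoi(it->second);
}

// Unsigned lookup modelled on the behaviour described above (assumption):
// a stored 0 is treated as "not set" and replaced by the default.
unsigned lookupUint(const Section& cfg, const std::string& key,
                    unsigned defVal) {
    auto it = cfg.find(key);
    if (it == cfg.end() || it->second.empty()) return defVal;
    unsigned long v = std::stoul(it->second);
    return v == 0 ? defVal : static_cast<unsigned>(v);
}
```

With ExecQueueSize set to 0 and a default of 20, lookupInt returns 0 (unlimited) while lookupUint returns 20, so the "unlimited" setting would be silently ignored.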

This doesn't mean that the problem isn't in the PM. Most queries require multiple threads in the PM.

Comment by David Hall (Inactive) [ 2018-01-12 ]

This bug appears to have been fixed by the fix for MCOL-1128.

Comment by Daniel Lee (Inactive) [ 2018-01-22 ]

Build verified: 1.1.3-1 Github source

/root/columnstore/mariadb-columnstore-server
commit e0ae0d2fecf9941887478d9aa669c8b2d1092090
Merge: 21ec50194e 2490ddf50e
Author: benthompson15 <ben.thompson@mariadb.com>
Date: Fri Jan 19 12:39:05 2018 -0600

Merge pull request #84 from mariadb-corporation/MCOL-1159

MCOL-1159 Merge mariadb-10.2.12

/root/columnstore/mariadb-columnstore-server/mariadb-columnstore-engine
commit c74d5de21d6571c0b0e9a12dacaf77856d332e63
Merge: 201813d6 63adbd0f
Author: benthompson15 <ben.thompson@mariadb.com>
Date: Mon Jan 22 09:42:34 2018 -0600

Merge pull request #375 from mariadb-corporation/dev-1.1-build-fix

Fix missing compiler flag from 1.0 -> 1.1 merge

Average number of seconds to run all queries: 315.900 seconds
Minimum number of seconds to run all queries: 315.900 seconds
Maximum number of seconds to run all queries: 315.900 seconds
Number of clients running queries: 64
Average number of queries per client: 10
