[MCOL-558] Possible memory leak in ExeMgr Created: 2017-02-09 Updated: 2019-03-06 Resolved: 2019-03-06 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | ExeMgr |
| Affects Version/s: | 1.0.7 |
| Fix Version/s: | 1.2.1 |
| Type: | Bug | Priority: | Major |
| Reporter: | Daniel Lee (Inactive) | Assignee: | Daniel Lee (Inactive) |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None | ||
| Environment: |
2-node combo stack on AWS, using d2.8xlarge instances |
||
| Issue Links: |
|
||||||||
| Sprint: | 2017-15 | ||||||||
| Description |
|
Build tested: 1.0.7-1

This was identified during the 1TB DBT3 run. Specifically, the following query was being tested:

select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
from customer, orders, lineitem
where o_orderkey in (
    select l_orderkey from lineitem group by l_orderkey having sum(l_quantity) > 313
)
and c_custkey = o_custkey
and o_orderkey = l_orderkey
group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
order by o_totalprice desc, o_orderdate
LIMIT 100;

Memory configuration: 244 GB of memory in each server. TotalUMMemory was set to 50% and numBlock (the PrimProc cache) was set to 40%.

After starting ColumnStore, I executed the above query; it was the only query being processed in the entire system. After the query completed successfully, only one of the 36 cores churned at 100% (the rest were at 0%) for 4 minutes while freeing memory; utilization dropped from the peak of 54% to 30% and remained at that level.

Two questions:
1) Why would it take 4 minutes to free 24%, or 58.5 GB, of memory?
2) Why does it remain at 30% utilization?

With 20% of memory utilization still remaining, I executed the same query again; ExeMgr memory utilization went up to 70% and disk swapping kicked in. After swap space was exhausted, ColumnStore restarted itself. |
| Comments |
| Comment by David Thompson (Inactive) [ 2017-06-05 ] |
|
Should review / tackle |
| Comment by Andrew Hutchings (Inactive) [ 2017-07-27 ] |
|
Can you please try your test with 1.1.0 and see what you observe? I suspect this is fixed already. |
| Comment by David Thompson (Inactive) [ 2017-07-31 ] |
|
Moving into sprint for re-test to see if still the same. |
| Comment by Daniel Lee (Inactive) [ 2017-08-02 ] |
|
Build tested: GitHub source 1.1.0 (built on 08/01/2017)

In short, the suspected ExeMgr memory leak still exists. I am not sure how much the recent enhancements have helped ExeMgr memory utilization, if at all.

I looked at the original test case again and found that the memory configuration was too aggressive. The stack used for testing was a combo configuration; with NumBlocksPct set at 40% and TotalUmMemory at 50%, there was little memory left for ExeMgr to store and compute other data. Therefore, before hash join memory reached 50%, total memory was already exhausted and the system self-restarted.

I can reproduce the reported issue with the following setup: ovh 1um2pm stack, 500 GB dbt3 database.

I executed the same query 3 times, one after another. Each time the query failed because memory exceeded TotalUmMemory for the hash join. After each failure, ExeMgr kept processing (CPU utilization from the top command) for about 1 minute while its memory utilization kept decreasing. As each query was processed, ExeMgr memory utilization kept increasing by a few percentage points, as shown below:

before any query execution: 0.01%
right after the 2nd query started: 12%
right after the 3rd query started: 14% |
| Comment by Andrew Hutchings (Inactive) [ 2017-08-02 ] |
|
In 1.1, among other things, string storage usage should be a lot lower and the memory accounting for TotalUmMemory has been improved (mostly for aggregates, though). I suspect the reason memory doesn't reduce straight away is that cancelling the PrimProc retrieval and ExeMgr threads takes time; basically, the query is still partly running. I'd need to get a PMP output to be sure. The interesting part is the remaining memory usage, which partly depends on how the memory was counted. I'll see if I can reproduce this. I don't have the storage space on my main server to do a 500GB test yet; I'll figure out a way of increasing the storage. |
| Comment by Andrew Hutchings (Inactive) [ 2017-08-09 ] |
|
Here is a quick summary of what is happening: when an error occurs, DistributedEngineComm still has a lot of data in its bytestream buffers. This doesn't get cleared until the thread is reused by a new connection, which could take a while; at that point old sessions are purged. The fix would be to purge the buffers on error, but I haven't found a good way of doing this yet. As for the first question in the original description: this is because ExeMgr is looping through vectors of hundreds of thousands of small allocations to free them. We cannot do much about this without some architectural changes. |
| Comment by Daniel Lee (Inactive) [ 2019-03-06 ] |
|
Build verified: 1.2.1-1

I no longer have the ovh6 hardware stack, so I used AWS m5.4xlarge instances instead. Overall memory utilization seems to have improved. Behavior such as the delayed memory release remains the same. Closing the ticket. |