[MCOL-1758] TupleJoiner allocates too much RAM Created: 2018-10-01 Updated: 2019-11-27 Resolved: 2019-11-27 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 1.4.1 |
| Type: | Bug | Priority: | Major |
| Reporter: | Andrew Hutchings (Inactive) | Assignee: | Daniel Lee (Inactive) |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None | ||
| Attachments: |
|
| Sprint: | 2019-06 |
| Description |
|
Every TupleJoiner thread pre-allocates 64MB using our STLPoolAllocator class. With our thread pool, which can spawn many threads very quickly, this pre-allocates a lot of memory. In some systems this will easily blow the configured overcommit. In my tests, running 20 simple simultaneous queries hit 32GB of allocation with only about 1% of that actually used. Our STLPoolAllocator class also isn't a pool (it isn't even thread safe, so it can't be). We would likely be better off using boost::pool_allocator or another off-the-shelf singleton pool allocator. |
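A back-of-envelope sketch of the blow-up described above. The 64MB chunk size, 20 queries, and ~32GB total are from this ticket; the per-query thread count is an assumption chosen only to illustrate how quickly a flat per-thread reservation reaches that scale:

```cpp
#include <cassert>
#include <cstddef>

// Numbers from the ticket; kThreadsPerQuery is a hypothetical value for
// illustration (the real count depends on the thread pool configuration).
const std::size_t kChunkBytes      = 64ull << 20; // 64 MiB per TupleJoiner thread
const std::size_t kQueries         = 20;          // simultaneous simple queries
const std::size_t kThreadsPerQuery = 26;          // assumed, not from the ticket

// Virtual memory reserved up front, regardless of how much is actually used.
std::size_t reservedBytes() {
    return kChunkBytes * kQueries * kThreadsPerQuery;
}
```

With these assumed numbers the up-front reservation is ~32 GiB, matching the observed allocation even though only about 1% was touched.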
| Comments |
| Comment by Andrew Hutchings (Inactive) [ 2018-10-01 ] | |||||
|
Attached stack.pdf
Captured using:
| |||||
| Comment by Patrick LeBlanc (Inactive) [ 2019-03-11 ] | |||||
|
Be sure to benchmark whatever solution is chosen. IIRC, in the pool allocator it's using right now, allocation just means moving a pointer, so an off-the-shelf allocator will likely slow things down; how much is the question. Performance is important here since the caller is trying to allocate potentially millions of little things. As an alternative, if the benchmarks aren't great, I'd suggest we have it start with a small window size (maybe 1MB) and grow up to some max (64MB) as more is allocated. FWIW, 64MB was chosen because it is a threshold value in the default Linux allocator (this was before we switched to jemalloc). IIRC, above that value it will bypass most of the allocator's logic and allocate a segment using sbrk(), which was a performance gain for its use cases. Since we're now using jemalloc in CS, we could experiment with reducing that number, or get fancy. | |||||
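The growing-window idea above can be sketched as a simple chunk-size policy. This is a minimal illustration of the suggestion (start small, double up to a cap), not ColumnStore's actual allocator; the names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical chunk-growth policy per the comment's suggestion:
// start at 1 MiB, double on each new chunk, cap at 64 MiB.
struct ChunkPolicy {
    static const std::size_t kStart = 1u << 20;   // 1 MiB initial window
    static const std::size_t kMax   = 64u << 20;  // 64 MiB ceiling
    std::size_t next = kStart;

    // Returns the size of the next chunk to allocate and grows the window.
    std::size_t grab() {
        std::size_t n = next;
        if (next < kMax) next *= 2;
        if (next > kMax) next = kMax;
        return n;
    }
};
```

A joiner that touches little memory then only ever reserves 1 MiB, while a heavy one converges on the old 64 MiB chunk size after a handful of allocations.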
| Comment by Patrick LeBlanc (Inactive) [ 2019-09-13 ] | |||||
|
Went ahead and fixed it; it just needed to not explicitly tell it to allocate 64MB per chunk. By default it allocates 4096 (page size) * sizeof(element), which is much less. Will check for performance issues, then check it in... | |||||
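The arithmetic behind the fix, as a sketch (this just restates the comment's numbers, it is not ColumnStore's code): the default chunk footprint is proportional to the element size rather than a flat 64 MiB.

```cpp
#include <cassert>
#include <cstddef>

// Default per-chunk footprint once the explicit 64 MiB hint is removed:
// 4096 elements * sizeof(element), per the comment above.
template <typename T>
std::size_t defaultChunkBytes() {
    return 4096 * sizeof(T);
}
```

For an 8-byte element that is 32 KiB per chunk, roughly 2000x smaller than the previous flat 64 MiB reservation.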
| Comment by Patrick LeBlanc (Inactive) [ 2019-11-01 ] | |||||
|
This, and the new PM hash table functionality, are in | |||||
| Comment by Daniel Lee (Inactive) [ 2019-11-27 ] | |||||
|
Build tested: 1.2.5-1, 1.4.1-1 for comparison. Build verified: 1.4.1-1. Stack: single node. Executed the query 20 times simultaneously on a 10g dbt3 database, on both 1.2.5-1 and 1.4.1-1. Heap memory utilization info is captured in the attached PDF files. |