[MCOL-2089] High CPU usage and slow performance appears when load data with remote mcsimport Created: 2019-01-15 Updated: 2023-10-26 Resolved: 2019-03-22 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | None |
| Affects Version/s: | 1.2.2 |
| Fix Version/s: | 1.2.4 |
| Type: | Bug | Priority: | Major |
| Reporter: | Zdravelina Sokolovska (Inactive) | Assignee: | Jens Röwekamp (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
mcsimport tool run remotely against an MCS single server |
||
| Attachments: |
|
||||||||||||
| Issue Links: |
|
||||||||||||
| Sprint: | 2019-02, 2019-03 | ||||||||||||
| Description |
|
High CPU usage and slow performance appear when loading data with remote mcsimport. Run the autopilot cpimportLineitem test case group with the mcsimport option. All tests passed. How to repeat:
High CPU usage was observed for the entire duration of data loading with mcsimport.
A trace was captured during the loading of the EC test.
|
| Comments |
| Comment by Dipti Joshi (Inactive) [ 2019-01-21 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Please update the "Affected Version" field in the jira item winstone | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Jens Röwekamp (Inactive) [ 2019-02-15 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Made mcsimport multi-threaded. The performance gain is around 25% compared to the single-threaded 1.2.2 implementation of mcsimport. The test suite was successfully executed on Windows 10 against a remote CS 1.2.2-1 instance on CentOS 7. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
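A figure such as "around 25%" can be read either as a reduction in run time or as an increase in throughput, and the two are not the same number. The following is a minimal sketch of both calculations. The timings are hypothetical values chosen only to illustrate the arithmetic and are not measurements from this ticket.

```python
def runtime_reduction(t_old: float, t_new: float) -> float:
    """Fraction of wall-clock time saved: (t_old - t_new) / t_old."""
    return (t_old - t_new) / t_old


def throughput_gain(t_old: float, t_new: float) -> float:
    """Fractional increase in rows per second for the same workload: t_old / t_new - 1."""
    return t_old / t_new - 1.0


# Hypothetical timings in seconds (not taken from the ticket).
t_single, t_multi = 100.0, 80.0
print(f"runtime reduction: {runtime_reduction(t_single, t_multi):.1%}")  # 20.0%
print(f"throughput gain:   {throughput_gain(t_single, t_multi):.1%}")    # 25.0%
```

Stating which of the two definitions is meant makes the numbers from different test runs directly comparable.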
| Comment by Jens Röwekamp (Inactive) [ 2019-02-15 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
For QA:
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Jens Röwekamp (Inactive) [ 2019-02-20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I've extended my tests / profiling to also examine the performance impact of multi-threaded mcsimport on Linux operating systems. The results differ from those for Windows.
First test case with CentOS 7 and Ubuntu 18.04 in a VirtualBox environment
Installed kernels:
VirtualBox tests against ColumnStore 1.2.2-1 on VMs with 8 GiB of memory and 4 cores and 8 threads [host maximum]:
In an over-threaded setup, the single-threaded mcsimport outperforms the multi-threaded one. The exception is Ubuntu 18.04, which seems able to deal with over-threaded setups and shows performance similar to the optimal case with 2 cores and 4 threads. It also shows this behaviour in the over-threaded buildbot sample. The CentOS 7 compiler difference is marginal.
VirtualBox tests against ColumnStore 1.2.2-1 on VMs with 8 GiB of memory and 2 cores and 4 threads:
This seems to be the optimal test case setup for multi-threaded mcsimport. There is one thread for CS and three threads for mcsimport.
VirtualBox tests against ColumnStore 1.2.2-1 on VMs with 8 GiB of memory and 1 core and 2 threads:
Not surprisingly, on an under-threaded machine the single-threaded mcsimport outperforms the multi-threaded one.
Second test case - buildbot execution times of load_test_2
This shows us that the single-threaded mcsimport outperforms the multi-threaded mcsimport on every OS except Ubuntu 18.04 during
Third test case - mcsimport injection from Windows 10
This shows us that there is a performance difference of around 23% based only on the choice of operating system used for ColumnStore.
Fourth test case - mcsapi compiler / optimizer impact
This shows us that the choice of C++ compiler and optimizer option can have an effect of around 5% on performance.
My conclusion:
TL;DR: We can get a 24.5% optimization right away by enabling -O3 for single-threaded mcsimport. We could squeeze out 20% more performance if we use pipelining and figure out why the performance degrades when executing on over-threaded Linux operating systems (except Ubuntu 18.04). We also have to find a solution to minimize the performance degradation when executing on under-threaded operating systems. My suggestion: Merge PR 34 and close PR 33 with the note that over-threaded and under-threaded environments need to be considered better. Then move | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
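The pipelining idea mentioned in the conclusion can be illustrated with a generic two-stage producer/consumer sketch. This is not the mcsimport implementation: `run_pipeline`, `import_thread_budget`, the `write_row` callback, the `|` delimiter and the queue size are all hypothetical names and values chosen for illustration. The thread-budget helper only mirrors the observation that the 4-thread VM left one thread for CS and three for mcsimport.

```python
import os
import queue
import threading

_STOP = object()  # sentinel telling the writer thread to finish


def import_thread_budget(logical_cpus=None, reserved=1):
    """Leave `reserved` logical CPUs for the database server, but keep at least one."""
    n = logical_cpus if logical_cpus is not None else (os.cpu_count() or 1)
    return max(1, n - reserved)


def run_pipeline(lines, write_row, queue_size=1024):
    """Parse delimited lines in the calling thread while a second thread writes rows.

    `write_row` is a hypothetical callback standing in for the real write call.
    Error handling is omitted to keep the sketch short.
    """
    rows = queue.Queue(maxsize=queue_size)  # bounded: the parser blocks if the writer lags

    def writer():
        while True:
            row = rows.get()
            if row is _STOP:
                break
            write_row(row)

    t = threading.Thread(target=writer)
    t.start()
    count = 0
    try:
        for line in lines:
            rows.put(line.rstrip("\n").split("|"))  # assumed pipe-delimited input
            count += 1
    finally:
        rows.put(_STOP)
        t.join()
    return count


# Usage with an in-memory list standing in for the database.
written = []
n = run_pipeline(["1|a\n", "2|b\n"], written.append)
print(n, written)
```

The bounded queue keeps memory use flat when the writer is the slower stage. In CPython the overlap mainly helps when `write_row` blocks on network or disk I/O, since the interpreter lock limits CPU parallelism between threads.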
| Comment by Jens Röwekamp (Inactive) [ 2019-03-08 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The attached logs verify that the multi-threaded implementation of mcsimport has potential but is currently still slower than the single-threaded implementation on some operating systems. Therefore, as indicated above, the single-threaded optimizations will be patched into 1.2.3 and the multi-threaded implementation will be postponed to 1.2.4. It will be documented in | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Zdravelina Sokolovska (Inactive) [ 2019-03-22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
1.2.2
1.2.3
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Zdravelina Sokolovska (Inactive) [ 2019-03-22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The issue is reopened because the test results on 1.2.3 show only a small improvement in mcsimport performance, under 10% relative to the 1.2.2 value. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Andrew Hutchings (Inactive) [ 2019-03-22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
That is all the performance improvements we are going to get out of this ticket. The rest is being tracked in other tickets. |