[MDEV-14505] Threads_running becomes scalability bottleneck Created: 2017-11-27 Updated: 2019-01-17 Resolved: 2017-12-13 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Server |
| Affects Version/s: | 5.5, 10.0, 10.1, 10.2, 10.3 |
| Fix Version/s: | 10.3.3 |
| Type: | Bug | Priority: | Major |
| Reporter: | Sergey Vojtovich | Assignee: | Sergey Vojtovich |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | performance | ||
| Epic Link: | arm64 optimization | ||||||||
| Description |
|
thread_running is a global shared variable that is updated twice per query in dispatch_command() using an atomic add. It has started to show up in the OLTP RO benchmark. For example, here are the numbers for a 2-socket, 20-core, 40-thread Intel Broadwell system:
If we remove inc_thread_running() and dec_thread_running(), we get slightly better throughput and dispatch_command() goes down in the profiler:
The bottleneck then shifts to the global_query_id counter, which is the subject of another bug. I expect a much higher scalability impact on more powerful hardware. |
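To make the contention pattern concrete, here is a minimal standalone C++ sketch. It is not the server code and not the patch that resolved this issue. The first half mimics what the description reports: every thread bumps one shared atomic counter twice per query. The second half shows one possible way to take the update off the shared cache line by giving each thread its own padded slot and summing only when the value is read. RunningSlot, sum_running(), run_shared(), run_sharded() and the 64-byte cache line size are assumptions made for illustration; only thread_running, inc_thread_running() and dec_thread_running() are names taken from the report.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Pattern from the report: one process-wide counter, touched twice per query
// by every connection thread, so its cache line bounces between cores.
static std::atomic<int32_t> thread_running{0};

inline void inc_thread_running() {
  thread_running.fetch_add(1, std::memory_order_relaxed);
}
inline void dec_thread_running() {
  thread_running.fetch_sub(1, std::memory_order_relaxed);
}

// Hypothetical alternative: one slot per thread, aligned to an assumed
// 64-byte cache line so that writers do not share a line.
struct alignas(64) RunningSlot {
  std::atomic<int32_t> running{0};
};

// The reader pays for the sum; it is only needed when the status value
// is actually requested, not on every query.
inline int32_t sum_running(const std::vector<RunningSlot>& slots) {
  int32_t total = 0;
  for (const auto& slot : slots)
    total += slot.running.load(std::memory_order_relaxed);
  return total;
}

// Simulates `queries` queries on each of `threads` threads using the shared
// counter. Returns the counter value after all threads have finished.
int32_t run_shared(unsigned threads, unsigned queries) {
  std::vector<std::thread> pool;
  for (unsigned t = 0; t < threads; ++t)
    pool.emplace_back([queries] {
      for (unsigned q = 0; q < queries; ++q) {
        inc_thread_running();  // query starts
        dec_thread_running();  // query ends
      }
    });
  for (auto& th : pool) th.join();
  return thread_running.load(std::memory_order_relaxed);
}

// Same workload, but each thread updates only its own slot.
int32_t run_sharded(unsigned threads, unsigned queries) {
  std::vector<RunningSlot> slots(threads);
  std::vector<std::thread> pool;
  for (unsigned t = 0; t < threads; ++t)
    pool.emplace_back([&slots, t, queries] {
      for (unsigned q = 0; q < queries; ++q) {
        slots[t].running.fetch_add(1, std::memory_order_relaxed);
        slots[t].running.fetch_sub(1, std::memory_order_relaxed);
      }
    });
  for (auto& th : pool) th.join();
  return sum_running(slots);
}
```

Both functions return 0 once all simulated queries have completed, since every increment is paired with a decrement. Timing run_shared() against run_sharded() with the same arguments on a multi-socket machine is one way to observe the effect described above. The sketch trades a cheap write path for a more expensive read path, which fits a status counter that is updated far more often than it is read.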
| Comments |
| Comment by Sergey Vojtovich [ 2017-11-27 ] |
|
ssethia, dthompson FYI: I added this under MDEV-14442 because it may have a negative impact on ARM as well. Nevertheless, it is a generic scalability bottleneck, so feel free to move it out of the scope of the arm64 optimisations. |
| Comment by Sergey Vojtovich [ 2017-12-12 ] |
|
The patch was approved by wlad. |