Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
5.5(EOL), 10.0(EOL), 10.1(EOL), 10.2(EOL), 10.3(EOL)
Description
thread_running is a global shared variable that is updated twice per query in dispatch_command() using an atomic add.
It started to show up in the OLTP RO benchmark. For example, here are the numbers for a 2-socket/20-core/40-thread Intel Broadwell system:
  0,70%  mysqld  [.] dispatch_command
...
       │    inline void thread_safe_increment32(int32 *value)
       │    {
       │      (void) my_atomic_add32_explicit(value, 1, MY_MEMORY_ORDER_RELAXED);
  0,59 │680:  lea    thread_running,%rax
 16,07 │      lock   addl   $0x1,(%rax)
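The pattern behind this profile can be sketched in standalone C++. This is only an illustration, not the server code: `running_queries`, `run_query()` and `run_workers()` are hypothetical stand-ins for thread_running, dispatch_command() and the benchmark clients. Every worker performs two relaxed atomic read-modify-write operations per query on the same cache line, which on x86 compile to lock-prefixed instructions like the one shown in the annotation.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for the global thread_running counter.
static std::atomic<int32_t> running_queries{0};

// Hypothetical stand-in for dispatch_command(): two atomic updates per query,
// both hitting the same shared cache line.
inline void run_query() {
  running_queries.fetch_add(1, std::memory_order_relaxed);
  // ... the actual query work would happen here ...
  running_queries.fetch_sub(1, std::memory_order_relaxed);
}

// Starts `threads` workers that each issue `queries` queries and returns the
// counter value once all workers have finished.
inline int32_t run_workers(int threads, int queries) {
  assert(threads > 0 && queries >= 0);
  std::vector<std::thread> pool;
  for (int t = 0; t < threads; ++t) {
    pool.emplace_back([queries] {
      for (int i = 0; i < queries; ++i) run_query();
    });
  }
  for (auto &worker : pool) worker.join();
  return running_queries.load(std::memory_order_relaxed);
}
```

Relaxed ordering removes ordering constraints but not the cache-line ownership transfer between cores, so the cost grows with the number of threads updating the counter.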
If we remove inc_thread_running() and dec_thread_running(), we get slightly better throughput and dispatch_command() drops in the profile:
  0,58%  mysqld  [.] dispatch_command
...
       │    /* increment query_id and return it. */
       │    inline __attribute__((warn_unused_result)) query_id_t next_query_id()
       │    {
       │      return my_atomic_add64_explicit(&global_query_id, 1, MY_MEMORY_ORDER_RELAXED);
  0,49 │290:  mov    $0x1,%edx
 18,02 │      lock   xadd   %rdx,(%r14)
So the bottleneck shifts to the global_query_id counter, which is the subject of another bug.
I expect a much larger scalability impact on more powerful hardware.
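One common way to take such a counter off the hot path is to give each thread its own cache-line-sized slot and sum the slots only when the value is read. The sketch below illustrates that general technique under stated assumptions; it is not the change applied for this issue. `Slot`, `ShardedCounter` and `run_sharded()` are hypothetical names, a 64-byte cache line is assumed, and C++17 is assumed so that the over-aligned slots are allocated correctly.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// One counter per slot, padded to an assumed 64-byte cache line so that
// updates from different threads do not share a line.
struct alignas(64) Slot {
  std::atomic<int64_t> value{0};
};

// Hypothetical sharded counter: writers touch only their own slot,
// readers pay the cost of summing all slots.
class ShardedCounter {
 public:
  explicit ShardedCounter(std::size_t slots) : slots_(slots) {
    assert(slots > 0);
  }

  void add(std::size_t slot, int64_t delta) {
    slots_[slot % slots_.size()].value.fetch_add(delta,
                                                 std::memory_order_relaxed);
  }

  // Returns a snapshot; it may be slightly stale while writers are active.
  int64_t read() const {
    int64_t sum = 0;
    for (const Slot &s : slots_) sum += s.value.load(std::memory_order_relaxed);
    return sum;
  }

 private:
  std::vector<Slot> slots_;
};

// Each worker increments and decrements its own slot once per "query".
inline int64_t run_sharded(int threads, int queries) {
  ShardedCounter counter(static_cast<std::size_t>(threads));
  std::vector<std::thread> pool;
  for (int t = 0; t < threads; ++t) {
    pool.emplace_back([&counter, t, queries] {
      const std::size_t slot = static_cast<std::size_t>(t);
      for (int i = 0; i < queries; ++i) {
        counter.add(slot, 1);
        counter.add(slot, -1);
      }
    });
  }
  for (auto &worker : pool) worker.join();
  return counter.read();
}
```

This trade-off fits a value that is written on every query but read rarely. It does not carry over directly to global_query_id, where every caller needs a unique value from each increment.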
Issue Links
- is blocked by: MDEV-18287 Status threads_running show wrong value since 10.3 (Closed)