[MDEV-14505] Threads_running becomes scalability bottleneck - Jira


thread_running is a global shared variable that is updated twice per query in dispatch_command(), each time with an atomic add.
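The update pattern described above can be sketched roughly as follows. This is a minimal illustration using std::atomic, not the server code: RunningGuard, dispatch_one_query() and run_workers() are hypothetical names, and the query body is left out. It only shows that every query performs two relaxed atomic read-modify-write operations on the same shared location, so all worker threads compete for one cache line.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for the server's global thread_running counter.
static std::atomic<int32_t> thread_running{0};

// Hypothetical RAII guard: one atomic add on entry and one on exit,
// i.e. two updates of the same shared cache line per query.
struct RunningGuard {
  RunningGuard()  { thread_running.fetch_add(1, std::memory_order_relaxed); }
  ~RunningGuard() { thread_running.fetch_sub(1, std::memory_order_relaxed); }
};

// Hypothetical stand-in for dispatch_command(); the query itself is omitted.
inline void dispatch_one_query() {
  RunningGuard guard;
  // ... execute the query here ...
}

// Start `threads` workers, each dispatching `queries` queries, wait for
// all of them, and return the counter value once every worker has finished.
int32_t run_workers(unsigned threads, unsigned queries) {
  std::vector<std::thread> pool;
  for (unsigned t = 0; t < threads; ++t) {
    pool.emplace_back([queries] {
      for (unsigned i = 0; i < queries; ++i) dispatch_one_query();
    });
  }
  for (auto &worker : pool) worker.join();
  return thread_running.load(std::memory_order_relaxed);
}
```

Calling run_workers(40, 1000000) and profiling it should, under these assumptions, show most of the time in the locked add instructions, because the loop body does nothing else. The counter returns to zero after all workers have joined, since every increment is paired with a decrement.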

This counter started to show up in profiles of the OLTP RO benchmark. For example, here are the numbers for a 2-socket, 20-core, 40-thread Intel Broadwell system:

   0,70%  mysqld               [.] dispatch_command

...

       │      inline void thread_safe_increment32(int32 *value)

       │      {

       │        (void) my_atomic_add32_explicit(value, 1, MY_MEMORY_ORDER_RELAXED);

  0,59 │ 680:   lea    thread_running,%rax

 16,07 │        lock   addl   $0x1,(%rax)

If we remove inc_thread_running() and dec_thread_running(), we get slightly better throughput and dispatch_command() drops in the profile:

   0,58%  mysqld               [.] dispatch_command

...

       │      /* increment query_id and return it.  */

       │      inline __attribute__((warn_unused_result)) query_id_t next_query_id()

       │      {

       │        return my_atomic_add64_explicit(&global_query_id, 1, MY_MEMORY_ORDER_RELAXED);

  0,49 │ 290:   mov    $0x1,%edx

 18,02 │        lock   xadd   %rdx,(%r14)

So the bottleneck shifts to the global_query_id counter, which is the subject of a separate bug.

I expect a much larger scalability impact on more powerful hardware.

is blocked by

MDEV-18287 Status threads_running show wrong value since 10.3
