  1. MariaDB Server
  2. MDEV-33515

log_sys.lsn_lock causes excessive context switching

Details

    Description

      steve.shaw@intel.com reports that write-intensive workloads on a NUMA system end up spending a lot of time in native_queued_spin_lock_slowpath.part.0 in the Linux kernel. He has provided a patch that adds a user-space spinlock around the calls to mtr_t::do_write(), which significantly improves throughput at larger numbers of concurrent connections in his test environment.

      As far as I can tell, that patch would only allow one mtr_t::do_write() call to proceed at a time, and thus make waits on log_sys.latch extremely unlikely. But that would also seem to ruin part of what MDEV-27774 achieved.
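      To illustrate the concern about serialization, here is a minimal sketch of what wrapping a call in an outer user-space spinlock does to concurrency. The submitted patch is not reproduced here; spin_mutex, fake_log and max_concurrency_with_outer_spinlock() are hypothetical names, and fake_log::do_write() is only a stand-in for mtr_t::do_write() that records how many threads are inside it at once.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical test-and-set spinlock; not the lock used in the submitted patch.
class spin_mutex
{
  std::atomic<bool> locked{false};
public:
  void lock() noexcept
  {
    // Spin in user space until the flag is observed clear and claimed.
    while (locked.exchange(true, std::memory_order_acquire)) {}
  }
  void unlock() noexcept { locked.store(false, std::memory_order_release); }
};

// Stand-in for the log write path: tracks the peak number of concurrent callers.
struct fake_log
{
  std::atomic<int> inside{0};
  std::atomic<int> max_inside{0};

  void do_write() noexcept
  {
    const int n = inside.fetch_add(1) + 1;
    int m = max_inside.load();
    while (m < n && !max_inside.compare_exchange_weak(m, n)) {}
    inside.fetch_sub(1);
  }
};

// Runs `threads` workers that each call do_write() `iterations` times while
// holding the outer spinlock, and returns the peak concurrency observed.
int max_concurrency_with_outer_spinlock(unsigned threads, unsigned iterations)
{
  spin_mutex outer;
  fake_log log;
  std::vector<std::thread> pool;
  for (unsigned i = 0; i < threads; i++)
    pool.emplace_back([&] {
      for (unsigned j = 0; j < iterations; j++)
      {
        outer.lock();
        log.do_write();
        outer.unlock();
      }
    });
  for (auto &t : pool)
    t.join();
  return log.max_inside.load();
}
```

      With the outer lock in place the function can only ever return 1: no two callers overlap inside do_write(), so any shared latch taken inside it would never be contended by another writer of this kind.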

      If I understood it correctly, the idea would be better implemented at a slightly lower level, to allow maximum concurrency:

      diff --git a/storage/innobase/mtr/mtr0mtr.cc b/storage/innobase/mtr/mtr0mtr.cc
      index b819022fec6..884bb5af5c1 100644
      --- a/storage/innobase/mtr/mtr0mtr.cc
      +++ b/storage/innobase/mtr/mtr0mtr.cc
      @@ -1052,7 +1052,7 @@ std::pair<lsn_t,mtr_t::page_flush_ahead> mtr_t::do_write()
         }
       
         if (!m_latch_ex)
      -    log_sys.latch.rd_lock(SRW_LOCK_CALL);
      +    log_sys.latch.rd_spin_lock();
       
         if (UNIV_UNLIKELY(m_user_space && !m_user_space->max_lsn &&
                           !is_predefined_tablespace(m_user_space->id)))
      

      The to-be-written member function rd_spin_lock() would avoid invoking futex_wait(), and instead keep invoking MY_RELAX_CPU() in a spin loop until the shared latch is granted.
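      A rough sketch of such a spinning shared acquisition follows. It is not the InnoDB latch implementation: spin_rw_latch, relax_cpu() and count_appends() are hypothetical names, the single 32-bit lock word with a writer bit is an assumed layout, and relax_cpu() merely approximates what MY_RELAX_CPU() does on x86 (a PAUSE instruction).

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for MY_RELAX_CPU(); the real macro is platform-specific.
static inline void relax_cpu() noexcept
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __builtin_ia32_pause();
#else
  std::atomic_signal_fence(std::memory_order_seq_cst);
#endif
}

// Assumed layout: bit 31 = exclusive holder, low bits = number of shared holders.
class spin_rw_latch
{
  static constexpr uint32_t WRITER = 1U << 31;
  std::atomic<uint32_t> word{0};
public:
  bool rd_lock_try() noexcept
  {
    uint32_t w = word.load(std::memory_order_relaxed);
    while (!(w & WRITER))
      if (word.compare_exchange_weak(w, w + 1, std::memory_order_acquire,
                                     std::memory_order_relaxed))
        return true;
    return false;
  }
  // Never sleeps in the kernel: retries in user space until granted.
  void rd_spin_lock() noexcept
  {
    while (!rd_lock_try())
      relax_cpu();
  }
  void rd_unlock() noexcept { word.fetch_sub(1, std::memory_order_release); }

  bool wr_lock_try() noexcept
  {
    uint32_t expected = 0;
    return word.compare_exchange_strong(expected, WRITER,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed);
  }
  void wr_spin_lock() noexcept
  {
    while (!wr_lock_try())
      relax_cpu();
  }
  void wr_unlock() noexcept { word.store(0, std::memory_order_release); }
};

// Workers append under the shared latch; the total is then read exclusively.
uint64_t count_appends(unsigned threads, unsigned iterations)
{
  spin_rw_latch latch;
  std::atomic<uint64_t> appended{0};
  std::vector<std::thread> pool;
  for (unsigned i = 0; i < threads; i++)
    pool.emplace_back([&] {
      for (unsigned j = 0; j < iterations; j++)
      {
        latch.rd_spin_lock();
        appended.fetch_add(1, std::memory_order_relaxed);
        latch.rd_unlock();
      }
    });
  for (auto &t : pool)
    t.join();
  latch.wr_spin_lock();
  const uint64_t total = appended.load(std::memory_order_relaxed);
  latch.wr_unlock();
  return total;
}
```

      Note that this sketch lets a steady stream of readers starve an exclusive request, and that pure spinning wastes CPU when threads outnumber cores; both would have to be weighed in the experimentation mentioned below.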

      An exclusive log_sys.latch will be acquired rarely and held for a rather short time: during DDL operations, during undo tablespace truncation, and around log checkpoints.

      Some experimentation will be needed to find something that scales well across the board (from embedded systems to high-end servers).

      Attachments

        1. 1socket.png
          21 kB
        2. 2socket.png
          21 kB
        3. baseline.svg
          656 kB
        4. mariadbtpm.png
          84 kB
        5. spinflag.svg
          437 kB
        6. update_index_256threads_x_10tables_x_1mio_rows.svg
          131 kB

            People

              marko Marko Mäkelä