MariaDB Server / MDEV-5081

Simple performance improvement for MariaDB

Details

    • Task
    • Status: Closed
    • Minor
    • Resolution: Fixed
    • 5.5.38, 10.0.11
    • None
    • None

    Description

      MariaDB developers:
      Here's a simple performance improvement I found in MariaDB (v5.5.31) while analyzing sysbench on my 4-node system.
      It improves the sysbench oltp test by 3% to 17%, depending on the number of threads specified (and I'm sure there's some noise).

      The patch is attached to this message. It reduces the memory accesses to the "spins" and "rng_state" fields of the my_pthread_fastmutex_t struct.

      typedef struct st_my_pthread_fastmutex_t
      {
        pthread_mutex_t mutex;
        uint spins;
        uint rng_state;
      } my_pthread_fastmutex_t;

      As I'm sure you know, the mutex in that struct is very hot. Since it's accessed by cpus on all nodes, a lot of time is wasted tugging the cacheline back-n-forth between numa nodes.

      I noticed the code is repeatedly accessing the "spins" and "rng_state" fields when looping trying to get the mutex. Since those fields reside in the same cacheline as the mutex, and since their accesses come from all cpus on all numa nodes, they were contributing to making the mutex slower (because they increased the cache-to-cache contention between nodes).
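      To make the layout argument concrete, here is a small sketch that reports which cache line each field falls on. It assumes 64-byte cache lines and a struct that starts on a line boundary; `ASSUMED_LINE_SIZE`, `fastmutex_layout_t`, `line_index` and `print_layout` are illustrative names, not identifiers from the MariaDB source. The size of `pthread_mutex_t` is platform-specific, so the result can differ between systems.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: 64-byte cache lines (common on x86, not guaranteed). */
#define ASSUMED_LINE_SIZE 64u

/* Stand-in with the same field order as the struct quoted above. */
typedef struct {
  pthread_mutex_t mutex;
  unsigned int spins;
  unsigned int rng_state;
} fastmutex_layout_t;

/* Cache line index of a byte offset, counted from the start of the struct. */
static size_t line_index(size_t offset)
{
  return offset / ASSUMED_LINE_SIZE;
}

/* Print the offset and assumed cache line of each field. */
static void print_layout(void)
{
  size_t mutex_off = offsetof(fastmutex_layout_t, mutex);
  size_t spins_off = offsetof(fastmutex_layout_t, spins);
  size_t rng_off = offsetof(fastmutex_layout_t, rng_state);

  printf("mutex     offset %zu, line %zu\n", mutex_off, line_index(mutex_off));
  printf("spins     offset %zu, line %zu\n", spins_off, line_index(spins_off));
  printf("rng_state offset %zu, line %zu\n", rng_off, line_index(rng_off));
}
```

      If all three fields report the same line index, every read or write of "spins" or "rng_state" touches the line that holds the mutex.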

      My change is simply to keep the values for "spins" and "rng_state" in local variables (a register) as long as possible and only update their values in memory when necessary. I didn't change anything in the algorithm.
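      The idea can be sketched roughly as follows. This is not the attached patch: `fastmutex_t`, `fastmutex_lock_sketch`, `spin_delay`, `next_rng`, `BASE_DELAY` and the generator constants are assumptions made for illustration. It only shows the shape of the change: read the two fields once, work on local copies inside the spin loop, and store "rng_state" back a single time.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct quoted above. */
typedef struct {
  pthread_mutex_t mutex;
  unsigned int spins;
  unsigned int rng_state;
} fastmutex_t;

#define BASE_DELAY 4u  /* assumed base back-off, in busy-loop iterations */

/* Busy-wait for roughly n iterations. */
static void spin_delay(unsigned int n)
{
  volatile unsigned int j;
  for (j = 0; j < n; j++) {
  }
}

/* Illustrative multiplicative generator; takes and returns the state by value. */
static unsigned int next_rng(unsigned int state)
{
  return (unsigned int)(((uint64_t)state * 279470273u) % 4294967291u);
}

int fastmutex_lock_sketch(fastmutex_t *mp)
{
  unsigned int spins = mp->spins;    /* single read of the shared field */
  unsigned int rng = mp->rng_state;  /* single read of the shared field */
  unsigned int maxdelay = BASE_DELAY;
  unsigned int i;
  int res;

  for (i = 0; i < spins; i++) {
    res = pthread_mutex_trylock(&mp->mutex);
    if (res != EBUSY) {
      if (i > 0)
        mp->rng_state = rng;  /* write back once, only if it changed */
      return res;             /* 0 on success, otherwise an error code */
    }
    spin_delay(maxdelay);
    rng = next_rng(rng);      /* updates the local copy only */
    maxdelay += (unsigned int)(rng / 2147483647.0 * BASE_DELAY) + 1;
  }

  res = pthread_mutex_lock(&mp->mutex);
  if (spins > 0)
    mp->rng_state = rng;      /* single write after the spin phase */
  return res;
}
```

      In the uncontended case the function touches "spins" and "rng_state" once each and writes nothing back, so the only traffic on the shared line comes from the mutex itself.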

      The rest of this msg shows the improvement in sysbench transaction values for different thread counts.

      Let me know if you have any questions. Since I'm not on the mailing list, please cc me on any reply.

      Joe Mario

      # sysbench --test=oltp --num-threads=12 --max-requests=1000000 --max-time=100 run

                       5.5.31-MariaDB              5.5.31-MariaDB-Modified     Speedup
                       --------------              -----------------------     -------
      Thread cnt: 12
        transactions:  572694 (5726.83 per sec.)   589543 (5895.34 per sec.)    2.94%
        transactions:  564215 (5642.05 per sec.)   582254 (5822.43 per sec.)    3.20%
        transactions:  565231 (5652.21 per sec.)   583228 (5832.19 per sec.)    3.18%

      Thread cnt: 20
        transactions:  507300 (5072.82 per sec.)   580229 (5802.09 per sec.)   14.38%
        transactions:  509373 (5093.60 per sec.)   585629 (5856.09 per sec.)   14.97%
        transactions:  497711 (4976.89 per sec.)   583506 (5834.94 per sec.)   17.24%

      Thread cnt: 30
        transactions:  369979 (3699.66 per sec.)   410698 (4106.74 per sec.)   11.01%
        transactions:  372194 (3721.70 per sec.)   412884 (4128.65 per sec.)   10.93%

      Thread cnt: 40
        transactions:  366285 (3662.60 per sec.)   401050 (4010.23 per sec.)    9.49%
        transactions:  369626 (3696.02 per sec.)   401913 (4018.88 per sec.)    8.74%

      Thread cnt: 50
        transactions:  357529 (3574.99 per sec.)   389759 (3897.25 per sec.)    9.01%
        transactions:  357116 (3570.83 per sec.)   387115 (3870.80 per sec.)    8.40%

      Thread cnt: 60
        transactions:  335427 (3353.88 per sec.)   375134 (3750.91 per sec.)   11.84%
        transactions:  334128 (3340.90 per sec.)   359116 (3590.78 per sec.)    7.48%
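      The speedup figures are consistent with the relative gain in transactions per second, (modified / base - 1) * 100. A minimal helper for checking a row, with `speedup_percent` as an illustrative name:

```c
/* Relative throughput gain of the modified build over the base build,
   in percent. Inputs are transactions per second from the results above. */
static double speedup_percent(double base_tps, double modified_tps)
{
  return (modified_tps / base_tps - 1.0) * 100.0;
}
```

      For the first 12-thread run, speedup_percent(5726.83, 5895.34) comes out at roughly 2.94.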

      I've attached the patch, since it got mangled when I tried to insert it here.

      Joe

      Attachments

        1. base_vs_joechanges.txt
          4 kB
        2. maria_perf.patch
          2 kB
        3. mdev5081.patch
          2 kB
        4. mdev5081.pdf
          16 kB
        5. reply_to_sergey.txt
          5 kB

        Activity

          svoj Sergey Vojtovich added a comment -

          I was able to reproduce the reported problem. Fast mutexes showed the worst throughput compared to other mutex types. Benchmark results are available here:
          http://svoj-db.blogspot.ru/2014/02/mariadb-mutexes-scalability.html

          It looks like this problem has already been raised a few times in MySQL circles:
          http://bugs.mysql.com/bug.php?id=58766
          http://bugs.mysql.com/bug.php?id=38941
          http://dev.mysql.com/worklog/task/?id=4601

          Given the above, fast mutexes are unlikely ever to scale better than normal mutexes. We agreed to disable fast mutexes in our release build configuration.

          svoj Sergey Vojtovich added a comment -

          Sergei, please review the fix for this bug.

          JoeMario Joe Mario added a comment -

          Hi Sergey and Sergei:
          The patches to add the cacheline tugging detection to the perf tool (perf c2c) were recently submitted upstream. See http://lwn.net/Articles/585195/.
          They are still in review with some cleanup to be added, but it's moving forward.

          If I get a chance, I'll take the version of MariaDB that's part of RHEL, run "perf c2c" on it during a sysbench run, and will post the output here so you can see what the tool is showing.

          Joe


          serg Sergei Golubchik added a comment -

          ok to push

          svoj Sergey Vojtovich added a comment -

          Fixed in 5.5.38:

          revno: 4174
          revision-id: svoj@mariadb.org-20140228114602-nyj6i2fejiywnhbx
          parent: monty@mariadb.org-20140503161217-ac6ec1uoq5sdg40o
          committer: Sergey Vojtovich <svoj@mariadb.org>
          branch nick: 5.5-mdev5081
          timestamp: Fri 2014-02-28 15:46:02 +0400
          message:
            MDEV-5081 - Simple performance improvement for MariaDB

            Currently fast mutexes have lower throuput compared to normal mutexes.
            Remove them from release build configuration.

          Joe, thanks for the c2c tool link. I will try to make use of it in further benchmarks.

          People

            svoj Sergey Vojtovich
            JoeMario Joe Mario
            Votes: 2
            Watchers: 12
