MariaDB Server / MDEV-21574

MyRocks: Range Locking: RCU-based cache for the root node

Details

    Description

      Problem description

      Benchmarks show that Range-Locking mode experiences a lot of contention on the
      locktree's root node mutex.

      This is because every lock acquisition goes as follows:

      • Acquire the locktree's root node mutex.
      • Compare the root node's range with the range we intend to lock.
      • Typically, the ranges do not overlap, so we need to go to the left or right child.
      • Acquire the mutex of the left (or right) child node.
      • Possibly do a tree rotation.
      • Release the root node mutex.

      Then the process repeats for the child node, until we have either found a node whose range overlaps the range we intend to lock, or found the place where the new node should be inserted (see the sketch below).
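
      To make the contention pattern concrete, here is a minimal C++-style sketch of this hand-over-hand descent (the names tree_node, key_range, compare, LESS_THAN and OVERLAPS are illustrative placeholders, not the actual locktree code):

      // Every descent starts by locking the root's mutex and only releases
      // it after the child's mutex is taken, so all threads serialize there.
      tree_node *descend(tree_node *node, const key_range &lookup_key) {
        node->mutex_lock();                 // contended: every thread starts here
        for (;;) {
          int cmp = compare(lookup_key, node->key);
          if (cmp == OVERLAPS)
            return node;                    // found; caller unlocks when done
          tree_node *child = (cmp == LESS_THAN) ? node->left_child
                                                : node->right_child;
          if (child == nullptr)
            return node;                    // insertion point; caller inserts here
          child->mutex_lock();              // lock child before releasing parent
          // ... possibly do a tree rotation here ...
          node->mutex_unlock();
          node = child;
        }
      }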

      Proposed solution

      The idea is that most lock acquisitions do not actually need to modify the
      root node. They could use an RCU mechanism to access it.

      The tree will have a member variable:

      class locktree {
        bool root_rcu_enabled;
        ...
      };
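
      Since root_rcu_enabled is read by fast-path readers without holding any mutex, it presumably needs atomic access; a minimal sketch of the same member with C++11 atomics (this is an assumption; the patch may instead rely on the RCU flavor's own ordering guarantees):

      class locktree {
        // Read on the RCU fast path without the root mutex held, so a
        // plain bool would be a data race; use an atomic instead.
        std::atomic<bool> root_rcu_enabled{true};
        ...
      };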
      

      The fast-path reader does this:

      tree_node *get_child_with_rcu(lookup_key) {
        tree_node *res= nullptr;
        rcu_read_lock();
        if (locktree->root_rcu_enabled) {
          cmp = compare(lookup_key, locktree->root->key);
          if (cmp == LESS_THAN) {
            locktree->root->left_child->mutex_lock();
            res= locktree->root->left_child;
          }
          // if (cmp == GREATER_THAN) { same thing with right_child }
          // if the ranges overlap, fall through: the root itself is needed
        }
        rcu_read_unlock();
        return res;
      }
      

      If this function returns nullptr, we need to examine (and possibly modify) the root node itself, so we retry via the locking slow path:

      tree_node *get_root_node() {

        // Stop new readers from entering the RCU fast path
        root_rcu_enabled= false;

        // Wait until all pre-existing RCU readers are gone
        rcu.synchronize_rcu();

        // Now it is safe to take the root's mutex; no fast-path
        // reader can be touching the root anymore.
        locktree->root->mutex_lock();

        // The caller makes its modifications to the root node. When it is
        // done, it releases the mutex and re-enables the RCU fast path:
        //   locktree->root->mutex_unlock();
        //   root_rcu_enabled= true;
        return locktree->root;
      }
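
      Putting the two paths together, the caller-side flow could look roughly like this (a sketch under the assumptions above; locate_start_node and the control flow are illustrative, and rcu_read_lock()/synchronize_rcu() are assumed to be userspace-RCU-style primitives):

      tree_node *locate_start_node(lookup_key) {
        // Fast path: no root mutex, only an RCU read-side critical section.
        // On success, the returned child's mutex is already held.
        if (tree_node *child= get_child_with_rcu(lookup_key))
          return child;

        // Slow path: the root itself must be examined (and possibly
        // modified) under its mutex; get_root_node() above disables the
        // RCU fast path, waits out the readers and locks the root.
        return get_root_node();
      }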
      

      Other details

      It is not clear if/how this approach could be extended to non-root nodes.

      Attachments

        1. range-lock-counters1.diff
          2 kB
        2. range-lock-counters1-b.diff
          6 kB
        3. rcu4a.diff
          5 kB
        4. rcu4b.diff
          17 kB
        5. results-feb3.tgz
          41 kB
        6. screenshot-1.png
          76 kB
        7. screenshot-2.png
          92 kB
        8. screenshot-3.png
          30 kB
        9. screenshot-4.png
          45 kB


          Activity

            psergei Sergei Petrunia added a comment:

            c5.2xlarge:

            ubuntu@ip-172-31-17-220:~/range-locking-benchmark$ ./summarize-result.sh 105-orig
            SERVER_DIR=mysql-5.6-orig
            SYSBENCH_TEST=oltp_write_only.lua
            Threads,        QPS
            1       17531.27
            5       58527.50
            10      94065.11
            20      101323.16
            40      101602.47
            60      100858.54
            80      100453.83
            100     99408.72
            ubuntu@ip-172-31-17-220:~/range-locking-benchmark$ ./summarize-result.sh 105-range-locking-base 
            SERVER_DIR=mysql-5.6-range-locking
            SYSBENCH_TEST=oltp_write_only.lua
            Threads,        QPS
            1       17777.90
            5       54689.06
            10      76271.56
            20      81926.33
            40      84046.30
            60      83086.25
            80      80931.29
            100     78644.12
            ubuntu@ip-172-31-17-220:~/range-locking-benchmark$ ./summarize-result.sh 104-rcu-rel
            SERVER_DIR=mysql-5.6-range-locking-rcu
            SYSBENCH_TEST=oltp_write_only.lua
            Threads,        QPS
            1       16099.38
            5       33605.91
            10      29440.77
            20      29464.23
            40      28729.21
            60      28841.09
            80      29461.01
            100     30389.78
            

            psergei Sergei Petrunia added a comment (edited):

            https://docs.google.com/spreadsheets/d/1TFxlLx7rALVTuOWX8AqdVuWP9BmdxgBdLmNyZoYtbjI/edit#gid=1271981084

            This shows that the RCU fast path is now used much more often, but the percentage of operations that use it is still too low.

            Questions:

            • Is this a fundamental limitation, or is there a way to increase the number of lock acquisitions/releases that can use the RCU fast path?

            psergei Sergei Petrunia added a comment:

            Could it be that my idea about how often the root node needs to be modified is wrong?
            Added counters to check how often this happens (range-lock-counters1.diff, range-lock-counters1-b.diff) and ran the benchmark.

            Results: https://docs.google.com/spreadsheets/d/1TFxlLx7rALVTuOWX8AqdVuWP9BmdxgBdLmNyZoYtbjI/edit#gid=1704131097

            no surprises so far...

            psergei Sergei Petrunia added a comment (edited):

            Improved patch, run on a c5.9xlarge (two runs with the patch: one without perf counters and one with):

            threads  orig       range-locking  range-locking-rcu  range-locking-rcu-cntr
            1        21916.79   22166.78       21960.01           21347.42
            5        98125.35   90756.97       63205.76           61617.9
            10       165994.26  143740.83      94505.42           93738.68
            20       189590.09  154530.26      114082.09          113270.92
            40       209484.58  145180.66      130328.97          129862.27
            60       213205.54  140677.04      133218.07          132456.41
            80       216881.11  138068.66      132357.3           131860.98
            100      210547.05  137074.94      131686.54          130884.23
            

            It is not as bad as before, but it is still worse than regular range locking.


            psergei Sergei Petrunia added a comment:

            Counters for the new patch:

            https://docs.google.com/spreadsheets/d/1aApVWcYp9CogAUfho9wvH55-h4a0SkdKFAqJFNXQku0/edit#gid=0

            At a high number of threads, about 90% of operations use the RCU read path (and don't acquire the mutex on the root node). With the previous patch, it was 50-70%.

            If an operation (lock acquisition/release) cannot use the RCU read path, it takes the root node's mutex. Then it examines the root node and arrives at one of these outcomes:

            1. It doesn't need to modify the root node. It releases the mutex and moves to the child. Note that this path doesn't prevent others from using the RCU read path.
            2. It needs to modify (e.g. delete) the root node itself (or one of its direct children). This is counted in RCU_DISABLE_ROOT_HIT.
            3. It needs to modify the root node as part of a rebalance operation on a child. This is counted in RCU_DISABLE_DUE_TO_REBALANCE.

            One can see that RCU_DISABLE_DUE_TO_REBALANCE is relatively rare, and RCU_DISABLE_ROOT_HIT is also quite infrequent. A sketch of how these outcomes map onto the slow path is below.
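
            For illustration, the instrumented slow path could be structured like this. Only the counter names RCU_DISABLE_ROOT_HIT and RCU_DISABLE_DUE_TO_REBALANCE come from the patch; slow_path, the helper predicates and the disable_rcu_* calls are hypothetical names for the steps described above:

            tree_node *slow_path(lookup_key) {
              locktree->root->mutex_lock();        // taken without disabling RCU
              if (need_to_modify_root(lookup_key)) {
                // Outcome 2: the root (or a direct child) must be modified.
                counters[RCU_DISABLE_ROOT_HIT]++;
                disable_rcu_and_modify_root();
              } else if (child_rebalance_touches_root(lookup_key)) {
                // Outcome 3: rebalancing a child propagates up to the root.
                counters[RCU_DISABLE_DUE_TO_REBALANCE]++;
                disable_rcu_and_rebalance();
              } else {
                // Outcome 1: no root modification needed; release the mutex
                // and descend. Others may keep using the RCU read path.
                tree_node *child= pick_child(lookup_key);
                child->mutex_lock();
                locktree->root->mutex_unlock();
                return child;
              }
              ...
            }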


            People

              psergei Sergei Petrunia
              psergei Sergei Petrunia
              Votes: 0
              Watchers: 3
