MariaDB Server / MDEV-27889

Make table cache mutex contention threshold configurable


    Description

      Currently there is a check for table cache mutex contention that considers an instance contended if more than 20,000 mutex acquisitions cannot be serviced immediately before 80,000 are serviced immediately. The comments in the code indicate that these values assume an estimated maximum of 100K queries per second. They are hard-coded based on a 2-socket / 20-core / 40-thread Intel Broadwell system, with a comment noting that the numbers may need to be adjusted for other systems. Broadwell was released in 2015, so these numbers are likely very dated.
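      For illustration, here is a minimal self-contained model of the check described above. The real logic lives in the server's table cache code (sql/table_cache.cc); the structure and names below are simplified assumptions, not the actual MariaDB implementation.

        #include <cstdint>
        #include <mutex>

        // Thresholds from the description: an instance is considered contended
        // once 20000 try-lock failures accumulate before 80000 successes do,
        // i.e. when more than 20000 / (20000 + 80000) = 20% of acquisitions
        // block, under the assumed ~100K queries-per-second maximum.
        static const uint32_t MUTEX_WAITS_THRESHOLD= 20000;
        static const uint32_t MUTEX_NOWAITS_THRESHOLD= 80000;

        struct Cache_instance_model
        {
          std::mutex lock;
          uint32_t waits= 0;    // acquisitions that had to block
          uint32_t nowaits= 0;  // acquisitions serviced immediately

          // Returns true when contention crosses the threshold; in the server
          // this is the point where one more table cache instance would be
          // activated.
          bool lock_and_check_contention()
          {
            bool contended= false;
            if (!lock.try_lock())
            {
              lock.lock();                  // fall back to a blocking acquire
              if (++waits == MUTEX_WAITS_THRESHOLD)
              {
                contended= true;            // 20% of the sampling window blocked
                waits= nowaits= 0;          // restart the sampling window
              }
            }
            else if (++nowaits == MUTEX_NOWAITS_THRESHOLD)
              waits= nowaits= 0;            // mostly uncontended: reset counters
            return contended;
          }

          void unlock() { lock.unlock(); }
        };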

      In our scalability testing we found that, with the table cache mutex contention threshold at 20%, it sometimes took a couple of test runs to reach a steady state; until then, performance was lower. This means people running benchmarks may see an artificially low number on initial tests compared with later runs. When we set this hard-coded value to 10K misses per 90K hits, our systems ramped up to the configured number of table_open_cache_instances more quickly, removing this bottleneck from the first test.

      We suggest making this value configurable so it can be adjusted for the system being used.
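      To make the proposal concrete, one possible shape for such a knob is a single percentage from which both counters are derived. The variable name and wiring below are hypothetical, not an existing MariaDB option:

        #include <cstdint>

        // Hypothetical tunable: the contention percentage at which an extra
        // table cache instance gets activated. Default matches today's 20%.
        static uint32_t table_cache_contention_pct= 20;

        // Keep the existing 100K-acquisition sampling window from the code
        // comments and derive both thresholds from the percentage.
        static const uint32_t SAMPLE_WINDOW= 100000;

        static inline uint32_t waits_threshold()
        { return SAMPLE_WINDOW * table_cache_contention_pct / 100; }

        static inline uint32_t nowaits_threshold()
        { return SAMPLE_WINDOW - waits_threshold(); }

        // With table_cache_contention_pct= 10 this yields the 10000 / 90000
        // split that ramped up faster in our testing.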


        Activity

          danblack, technically there are no hardware assumptions in the original approach. I'd say a time-based solution is going to be even more tied to hardware.

          What the comment says about Broadwell is more like "it was tested with" rather than "it was optimised for".

          Could it be that the algorithm turned out to be not sensitive enough for the HammerDB load, rather than for specific hardware?

          To sum up: I have no good answer without deeper analysis.

          svoj Sergey Vojtovich added a comment

          Table cache auto-adjustment to the load was implemented precisely to avoid adding a variable for the number of table cache instances. A variable is good for benchmarks, but it doesn't adapt to real-world use cases and makes MariaDB even more difficult to configure.

          Adding a new variable to manually control the automatic behavior goes directly against this ease-of-use concept; if we'd wanted a new variable, we'd have added a variable to specify the number of table cache instances.

          16 minutes to adjust could be a lot. What are other warm-up times that you see? How long does it take to load filesystem caches? The InnoDB buffer pool? How big is it, and what's your dataset size? How many tables?

          Anecdotal evidence suggests it can take up to a few hours to warm up a big InnoDB buffer pool.

          serg Sergei Golubchik added a comment

          So what is the guidance here? It sounds like a resounding "no" to this in concept. I can certainly test out the updated patch. I think allowing this adjustment doesn't remove the auto-adjustment, right? It only changes how quickly it adjusts.

          Sergei, I am confused by this statement: "if we'd wanted a new variable, we'd have added a variable to specify the number of table cache instances."
          There is such a variable: table_open_cache_instances

          jeepeterson Joseph Peterson added a comment

          Daniel, the patch you provided to use time instead of a counter showed a 2% performance boost with HammerDB. It appears to ramp up quickly even without changing the threshold to 10%. I suspect the default 10s timer allows it to ramp up quicker than waiting for 20K misses out of 100K acquisitions does.

          jeepeterson Joseph Peterson added a comment
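          For context, a minimal model of a time-windowed check of the kind described above. The actual patch is not shown in this issue, so the structure below is an assumption; only the 10-second window and the 20% ratio come from the thread.

            #include <chrono>
            #include <cstdint>
            #include <mutex>

            // Time-based variant: instead of counting to fixed 20000/80000
            // totals, evaluate the contention ratio once per fixed interval.
            // Under light load this re-evaluates much sooner than waiting for
            // 100K acquisitions to accumulate.
            struct Timed_contention_model
            {
              std::mutex lock;
              uint32_t waits= 0, nowaits= 0;
              std::chrono::steady_clock::time_point window_start=
                  std::chrono::steady_clock::now();

              bool lock_and_check_contention()
              {
                bool contended= false;
                if (!lock.try_lock())
                {
                  lock.lock();
                  ++waits;
                }
                else
                  ++nowaits;

                auto now= std::chrono::steady_clock::now();
                if (now - window_start >= std::chrono::seconds(10)) // assumed 10s window
                {
                  // Contended if more than 20% of acquisitions blocked.
                  contended= waits * 5 > waits + nowaits;
                  waits= nowaits= 0;
                  window_start= now;
                }
                return contended;
              }

              void unlock() { lock.unlock(); }
            };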

          table_open_cache_instances controls the max number of instances. I meant that we'd need another variable to specify the current (or initial) number of cache instances.

          I don't think it's a resounding "no". Ideally, I believe, all data structures should warm up in about the same time. That's why I asked what you are seeing regarding filesystem caches and the InnoDB buffer pool.

          If the table cache is an outlier, that is, if it adjusts too slowly, we definitely should fix it.

          serg Sergei Golubchik added a comment

          People

            Assignee: Unassigned
            Reporter: jeepeterson Joseph Peterson
            Votes: 0
            Watchers: 5

