Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-28073

Slow query performance in MariaDB when using many tables

Details

    Description

      Looking for root cause for the following problem:
      A query which comes from a generated report used to take seconds and now takes 21 min (tested in house on MariaDB ES 10.6.5, in MySQL 5.6 it took 9.611s Customer is having the issue in 10.2.27 .)

      The customer found that by setting the optimizer_search_depth=10 the amount of time is similar to original run times. They tested 11 through 15 as well. They noticed `optimizer_search_depth=10` performed identically to `optimizer_search_depth=0` and `optimizer_search_depth=7`. However, `optimizer_search_depth=11` started to show degraded performance. They saw that at `optimizer_search_depth=15`, there was no, discernible difference between that and the default `optimizer_search_depth=62`.

      Below are the steps to reproduce the issue.
      The files are included in CS0379791-March102022.tar.gz in the case.
      1. Provision MariaDB enterprise 10.6.5-2 (customer had issue with 10.2 and 10.4 DB) using server.cnf.txt (modified version of customers xpo_testcase_my.cnf).
      2. Connect to MariaDB instance.
      3. create database discard;
      4. use discard;
      5. source xpo_full_testcase.sql;
      6. source xpo_testcase_explain.sql.

      Optimizer trace with default optimzer_search_depth takes 12 hours and we were unable to capture the data.
      However, setting optimizer_search_depth to 3 allowed the capture data from an optimizer trace see optimizer_output_depth3 .

      Relevant files are attached to case CS0379791 as CS0379791-March102022.tar.gz
      1). optimizer_search_depth_scaling.png - graph of time is takes to run the query when setting optimizer_search_depth = [1-25]
      2). server.cnf.txt - config file used by support when reproducing the issue
      3). xpo_testcase_my.cnf - config file usde by customer to reproduce the issue
      4). xpo_testcase_explain.sql - explain of the query in question
      5). xpo_full_testcase.sql - data dump to reproduce the issue with
      6). optimizer_output_depth3 - optimzer trace with optimizer_search_depth = 3
      7). CS0379791_output/ - flamegraph output including perf output and more
      8). explain_json.txt - Additionally the customer has attached output from explain json from customer

      Attachments

        Issue Links

          Activity

            I am looking at this now.
            What is unique with this query is:

            • 24 different tables
            • 7 using 'ref' access (with one expected row), one using 'ALL' and the 16 using eq_ref

            This means that almost all combinations has the same cost, which means that the optimizer is going trough roughly 24! = 6.2E+23 combinations.

            After discussing this with Sergei Petrunia, we come up with the following ideas to speed up the optimizer stage:

            • When searching for combinations, if we find a set of tables at the end with 'eq_ref', skip back to the first not eq_ref table and continue searching from there.
            • After going trough tables, remember the lowest possible cost for adding the table. When checking if we should cut of the search, add the minimum possible cost for the renaming tables to the current cost when comparing with the current best cost.

            This should cut down the number of combinations notable.

            monty Michael Widenius added a comment - I am looking at this now. What is unique with this query is: 24 different tables 7 using 'ref' access (with one expected row), one using 'ALL' and the 16 using eq_ref This means that almost all combinations has the same cost, which means that the optimizer is going trough roughly 24! = 6.2E+23 combinations. After discussing this with Sergei Petrunia, we come up with the following ideas to speed up the optimizer stage: When searching for combinations, if we find a set of tables at the end with 'eq_ref', skip back to the first not eq_ref table and continue searching from there. After going trough tables, remember the lowest possible cost for adding the table. When checking if we should cut of the search, add the minimum possible cost for the renaming tables to the current cost when comparing with the current best cost. This should cut down the number of combinations notable.

            A comment on the current patch:
            All queries that has a several EQ_REF joins (joins on primary or unique keys which all key parts usable) should be faster. The more EQ_REF joins, the better performance when choosing plan.

            I am now working on several new ideas of how to get down the combinations checked by the optimizer. Some of them will be in 10.6,
            the others will be in 10.9 or 10.10. Will update this ticket with more information in 1-2 weeks

            monty Michael Widenius added a comment - A comment on the current patch: All queries that has a several EQ_REF joins (joins on primary or unique keys which all key parts usable) should be faster. The more EQ_REF joins, the better performance when choosing plan. I am now working on several new ideas of how to get down the combinations checked by the optimizer. Some of them will be in 10.6, the others will be in 10.9 or 10.10. Will update this ticket with more information in 1-2 weeks
            monty Michael Widenius added a comment - - edited

            I am almost done with this task. The current work consist of:

            • Fixed an issue with table order getting scrambled. This allows the greedy_optimizer to find good plans faster in the
              case that smaller tables should be used first (common case).
            • Improved table pruning in optimizer by updating the current key_dependency as part of calculating plans.
              This allows us to prune some plans earlier that we couldn't do before (as we didn't know before if all keys where already usable)
            • Improve pruning in greedy_search by sorting tables during search
              This allows us to find good plans faster, which allows us to skip more bad plans
            • Add eq_ref optimization (something similar as MySQL has but totally different approach). This chains EQ_REF tables (tables that can use an unique key to find a row) when doing optimization, which significantly reduces search combinations. This is optimization is enabled when setting
              optimizer_prune_level=2 (which is now the new default).

            The effect of all this is a significant improvement of the speed of greedy_optimizer when table pruning is enabled.
            I did a quick comparison with MySQL 5.7 (which has eq_ref chaining optimization) by running greedy_search.test on the resulting code:
            Test for optimized builds:
            --optimizer_trace
            Test for optimized builds:
            mysql: main.greedy_optimizer 1956 ms
            mariadb: main.greedy_optimizer prune_level=1 1773 ms
            mariadb: main.greedy_optimizer prune_level=2 1369 ms

            In MariaDB 10.6 default, the above tests takes a VERY long time.

            Of the above work, the first 2 parts of the work are scheduled to be added to MariaDB 10.6 and the last works are scheduled for 10.10

            You can find this work in bb-10.6-monty (until it is pushed to 10.6 and 10.10)

            monty Michael Widenius added a comment - - edited I am almost done with this task. The current work consist of: Fixed an issue with table order getting scrambled. This allows the greedy_optimizer to find good plans faster in the case that smaller tables should be used first (common case). Improved table pruning in optimizer by updating the current key_dependency as part of calculating plans. This allows us to prune some plans earlier that we couldn't do before (as we didn't know before if all keys where already usable) Improve pruning in greedy_search by sorting tables during search This allows us to find good plans faster, which allows us to skip more bad plans Add eq_ref optimization (something similar as MySQL has but totally different approach). This chains EQ_REF tables (tables that can use an unique key to find a row) when doing optimization, which significantly reduces search combinations. This is optimization is enabled when setting optimizer_prune_level=2 (which is now the new default). The effect of all this is a significant improvement of the speed of greedy_optimizer when table pruning is enabled. I did a quick comparison with MySQL 5.7 (which has eq_ref chaining optimization) by running greedy_search.test on the resulting code: Test for optimized builds: --optimizer_trace Test for optimized builds: mysql: main.greedy_optimizer 1956 ms mariadb: main.greedy_optimizer prune_level=1 1773 ms mariadb: main.greedy_optimizer prune_level=2 1369 ms In MariaDB 10.6 default, the above tests takes a VERY long time. Of the above work, the first 2 parts of the work are scheduled to be added to MariaDB 10.6 and the last works are scheduled for 10.10 You can find this work in bb-10.6-monty (until it is pushed to 10.6 and 10.10)

            Commit https://github.com/MariaDB/server/commit/b729896d00e was pushed and released in 10.6.8.

            As far as I understand, it fixes some cases. Other cases will be fixed in 10.10, to avoid confusion they should get a new MDEV number.

            monty, that case that rob.schwyzer@mariadb.com is talking about — was it supposed to be fixed in 10.6 or in 10.10?

            serg Sergei Golubchik added a comment - Commit https://github.com/MariaDB/server/commit/b729896d00e was pushed and released in 10.6.8. As far as I understand, it fixes some cases. Other cases will be fixed in 10.10, to avoid confusion they should get a new MDEV number. monty , that case that rob.schwyzer@mariadb.com is talking about — was it supposed to be fixed in 10.6 or in 10.10?

            The code in bb-10.6-monty# was updated since we spoked last time with the changes that has already gone into 10.6 (which is just some small improvements to greedy_search() as list in my post above). This is why optimizer_prune_level could not be changed.
            The big changes are in bb-10.10-monty, which is in QA testing and will be pushed to 10.10 as soon as QA gives a green light.

            The state of the original test case:
            In current 10.6 (that doesn't have optimizer_prune_level=2)
            source xpo_testcase_explain.sql takes 0.092 second with default search_depth 62 and 0.032 seconds with search_depth 16
            In bb-10.10-monty with default settings (prune_level=2, search_dept 62):
            source xpo_testcase_explain.sql takes 0.026 seconds

            Note that in 10.6, the main reason it is faster than before was the EQ_REF shortcuting that was done in April. This only solves special cases where there are a lot of EQ_REF tables. The code in 10.10 is more general and can solve a wider range of cases.

            monty Michael Widenius added a comment - The code in bb-10.6-monty# was updated since we spoked last time with the changes that has already gone into 10.6 (which is just some small improvements to greedy_search() as list in my post above). This is why optimizer_prune_level could not be changed. The big changes are in bb-10.10-monty, which is in QA testing and will be pushed to 10.10 as soon as QA gives a green light. The state of the original test case: In current 10.6 (that doesn't have optimizer_prune_level=2) source xpo_testcase_explain.sql takes 0.092 second with default search_depth 62 and 0.032 seconds with search_depth 16 In bb-10.10-monty with default settings (prune_level=2, search_dept 62): source xpo_testcase_explain.sql takes 0.026 seconds Note that in 10.6, the main reason it is faster than before was the EQ_REF shortcuting that was done in April. This only solves special cases where there are a lot of EQ_REF tables. The code in 10.10 is more general and can solve a wider range of cases.
            Note

            Some optimizations related to the original test case were implemented in 10.6.8 (10.7.4, 10.8.3), some in 10.6.9 (10.7.5, 10.8.4).
            But the issue was completely resolved only in 10.10.1 (MDEV-28852).

            serg Sergei Golubchik added a comment - Note Some optimizations related to the original test case were implemented in 10.6.8 (10.7.4, 10.8.3), some in 10.6.9 (10.7.5, 10.8.4). But the issue was completely resolved only in 10.10.1 ( MDEV-28852 ).

            People

              monty Michael Widenius
              mpflaum Maria M Pflaum (Inactive)
              Votes:
              1 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.