Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-35524

Optimizer search depth default triggers performance issues.

Details

    Description

      New description

      A query with join doing many eq_ref accesses is slower in 10.5 and 10.6 than it was in 10.4.

      The slowness comes from join optimizer considering very large number of very similar query plans (basically, different permutations of eq_ref(t1), eq_ref(t2),eq_ref(t3),...).

      Setting low optimizer_search_depth is a workaround, but it's still a workaround.

      MariaDB 10.10 works fast, much faster than even 10.4. In 10.10, the speed was improved by fix for MDEV-28073. pruned_by_hanging_leaf optimization makes the join optimizer to not consider permutations of eq_ref accesses.

      ANALYZE for tables

      Run this after loading the dataset so that behavior is deterministic:

      analyze table analyserflag           ;
      analyze table analyseridentifier     ;
      analyze table analysiscalculformula  ;
      analyze table analysisdetail         ;
      analyze table analysisresultsetting  ;
      analyze table analysissettingversion ;
      analyze table assembly               ;
      analyze table assemblydetail         ;
      analyze table assemblylocation       ;
      analyze table assemblyorder          ;
      analyze table assemblytitle          ;
      analyze table assemblytitleset       ;
      analyze table operator               ;
      analyze table polarity               ;
      analyze table resultproducer         ;
      analyze table sample                 ;
      analyze table sampletype             ;
      analyze table subunit                ;
      analyze table test                   ;
      analyze table testanalyserflag       ;
      analyze table useraccount            ;
      

      Original description

      (note: sounds odd, we've had optimizer_search_depth=62 forever and didn't change that)

      The change to default optimizer_search_depth=62 results in a performance issue in all tested 10.6 versions, affecting users upgrading from 10.4. Setting optimizer_search_depth=0 solves the problem in 10.6 environments, but in 10.11 and 11.4 environments the problem is not present at all, despite the default optimizer_search_depth=62.

      The issue happens when the search tree is larger than expected for possible execution paths. This is not too rare in shops with well-normalized data. This particular reproduction involves 24 joins on 21 tables.

      Attachments

        Activity

          psergei Sergei Petrunia added a comment - - edited

          ...
          Yes, and the cause seems to be the *pruned_by_hanging_leaf* optimization from here:

          commit b729896d00e022f6205399376c0cc107e1ee0704
          Author: Monty <monty@mariadb.org>
          Date:   Tue May 10 11:47:20 2022 +0300
           
              MDEV-28073 Query performance degradation in newer MariaDB versions when using many tables
              
              The issue was that best_extension_by_limited_search() had to go through
              too many plans with the same cost as there where many EQ_REF tables.
              
              Fixed by shortcutting EQ_REF (AND REF) when the result only contains one
              row. This got the optimization time down from hours to sub seconds.
              
              The only known downside with this patch is that in some cases a table
              with ref and 1 record may be used before on EQ_REF table. The faster
              optimzation phase should compensate for this.
          

          pruned_by_hanging_leaf is enabled unconditionally.

          psergei Sergei Petrunia added a comment - - edited ... Yes, and the cause seems to be the * pruned_by_hanging_leaf * optimization from here: commit b729896d00e022f6205399376c0cc107e1ee0704 Author: Monty <monty@mariadb.org> Date: Tue May 10 11:47:20 2022 +0300   MDEV-28073 Query performance degradation in newer MariaDB versions when using many tables The issue was that best_extension_by_limited_search() had to go through too many plans with the same cost as there where many EQ_REF tables. Fixed by shortcutting EQ_REF (AND REF) when the result only contains one row. This got the optimization time down from hours to sub seconds. The only known downside with this patch is that in some cases a table with ref and 1 record may be used before on EQ_REF table. The faster optimzation phase should compensate for this. pruned_by_hanging_leaf is enabled unconditionally.
          Gosselin Dave Gosselin added a comment -

          psergei here's an idea I had for toggling pruned_by_hanging_leaf functionality with a session var, please let me know what you think. (Obviously this is a WIP and not fully tested, etc. It restores the query time of this particular case):

          diff --git a/sql/sql_class.h b/sql/sql_class.h
          index a347700f72f..66af438bf74 100644
          --- a/sql/sql_class.h
          +++ b/sql/sql_class.h
          @@ -900,6 +900,7 @@ typedef struct system_variables
           
             vers_asof_timestamp_t vers_asof_timestamp;
             ulong vers_alter_history;
          +  my_bool prune_by_hanging_leaf;
           } SV;
           
           /**
          diff --git a/sql/sql_select.cc b/sql/sql_select.cc
          index 379cec84b1c..59a0bfbfe7b 100644
          --- a/sql/sql_select.cc
          +++ b/sql/sql_select.cc
          @@ -10645,7 +10645,8 @@ best_extension_by_limited_search(JOIN      *join,
                       check_if_edge_table(join->positions+ idx,
                                           pushdown_cond_selectivity) !=
                       SEARCH_FOUND_EDGE)
          -          best_res= SEARCH_OK;
          +          best_res= join->thd->variables.prune_by_hanging_leaf ?
          +              SEARCH_OK : SEARCH_FOUND_EDGE;
                 }
                 else
                 {
          @@ -10684,7 +10685,8 @@ best_extension_by_limited_search(JOIN      *join,
                 restore_prev_sj_state(remaining_tables, s, idx);
                 if (best_res == SEARCH_FOUND_EDGE)
                 {
          -        trace_one_table.add("pruned_by_hanging_leaf", true);
          +        trace_one_table.add("pruned_by_hanging_leaf",
          +            join->thd->variables.prune_by_hanging_leaf);
                   goto end;
                 }
               }
          diff --git a/sql/sys_vars.cc b/sql/sys_vars.cc
          index 42ec52b38a9..a43317cb3e9 100644
          --- a/sql/sys_vars.cc
          +++ b/sql/sys_vars.cc
          @@ -6879,3 +6879,11 @@ static Sys_var_ulonglong Sys_max_rowid_filter_size(
                  SESSION_VAR(max_rowid_filter_size), CMD_LINE(REQUIRED_ARG),
                  VALID_RANGE(1024, (ulonglong)~(intptr)0), DEFAULT(128*1024),
                  BLOCK_SIZE(1));
          +static Sys_var_mybool Sys_prune_by_hanging_leaf(
          +      "prune_by_hanging_leaf",
          +      "Enabled by default.  When disabled, limits the recursion on "
          +      "partial query plans when refining edge tables by taking the first "
          +      "satisfactory edge and not refining it further",
          +      SESSION_VAR(prune_by_hanging_leaf),
          +      NO_CMD_LINE,
          +      DEFAULT(TRUE));
          

          Gosselin Dave Gosselin added a comment - psergei here's an idea I had for toggling pruned_by_hanging_leaf functionality with a session var, please let me know what you think. (Obviously this is a WIP and not fully tested, etc. It restores the query time of this particular case): diff --git a/sql/sql_class.h b/sql/sql_class.h index a347700f72f..66af438bf74 100644 --- a/sql/sql_class.h +++ b/sql/sql_class.h @@ -900,6 +900,7 @@ typedef struct system_variables   vers_asof_timestamp_t vers_asof_timestamp; ulong vers_alter_history; + my_bool prune_by_hanging_leaf; } SV;   /** diff --git a/sql/sql_select.cc b/sql/sql_select.cc index 379cec84b1c..59a0bfbfe7b 100644 --- a/sql/sql_select.cc +++ b/sql/sql_select.cc @@ -10645,7 +10645,8 @@ best_extension_by_limited_search(JOIN *join, check_if_edge_table(join->positions+ idx, pushdown_cond_selectivity) != SEARCH_FOUND_EDGE) - best_res= SEARCH_OK; + best_res= join->thd->variables.prune_by_hanging_leaf ? + SEARCH_OK : SEARCH_FOUND_EDGE; } else { @@ -10684,7 +10685,8 @@ best_extension_by_limited_search(JOIN *join, restore_prev_sj_state(remaining_tables, s, idx); if (best_res == SEARCH_FOUND_EDGE) { - trace_one_table.add("pruned_by_hanging_leaf", true); + trace_one_table.add("pruned_by_hanging_leaf", + join->thd->variables.prune_by_hanging_leaf); goto end; } } diff --git a/sql/sys_vars.cc b/sql/sys_vars.cc index 42ec52b38a9..a43317cb3e9 100644 --- a/sql/sys_vars.cc +++ b/sql/sys_vars.cc @@ -6879,3 +6879,11 @@ static Sys_var_ulonglong Sys_max_rowid_filter_size( SESSION_VAR(max_rowid_filter_size), CMD_LINE(REQUIRED_ARG), VALID_RANGE(1024, (ulonglong)~(intptr)0), DEFAULT(128*1024), BLOCK_SIZE(1)); +static Sys_var_mybool Sys_prune_by_hanging_leaf( + "prune_by_hanging_leaf", + "Enabled by default. When disabled, limits the recursion on " + "partial query plans when refining edge tables by taking the first " + "satisfactory edge and not refining it further", + SESSION_VAR(prune_by_hanging_leaf), + NO_CMD_LINE, + DEFAULT(TRUE));
          psergei Sergei Petrunia added a comment - - edited

          Backported Optimizer_join_prefixes_check_calls from 10.11 into 10.6, 10.5-before-slowdown and 10.5-after-slowdown.

          10.5-after-slowdown

          1 min 51.913 sec
          +-------------------------------------+----------+
          | Variable_name                       | Value    |
          +-------------------------------------+----------+
          | Optimizer_join_prefixes_check_calls | 56 331 728 |
          +-------------------------------------+----------+
          

          10.5-before-slowdown

          8.509 sec
          MariaDB [test]> show status like '%prefix%';
          +-------------------------------------+---------+
          | Variable_name                       | Value   |
          +-------------------------------------+---------+
          | Optimizer_join_prefixes_check_calls | 4 623 141 |
          +-------------------------------------+---------+
          

          10.6.21(slow)

          8 min 48.014 sec
          +-------------------------------------+---------+
          | Variable_name                       | Value   |
          +-------------------------------------+---------+
          | Optimizer_join_prefixes_check_calls | 251 178 443 |
          +-------------------------------------+---------+
          

          Note that 10.6 has pruned_by_hanging_leaf optimization and I can see it fire, but it doesn't help.
          Actually, 10.6 is even slower than 10.5-after-slowdown.

          psergei Sergei Petrunia added a comment - - edited Backported Optimizer_join_prefixes_check_calls from 10.11 into 10.6, 10.5-before-slowdown and 10.5-after-slowdown. 10.5-after-slowdown 1 min 51.913 sec +-------------------------------------+----------+ | Variable_name | Value | +-------------------------------------+----------+ | Optimizer_join_prefixes_check_calls | 56 331 728 | +-------------------------------------+----------+ 10.5-before-slowdown 8.509 sec MariaDB [test]> show status like '%prefix%'; +-------------------------------------+---------+ | Variable_name | Value | +-------------------------------------+---------+ | Optimizer_join_prefixes_check_calls | 4 623 141 | +-------------------------------------+---------+ 10.6.21(slow) 8 min 48.014 sec +-------------------------------------+---------+ | Variable_name | Value | +-------------------------------------+---------+ | Optimizer_join_prefixes_check_calls | 251 178 443 | +-------------------------------------+---------+ Note that 10.6 has pruned_by_hanging_leaf optimization and I can see it fire, but it doesn't help. Actually, 10.6 is even slower than 10.5-after-slowdown.

          In 10.11, the big difference is that we do linked eq_ref optimization at any stage in the query while in 10.6 we only ignore any combinations of eq_ref at the end of the query plan.
          This is done in get_costs_for_tables() that also sort tables according to cost, which also makes things faster.

          monty Michael Widenius added a comment - In 10.11, the big difference is that we do linked eq_ref optimization at any stage in the query while in 10.6 we only ignore any combinations of eq_ref at the end of the query plan. This is done in get_costs_for_tables() that also sort tables according to cost, which also makes things faster.

          Backported Optimizer_join_prefixes_check_calls from 10.11 into 10.6, 10.5

          The patch is here: psergey-10.5-optimizer_join_prefixes_check_calls.diff -

          psergei Sergei Petrunia added a comment - Backported Optimizer_join_prefixes_check_calls from 10.11 into 10.6, 10.5 The patch is here: psergey-10.5-optimizer_join_prefixes_check_calls.diff -

          People

            monty Michael Widenius
            juan.vera Juan
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.