Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-761

LP:1010395 - Optimizer pruning chooses very inefficient join plan when all joined tables are similar

Details

    • Bug
    • Status: Open (View Workflow)
    • Minor
    • Resolution: Unresolved
    • 5.1.67, 5.2.14, 5.3.12, 5.5.36, 10.0.9
    • 5.5(EOL)
    • None

    Description

      The following test case shows that if join optimization pruning heuristics
      is ON (the default behavior in all MariaDB/MySQL) version, the optimizer
      misses the optimal query plan and chooses a very inefficient query plan.

      CREATE TABLE t1 (a INT, KEY(a));
      INSERT INTO t1 VALUES (7),(5),(1),(204),(224),(9),(5),(0),(3);
      INSERT INTO t1 SELECT al1.* FROM t1 al1, t1 al2;
       
      -- default prune level is 1
      set optimizer_prune_level=1;
       
      EXPLAIN SELECT MAX(al1.a) FROM t1 AS al4, t1 AS al3, t1 AS al2, t1 AS al1 WHERE al4.a = al3.a;
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      | id | select_type | table | type  | possible_keys | key  | key_len | ref         | rows | Extra                          |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      |  1 | SIMPLE      | al4   | index | a             | a    | 5       | NULL        |   90 | Using index                    |
      |  1 | SIMPLE      | al3   | ref   | a             | a    | 5       | md321.al4.a |   10 | Using where; Using index       |
      |  1 | SIMPLE      | al2   | index | NULL          | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      |  1 | SIMPLE      | al1   | index | NULL          | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      SHOW SESSION STATUS LIKE 'Last_query_cost';  
      +-----------------+----------------+
      | Variable_name   | Value          |
      +-----------------+----------------+
      | Last_query_cost | 1531029.266998 |
      +-----------------+----------------+

      Above, if we order the tables in the FROM clause according to their optimal order, the optimizer
      finds the optimal plan.

      Below, if the tables are in reverse order, then the optimizer misses the optimal order due to pruning:

      EXPLAIN SELECT MAX(al1.a) FROM t1 AS al1, t1 AS al2, t1 AS al3, t1 AS al4 WHERE al4.a = al3.a;
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      | id | select_type | table | type  | possible_keys | key  | key_len | ref         | rows | Extra                          |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      |  1 | SIMPLE      | al1   | index | NULL          | a    | 5       | NULL        |   90 | Using index                    |
      |  1 | SIMPLE      | al2   | index | NULL          | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      |  1 | SIMPLE      | al3   | index | a             | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      |  1 | SIMPLE      | al4   | ref   | a             | a    | 5       | md321.al3.a |   10 | Using where; Using index       |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      SHOW SESSION STATUS LIKE 'Last_query_cost';
      +-----------------+----------------+
      | Variable_name   | Value          |
      +-----------------+----------------+
      | Last_query_cost | 2420964.599961 |
      +-----------------+----------------+

      As shown below, if we switch off the pruning heuristics, the join optimizer finds the optimal
      plan in all cases.

      -- switch off pruning heuristics
      set optimizer_prune_level=0;
       
      EXPLAIN SELECT MAX(al1.a) FROM t1 AS al4, t1 AS al3, t1 AS al2, t1 AS al1 WHERE al4.a = al3.a;
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      | id | select_type | table | type  | possible_keys | key  | key_len | ref         | rows | Extra                          |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      |  1 | SIMPLE      | al4   | index | a             | a    | 5       | NULL        |   90 | Using index                    |
      |  1 | SIMPLE      | al3   | ref   | a             | a    | 5       | md321.al4.a |   10 | Using where; Using index       |
      |  1 | SIMPLE      | al2   | index | NULL          | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      |  1 | SIMPLE      | al1   | index | NULL          | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      SHOW SESSION STATUS LIKE 'Last_query_cost';  
      +-----------------+----------------+
      | Variable_name   | Value          |
      +-----------------+----------------+
      | Last_query_cost | 1531029.266998 |
      +-----------------+----------------+

      EXPLAIN SELECT MAX(al1.a) FROM t1 AS al1, t1 AS al2, t1 AS al3, t1 AS al4 WHERE al4.a = al3.a;
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      | id | select_type | table | type  | possible_keys | key  | key_len | ref         | rows | Extra                          |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      |  1 | SIMPLE      | al3   | index | a             | a    | 5       | NULL        |   90 | Using index                    |
      |  1 | SIMPLE      | al4   | ref   | a             | a    | 5       | md321.al3.a |   10 | Using where; Using index       |
      |  1 | SIMPLE      | al2   | index | NULL          | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      |  1 | SIMPLE      | al1   | index | NULL          | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      SHOW SESSION STATUS LIKE 'Last_query_cost';
      +-----------------+----------------+
      | Variable_name   | Value          |
      +-----------------+----------------+
      | Last_query_cost | 1531029.266998 |
      +-----------------+----------------+

      Notice that since al3 and al4 are exactly the same, the order <al3, al4> is the same as <al4, al3>.

      Attachments

        Activity

          Re: Optimizer pruning chooses very inefficient join plan when all joined tables are similar
          Test case:

          CREATE TABLE t1 (a INT, KEY(a));
          INSERT INTO t1 VALUES (7),(5),(1),(204),(224),(9),(5),(0),(3);
           
          INSERT INTO t1 SELECT al1.* FROM t1 al1, t1 al2;
           
          # default prune level is 1
          set optimizer_prune_level=1;
           
          EXPLAIN
          SELECT MAX(al1.a) FROM t1 AS al4, t1 AS al3, t1 AS al2, t1 AS al1 WHERE al4.a = al3.a;
          SHOW SESSION STATUS LIKE 'Last_query_cost';  
           
          EXPLAIN
          SELECT MAX(al1.a) FROM t1 AS al1, t1 AS al2, t1 AS al3, t1 AS al4 WHERE al4.a = al3.a;
          SHOW SESSION STATUS LIKE 'Last_query_cost';
           
          # switch off pruning heuristics
          set optimizer_prune_level=0;
           
          EXPLAIN
          SELECT MAX(al1.a) FROM t1 AS al4, t1 AS al3, t1 AS al2, t1 AS al1 WHERE al4.a = al3.a;
          SHOW SESSION STATUS LIKE 'Last_query_cost';  
           
          EXPLAIN
          SELECT MAX(al1.a) FROM t1 AS al1, t1 AS al2, t1 AS al3, t1 AS al4 WHERE al4.a = al3.a;
          SHOW SESSION STATUS LIKE 'Last_query_cost';

          timour Timour Katchaounov (Inactive) added a comment - - edited Re: Optimizer pruning chooses very inefficient join plan when all joined tables are similar Test case: CREATE TABLE t1 (a INT , KEY (a)); INSERT INTO t1 VALUES (7),(5),(1),(204),(224),(9),(5),(0),(3);   INSERT INTO t1 SELECT al1.* FROM t1 al1, t1 al2;   # default prune level is 1 set optimizer_prune_level=1;   EXPLAIN SELECT MAX (al1.a) FROM t1 AS al4, t1 AS al3, t1 AS al2, t1 AS al1 WHERE al4.a = al3.a; SHOW SESSION STATUS LIKE 'Last_query_cost' ;   EXPLAIN SELECT MAX (al1.a) FROM t1 AS al1, t1 AS al2, t1 AS al3, t1 AS al4 WHERE al4.a = al3.a; SHOW SESSION STATUS LIKE 'Last_query_cost' ;   # switch off pruning heuristics set optimizer_prune_level=0;   EXPLAIN SELECT MAX (al1.a) FROM t1 AS al4, t1 AS al3, t1 AS al2, t1 AS al1 WHERE al4.a = al3.a; SHOW SESSION STATUS LIKE 'Last_query_cost' ;   EXPLAIN SELECT MAX (al1.a) FROM t1 AS al1, t1 AS al2, t1 AS al3, t1 AS al4 WHERE al4.a = al3.a; SHOW SESSION STATUS LIKE 'Last_query_cost' ;

          Re: Optimizer pruning chooses very inefficient join plan when all joined tables are similar
          The bug is reproducible in any MariaDB/MySQL version since at least MariaDB 5.2.
          This is a legacy bug that has existed since at least 2005, therefore I set its importance
          to "medium".

          timour Timour Katchaounov (Inactive) added a comment - Re: Optimizer pruning chooses very inefficient join plan when all joined tables are similar The bug is reproducible in any MariaDB/MySQL version since at least MariaDB 5.2. This is a legacy bug that has existed since at least 2005, therefore I set its importance to "medium".

          Re: Optimizer pruning chooses very inefficient join plan when all joined tables are similar
          According to Igor on IRC:
          1. when pruning we should take into account the cardinality of the partial join, not only its cost. this will require to change one condition in the code.
          2. jorgen changed initial sorting to improve plans when pruning set to default (in MySQL)

          timour Timour Katchaounov (Inactive) added a comment - Re: Optimizer pruning chooses very inefficient join plan when all joined tables are similar According to Igor on IRC: 1. when pruning we should take into account the cardinality of the partial join, not only its cost. this will require to change one condition in the code. 2. jorgen changed initial sorting to improve plans when pruning set to default (in MySQL)

          Launchpad bug id: 1010395

          ratzpo Rasmus Johansson (Inactive) added a comment - Launchpad bug id: 1010395

          Also reproducible on MySQL 5.1-5.7.

          elenst Elena Stepanova added a comment - Also reproducible on MySQL 5.1-5.7.

          People

            Unassigned Unassigned
            timour Timour Katchaounov (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.