Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-10006

optimizer doesn't convert outer join to inner on views with WHERE clause

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.1.11
    • 10.1.15
    • Optimizer
    • None

    Description

      The optimizer can't reorder tables for OUTER JOINs, but when a later WHERE condition on a column of an outer table requires a non-NULL value it can convert the OUTER join into an INNER one, and then it can also make use of re-ordering.

      The same works when the OUTER JOIN is inside a view definition, and a WHERE condition on a column of the outer table is applied on the view ... but apparently only if the view definition doesn't have a WHERE clause.

      How to reproduce:

      DROP TABLE IF EXISTS t1;
       
      CREATE TABLE t1(i1 int primary key, v1 int, key(v1)) engine=myisam;
       
      INSERT INTO t1 VALUES (1, 1);
      INSERT INTO t1 VALUES (2, 2);
      INSERT INTO t1 VALUES (3, 3);
      INSERT INTO t1 VALUES (4, 4);
      INSERT INTO t1 VALUES (5, 3);
      INSERT INTO t1 VALUES (6, 6);
      INSERT INTO t1 VALUES (7, 7);
      INSERT INTO t1 VALUES (8, 8);
      INSERT INTO t1 VALUES (9, 9);
       
      DROP TABLE IF EXISTS t2;
       
      CREATE TABLE t2(i2 int primary key, v2 int, key(v2)) engine=myisam;
       
      INSERT INTO t2 VALUES (1, 1);
      INSERT INTO t2 VALUES (2, 2);
      INSERT INTO t2 VALUES (3, 3);
      INSERT INTO t2 VALUES (4, 4);
      INSERT INTO t2 VALUES (5, 3);
      INSERT INTO t2 VALUES (6, 6);
      INSERT INTO t2 VALUES (7, 7);
      INSERT INTO t2 VALUES (8, 8);
      INSERT INTO t2 VALUES (9, 9);
       
      DROP TABLE IF EXISTS t3;
       
      CREATE TABLE t3(i3 int primary key, v3 int, key(v3)) engine=myisam;
       
      INSERT INTO t3 VALUES (2, 2);
      INSERT INTO t3 VALUES (4, 4);
      INSERT INTO t3 VALUES (6, 6);
      INSERT INTO t3 VALUES (8, 8);
       
      # view definition without WHERE
       
      DROP VIEW IF EXISTS v1;
       
      CREATE ALGORITHM=MERGE VIEW v1 AS
                        SELECT t1.i1 as i1, t1.v1 as v1,
                               t2.i2 as i2, t2.v2 as v2,
                               t3.i3 as i3, t3.v3 as v3
                          FROM t1
                          JOIN t2 on t1.i1 = t2.i2
                     LEFT JOIN t3 on t2.i2 = t3.i3
                         ;
       
      # view definition with WHERE
       
      DROP VIEW IF EXISTS v2;
       
      CREATE ALGORITHM=MERGE VIEW v2 AS
                        SELECT t1.i1 as i1, t1.v1 as v1,
                               t2.i2 as i2, t2.v2 as v2,
                               t3.i3 as i3, t3.v3 as v3
                          FROM t1 JOIN t2 on t1.i1 = t2.i2
                     LEFT JOIN t3 on t2.i2 = t3.i3
                         WHERE t1.i1 = t2.i2
                           AND 1 = 1
                         ;
       
      # query plan of plain query
       
      EXPLAIN EXTENDED SELECT t1.i1 as i1, t1.v1 as v1,
                              t2.i2 as i2, t2.v2 as v2,
                              t3.i3 as i3, t3.v3 as v3
                         FROM t1
                         JOIN t2 on t1.i1 = t2.i2
                    LEFT JOIN t3 on t2.i2 = t3.i3
                        WHERE 1 = 1
                          AND t3.v3 = 4
                        ;
       
      # query plan for view without WHERE
       
      EXPLAIN EXTENDED SELECT * FROM v1 WHERE v3 = 4;
       
      # query plan for view with  WHERE
       
      EXPLAIN EXTENDED SELECT * FROM v2 WHERE v3 = 4;
      

      The plans for the plain query, and for the VIEW without a WHERE condition inside the definition, look like this:

      Plain query:

      id      select_type     table   type    possible_keys   key     key_len ref         rows    filtered    Extra
      1       SIMPLE          t3      ref     PRIMARY,v3      v3      5       const       1       100.00  
      1       SIMPLE          t1      eq_ref  PRIMARY         PRIMARY 4       test.t3.i3  1       100.00  
      1       SIMPLE          t2      eq_ref  PRIMARY         PRIMARY 4       test.t3.i3  1       100.00  
      

      View without WHERE:

      id      select_type     table   type    possible_keys   key     key_len ref         rows    filtered    Extra
      1       SIMPLE          t3      ref     PRIMARY,v3      v3      5       const       1       100.00  
      1       SIMPLE          t1      eq_ref  PRIMARY         PRIMARY 4       test.t3.i3  1       100.00  
      1       SIMPLE          t2      eq_ref  PRIMARY         PRIMARY 4       test.t3.i3  1       100.00  
      

      So these two are identical.

      For the VIEW with the extra WHERE 1=1 in its definition the plan looks like this though:

      id      select_type     table   type    possible_keys   key     key_len ref         rows    filtered    Extra
      1       SIMPLE          t1      ALL     PRIMARY         NULL    NULL    NULL        9       100.00  
      1       SIMPLE          t2      eq_ref  PRIMARY         PRIMARY 4       test.t1.i1  1       100.00  
      1       SIMPLE          t3      eq_ref  PRIMARY,v3      PRIMARY 4       test.t1.i1  1       100.00      Using where
      

      So here table reordering didn't happen, and a "Using where" full scan of the VIEW results has to be done ....

      Attachments

        There are no Sub-Tasks for this issue.

        Activity

          Debugging simplify_joins:

          Enter simplify_joins(join_list= join->join_list= {w2}, conds= "t3.v3 = 4")
           
            table= w2, 
            table->on_expr= ((t1.i1 = t2.i2) and (1 = 1))
            table->nested_join= ...
           
            Enter simplify_joins(join_list= w2->nested_join={(nest_last_join)},
                                 conds="((t1.i1 = t2.i2) and (1 = 1))"
            {
           
              table= (nest_last_join) // denote as $nest1
              table->on_expr=NULL,
           
              Enter simplify_joins(join_list= $nest1->nested_join={t3, (nest_last_join)},
                                   conds="((t1.i1 = t2.i2) and (1 = 1))"
              {
                table= t3
                table->on_expr= (t2.i2 = t3.i3)
                table->outer_join=1
           
                // we assign:
                table->dep_tables |= table_on_expr_used_tables 
                // and it becomes:
                table->dep_tables=6
                // 6 = 4 (for t3 itself) + 2 (for t2)
                
                // interesting:
                table->dep_tables&= ~table->embedding->nested_join->used_tables;
                // here the right part has used_tables=4 
                // although embedding's nested join contains t3 and a join nest 
                // which includes t1 and t2.  I suppose, that used_tables value is incomplete
                // ATM.
                
                // Anyhow, after this, we have
                table->dep_tables=2
                // that is, "t3 must follow t2" which looks correct (outer join is not
                // converted to inner, yet).
           
           
                table= (nest_last_join); // denote as $nest2
                table->on_expr= 0;
                table->nested_join= {t1, t2}
           
           
                Enter simplify_joins(join_list= $nest2->nested_join={t2, t1},
                                     conds="((t1.i1 = t2.i2) and (1 = 1))"
                {
                  table= t2
                  table->on_expr= "(t1.i1 = t2.i2)"
                  table->outer_join= 0
                  // inject table->on_expr into conds.
           
                  table= t1
           
                  retval= "((t1.i1 = t2.i2) and (1 = 1) and (t1.i1 = t2.i2))"
           
                } // Leave simplify_joins(join_list=$nest2...)
           
                // The list {t3, (nest_last_join)} is flattened into
                // {t3, t2, t1}
           
                retval= "((t1.i1 = t2.i2) and (1 = 1) and (t1.i1 = t2.i2))"
              } // Leave simplify_joins(join_list=$nest1...)
           
              // the list {nest_last_join(t3, t2, t1)} is flattened into
              // {t3,t2,t1}
           
              retval= "((t1.i1 = t2.i2) and (1 = 1) and (t1.i1 = t2.i2))"
            } // Leave simplify_joins
            
            // Now, the second call to simplify joins, with 'conds' as parameter:
           
            Enter simplify_joins(join_list= w2->nested_join={t3, t2, t1},
                                 conds="(t3.v3 = 4)"
            {
           
              table= t3
              used_tables=4, not_null_tables=4
              // we set: table->outer_join= 0;
              // however, table->dep_tables=2 remains uncleaned
          

          same content here: https://gist.github.com/spetrunia/d531a08dc748fb01c694710885ac2bc7

          Summary:

          • the optimizer does convert t3's outer join to inner
          • however t3's dep_tables is not cleaned.
          psergei Sergei Petrunia added a comment - Debugging simplify_joins: Enter simplify_joins(join_list= join->join_list= {w2}, conds= "t3.v3 = 4")   table= w2, table->on_expr= ((t1.i1 = t2.i2) and (1 = 1)) table->nested_join= ...   Enter simplify_joins(join_list= w2->nested_join={(nest_last_join)}, conds="((t1.i1 = t2.i2) and (1 = 1))" {   table= (nest_last_join) // denote as $nest1 table->on_expr=NULL,   Enter simplify_joins(join_list= $nest1->nested_join={t3, (nest_last_join)}, conds="((t1.i1 = t2.i2) and (1 = 1))" { table= t3 table->on_expr= (t2.i2 = t3.i3) table->outer_join=1   // we assign: table->dep_tables |= table_on_expr_used_tables // and it becomes: table->dep_tables=6 // 6 = 4 (for t3 itself) + 2 (for t2) // interesting: table->dep_tables&= ~table->embedding->nested_join->used_tables; // here the right part has used_tables=4 // although embedding's nested join contains t3 and a join nest // which includes t1 and t2. I suppose, that used_tables value is incomplete // ATM. // Anyhow, after this, we have table->dep_tables=2 // that is, "t3 must follow t2" which looks correct (outer join is not // converted to inner, yet).     table= (nest_last_join); // denote as $nest2 table->on_expr= 0; table->nested_join= {t1, t2}     Enter simplify_joins(join_list= $nest2->nested_join={t2, t1}, conds="((t1.i1 = t2.i2) and (1 = 1))" { table= t2 table->on_expr= "(t1.i1 = t2.i2)" table->outer_join= 0 // inject table->on_expr into conds.   table= t1   retval= "((t1.i1 = t2.i2) and (1 = 1) and (t1.i1 = t2.i2))"   } // Leave simplify_joins(join_list=$nest2...)   // The list {t3, (nest_last_join)} is flattened into // {t3, t2, t1}   retval= "((t1.i1 = t2.i2) and (1 = 1) and (t1.i1 = t2.i2))" } // Leave simplify_joins(join_list=$nest1...)   // the list {nest_last_join(t3, t2, t1)} is flattened into // {t3,t2,t1}   retval= "((t1.i1 = t2.i2) and (1 = 1) and (t1.i1 = t2.i2))" } // Leave simplify_joins // Now, the second call to simplify joins, with 'conds' as parameter:   Enter simplify_joins(join_list= w2->nested_join={t3, t2, t1}, conds="(t3.v3 = 4)" {   table= t3 used_tables=4, not_null_tables=4 // we set: table->outer_join= 0; // however, table->dep_tables=2 remains uncleaned same content here: https://gist.github.com/spetrunia/d531a08dc748fb01c694710885ac2bc7 Summary: the optimizer does convert t3's outer join to inner however t3's dep_tables is not cleaned.

          This patch fixes the issue and passes the test suite:

          diff --git a/sql/sql_select.cc b/sql/sql_select.cc
          index e83f63d..d9849ce 100644
          --- a/sql/sql_select.cc
          +++ b/sql/sql_select.cc
          @@ -14418,6 +14418,9 @@ static void update_const_equal_items(THD *thd, COND *cond, JOIN_TAB *tab,
                 if (table->outer_join && !table->embedding && table->table)
                   table->table->maybe_null= FALSE;
                 table->outer_join= 0;
          +      //psergey:
          +      if (!(straight_join || table->straight))
          +        table->dep_tables= table->embedding? table->embedding->dep_tables: 0;
                 if (table->on_expr)
                 {
                   /* Add ON expression to the WHERE or upper-level ON condition. */
          

          psergei Sergei Petrunia added a comment - This patch fixes the issue and passes the test suite: diff --git a/sql/sql_select.cc b/sql/sql_select.cc index e83f63d..d9849ce 100644 --- a/sql/sql_select.cc +++ b/sql/sql_select.cc @@ -14418,6 +14418,9 @@ static void update_const_equal_items(THD *thd, COND *cond, JOIN_TAB *tab, if (table->outer_join && !table->embedding && table->table) table->table->maybe_null= FALSE; table->outer_join= 0; + //psergey: + if (!(straight_join || table->straight)) + table->dep_tables= table->embedding? table->embedding->dep_tables: 0; if (table->on_expr) { /* Add ON expression to the WHERE or upper-level ON condition. */

          elenst, I need

          • a test of 10.1 vs 10.1 with the above patch
          • we need a grammar with nested LEFT/INNER joins, with various ON expressions
          • ON expression must be simple predicates over null-able as well as NOT NULL columns
          • The primary risk (thing to check for) is different query results. I don't expect crashes.
          psergei Sergei Petrunia added a comment - elenst , I need a test of 10.1 vs 10.1 with the above patch we need a grammar with nested LEFT/INNER joins, with various ON expressions ON expression must be simple predicates over null-able as well as NOT NULL columns The primary risk (thing to check for) is different query results. I don't expect crashes.

          I ran comparison tests on the patched version against 10.1 without the patch and transformation tests with ExecuteAsView, haven't got any mismatches.
          The testing task MDEV-10031 will remain open until the patch passes the review, in case there are changes that require retesting.

          elenst Elena Stepanova added a comment - I ran comparison tests on the patched version against 10.1 without the patch and transformation tests with ExecuteAsView, haven't got any mismatches. The testing task MDEV-10031 will remain open until the patch passes the review, in case there are changes that require retesting.

          No changes were requested during the code review. The patch will be pushed as is.

          psergei Sergei Petrunia added a comment - No changes were requested during the code review. The patch will be pushed as is.

          People

            psergei Sergei Petrunia
            hholzgra Hartmut Holzgraefe
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.