Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-6706

Wrong result (missing rows) with joins, SQ, ORDER BY, semijoin=on

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 5.3.12, 5.5.39, 10.0.13, 10.1.30
    • 5.5.59
    • Optimizer
    • None
    • 5.5.59

    Description

      Test case:

      CREATE TABLE t1 (f1 VARCHAR(3)) ENGINE=MyISAM;
      INSERT INTO t1 VALUES ('foo');
       
      CREATE TABLE t2 (f2 VARCHAR(3)) ENGINE=MyISAM;
      INSERT INTO t2 VALUES ('bar'),('baz');
       
      CREATE TABLE t3 (i3_key INT, f3_key VARCHAR(3), f3 VARCHAR(3), KEY(f3_key,i3_key)) ENGINE=MyISAM;
      INSERT INTO t3 VALUES (0,'qux','qux'),(8,'bar','bar');
       
      SELECT CONCAT( f1, f2 ) AS field FROM t1, t2 WHERE f1 = ANY ( 
        SELECT f1
        FROM t1 LEFT JOIN ( t3 AS t3a, t3 AS t3b ) ON ( t3b.f3_key = t3a.f3 ) 
        WHERE t3a.f3 < f1 OR t3b.f3 != f1 
      )
      ORDER BY field;

      Actual result:

      field
      foobar

      Result without ORDER BY or with semijoin=off:

      field
      foobar
      foobaz

      The problem appeared in 5.3 tree with the following revision:

      revno: 3604 [merge]
      revision-id: igor@askmonty.org-20121122183039-dm575rqkw40558qp
      parent: sergii@pisem.net-20121117155015-4ab41ncach4iavao
      parent: igor@askmonty.org-20121122055504-vdyz74a9d7n48x0o
      committer: Igor Babaev <igor@askmonty.org>
      branch nick: maria-5.3
      timestamp: Thu 2012-11-22 10:30:39 -0800
      message:
        Merge
          ------------------------------------------------------------
          revno: 3602.1.1
          revision-id: igor@askmonty.org-20121122055504-vdyz74a9d7n48x0o
          parent: knielsen@knielsen-hq.org-20121120125749-l55gvliervi5nnaf
          committer: Igor Babaev <igor@askmonty.org>
          branch nick: maria-5.3-mdev645
          timestamp: Wed 2012-11-21 21:55:04 -0800
          message:
            Fixed LP bug #1002146 (bug mdev-645).
            If the setting of system variables does not allow to use join buffer
            for a join query with GROUP BY <f1,...> / ORDER BY <f1,...> then
            filesort is not needed if the first joined table is scanned in
            the order compatible with order specified by the list <f1,...>.

      Attachments

        Activity

          elenst Elena Stepanova created issue -
          elenst Elena Stepanova made changes -
          Field Original Value New Value
          Description Test case:
          {code:sql}
          CREATE TABLE t1 (f1 VARCHAR(3)) ENGINE=MyISAM;
          INSERT INTO t1 VALUES ('foo');

          CREATE TABLE t2 (f2 VARCHAR(3)) ENGINE=MyISAM;
          INSERT INTO t2 VALUES ('bar'),('baz');

          CREATE TABLE t3 (i3_key INT, f3_key VARCHAR(3), f3 VARCHAR(3), KEY(f3_key,i3_key)) ENGINE=MyISAM;
          INSERT INTO t3 VALUES (0,'qux','qux'),(8,'bar','bar');

          SELECT CONCAT( f1, f2 ) AS field FROM t1, t2 WHERE f1 = ANY (
            SELECT f1
            FROM t1 LEFT JOIN ( t3 AS t3a, t3 AS t3b ) ON ( t3b.f3_key = t3a.f3 )
            WHERE t3a.f3 < f1 OR t3b.f3 != f1
          )
          ORDER BY field;
          {code}

          Actual result:
          {code:sql}
          SELECT CONCAT( f1, f2 ) AS field FROM t1, t2 WHERE f1 = ANY (
          SELECT f1
          FROM t1 LEFT JOIN ( t3 AS t3a, t3 AS t3b ) ON ( t3b.f3_key = t3a.f3 )
          WHERE t3a.f3 < f1 OR t3b.f3 != f1
          )
          ORDER BY field;
          field
          foobar
          {code}

          Result without ORDER BY:
          {code:sql}
          SELECT CONCAT( f1, f2 ) AS field FROM t1, t2 WHERE f1 = ANY (
          SELECT f1
          FROM t1 LEFT JOIN ( t3 AS t3a, t3 AS t3b ) ON ( t3b.f3_key = t3a.f3 )
          WHERE t3a.f3 < f1 OR t3b.f3 != f1
          );
          field
          foobar
          foobaz
          {code}

          The problem appeared in 5.3 tree with the following revision:
          {noformat}
          revno: 3604 [merge]
          revision-id: igor@askmonty.org-20121122183039-dm575rqkw40558qp
          parent: sergii@pisem.net-20121117155015-4ab41ncach4iavao
          parent: igor@askmonty.org-20121122055504-vdyz74a9d7n48x0o
          committer: Igor Babaev <igor@askmonty.org>
          branch nick: maria-5.3
          timestamp: Thu 2012-11-22 10:30:39 -0800
          message:
            Merge
              ------------------------------------------------------------
              revno: 3602.1.1
              revision-id: igor@askmonty.org-20121122055504-vdyz74a9d7n48x0o
              parent: knielsen@knielsen-hq.org-20121120125749-l55gvliervi5nnaf
              committer: Igor Babaev <igor@askmonty.org>
              branch nick: maria-5.3-mdev645
              timestamp: Wed 2012-11-21 21:55:04 -0800
              message:
                Fixed LP bug #1002146 (bug mdev-645).
                If the setting of system variables does not allow to use join buffer
                for a join query with GROUP BY <f1,...> / ORDER BY <f1,...> then
                filesort is not needed if the first joined table is scanned in
                the order compatible with order specified by the list <f1,...>.
          {noformat}
          Test case:
          {code:sql}
          CREATE TABLE t1 (f1 VARCHAR(3)) ENGINE=MyISAM;
          INSERT INTO t1 VALUES ('foo');

          CREATE TABLE t2 (f2 VARCHAR(3)) ENGINE=MyISAM;
          INSERT INTO t2 VALUES ('bar'),('baz');

          CREATE TABLE t3 (i3_key INT, f3_key VARCHAR(3), f3 VARCHAR(3), KEY(f3_key,i3_key)) ENGINE=MyISAM;
          INSERT INTO t3 VALUES (0,'qux','qux'),(8,'bar','bar');

          SELECT CONCAT( f1, f2 ) AS field FROM t1, t2 WHERE f1 = ANY (
            SELECT f1
            FROM t1 LEFT JOIN ( t3 AS t3a, t3 AS t3b ) ON ( t3b.f3_key = t3a.f3 )
            WHERE t3a.f3 < f1 OR t3b.f3 != f1
          )
          ORDER BY field;
          {code}

          Actual result:
          {code:sql}
          field
          foobar
          {code}

          Result without {{ORDER BY}} or with {{semijoin=off}}:
          {code:sql}
          field
          foobar
          foobaz
          {code}

          The problem appeared in 5.3 tree with the following revision:
          {noformat}
          revno: 3604 [merge]
          revision-id: igor@askmonty.org-20121122183039-dm575rqkw40558qp
          parent: sergii@pisem.net-20121117155015-4ab41ncach4iavao
          parent: igor@askmonty.org-20121122055504-vdyz74a9d7n48x0o
          committer: Igor Babaev <igor@askmonty.org>
          branch nick: maria-5.3
          timestamp: Thu 2012-11-22 10:30:39 -0800
          message:
            Merge
              ------------------------------------------------------------
              revno: 3602.1.1
              revision-id: igor@askmonty.org-20121122055504-vdyz74a9d7n48x0o
              parent: knielsen@knielsen-hq.org-20121120125749-l55gvliervi5nnaf
              committer: Igor Babaev <igor@askmonty.org>
              branch nick: maria-5.3-mdev645
              timestamp: Wed 2012-11-21 21:55:04 -0800
              message:
                Fixed LP bug #1002146 (bug mdev-645).
                If the setting of system variables does not allow to use join buffer
                for a join query with GROUP BY <f1,...> / ORDER BY <f1,...> then
                filesort is not needed if the first joined table is scanned in
                the order compatible with order specified by the list <f1,...>.
          {noformat}
          ratzpo Rasmus Johansson (Inactive) made changes -
          Workflow MariaDB v2 [ 53738 ] MariaDB v3 [ 64799 ]
          igor Igor Babaev (Inactive) made changes -
          Assignee Sergei Petrunia [ psergey ] Igor Babaev [ igor ]
          igor Igor Babaev (Inactive) made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          serg Sergei Golubchik made changes -
          Sprint 5.5.59 [ 221 ]
          igor Igor Babaev (Inactive) added a comment - - edited

          The code of get_sort_by_table() that checks that ordering can be done using fields of only one table

            while ((table= ti++) && !(map & table->table->map)) ;
            if (map != table->table->map)
              DBUG_RETURN(0);                             // More than one table
          

          was incorrect. The code uses the previously calculated value of 'map' containing table_map bits for all tables the order by items depend on. It includes bits for all const (single_row) tables that should be excluded before the the loop in the above code starts.

            map&= ~const_tables;
            while ((table= ti++) && !(map & table->table->map)) ;
            if (map != table->table->map)
              DBUG_RETURN(0);                             // More than one table
          

          The function get_sort_by_table() is called in make_join_statistics() to set the value of JOIN::sort_by_table.
          Later this value can be used by the function setup_semijoin_dups_elimination() in the following code. (The function JOIN::JOIN_TAB *get_sort_by_join_tab() returns 0 if sort_by_table set to 0.)

                      /* 
                        Make sure that possible sorting of rows from the head table 
                        is not to be employed.
                      */
                      if (join->get_sort_by_join_tab())
          	    {
                        join->simple_order= 0;
                        join->simple_group= 0;
                        join->need_tmp= join->test_if_need_tmp_table();
                      }
          

          This code is executed for the reported query. The code prohibits using sorting by the first non constant joined table and, as a result, it requires creation of a temporary table containing all rows from the result set. Sorting the records from the temporary table produces the necessary ordering.

          Let's see what's going on when executing the reported query with the current code. The call of the function get_sort_by_table() set JOIN::sort_by_table to 0. As a result the above call join->get_sort_by_join_tab() returns 0 and the code join->simple_order= 0; is skipped. So the value of join->simple_order is not changed and it remains the same as it was set by the previously called function remove_const that does not use the value of join->sort_by_table. After the above 'if' statement the function setup_semijoin_dups_elimination() calls init_dups_weedout(). The latter sets JOIN_TAB::keep_current_rowid to TRUE for the third joined table, namely for the table t2. It also sets the field JOIN::flush_weedout_table for this table.
          The execution plan chosen for the query requires a sequential scan of the table t2. At the beginning of this scan the function SJ_TMP_TABLE::sj_weedout_delete_rows() is called. This function deletes all rows from the temporary table used for weedout. The function is called only once.
          Then the execution process tries to build rows of the result set. It starts from reading the first record from t2. Yet as the join was marked with the flag simple_order the needed fields of the records of t2 first are sorted and placed into a buffer. So actually the records of t2 are read from this buffer by the call-back function rr_unpack_from_buffer(). After a record of t2 has been read into the record buffer for t2 its rowid is saved into the temporary table used for weedout due to setting of keep_current_rowid for t2 to true. However rr_unpack_from_buffer() cannot read this rowid from the buffer because it has never been written there. So it reads 'something' that is interpreted as a rowid and this 'something' is always the same. That's why the result set contains only one row no matter how many rows contains t2.

          igor Igor Babaev (Inactive) added a comment - - edited The code of get_sort_by_table() that checks that ordering can be done using fields of only one table while ((table= ti++) && !(map & table->table->map)) ; if (map != table->table->map) DBUG_RETURN(0); // More than one table was incorrect. The code uses the previously calculated value of 'map' containing table_map bits for all tables the order by items depend on. It includes bits for all const (single_row) tables that should be excluded before the the loop in the above code starts. map&= ~const_tables; while ((table= ti++) && !(map & table->table->map)) ; if (map != table->table->map) DBUG_RETURN(0); // More than one table The function get_sort_by_table() is called in make_join_statistics() to set the value of JOIN::sort_by_table. Later this value can be used by the function setup_semijoin_dups_elimination() in the following code. (The function JOIN::JOIN_TAB *get_sort_by_join_tab() returns 0 if sort_by_table set to 0.) /* Make sure that possible sorting of rows from the head table is not to be employed. */ if (join->get_sort_by_join_tab()) { join->simple_order= 0; join->simple_group= 0; join->need_tmp= join->test_if_need_tmp_table(); } This code is executed for the reported query. The code prohibits using sorting by the first non constant joined table and, as a result, it requires creation of a temporary table containing all rows from the result set. Sorting the records from the temporary table produces the necessary ordering. Let's see what's going on when executing the reported query with the current code. The call of the function get_sort_by_table() set JOIN::sort_by_table to 0. As a result the above call join->get_sort_by_join_tab() returns 0 and the code join->simple_order= 0; is skipped. So the value of join->simple_order is not changed and it remains the same as it was set by the previously called function remove_const that does not use the value of join->sort_by_table. After the above 'if' statement the function setup_semijoin_dups_elimination() calls init_dups_weedout(). The latter sets JOIN_TAB::keep_current_rowid to TRUE for the third joined table, namely for the table t2. It also sets the field JOIN::flush_weedout_table for this table. The execution plan chosen for the query requires a sequential scan of the table t2. At the beginning of this scan the function SJ_TMP_TABLE::sj_weedout_delete_rows() is called. This function deletes all rows from the temporary table used for weedout. The function is called only once. Then the execution process tries to build rows of the result set. It starts from reading the first record from t2. Yet as the join was marked with the flag simple_order the needed fields of the records of t2 first are sorted and placed into a buffer. So actually the records of t2 are read from this buffer by the call-back function rr_unpack_from_buffer(). After a record of t2 has been read into the record buffer for t2 its rowid is saved into the temporary table used for weedout due to setting of keep_current_rowid for t2 to true. However rr_unpack_from_buffer() cannot read this rowid from the buffer because it has never been written there. So it reads 'something' that is interpreted as a rowid and this 'something' is always the same. That's why the result set contains only one row no matter how many rows contains t2.
          igor Igor Babaev (Inactive) made changes -
          Fix Version/s 10.0 [ 16000 ]
          igor Igor Babaev (Inactive) made changes -
          Affects Version/s 10.1.30 [ 22637 ]
          igor Igor Babaev (Inactive) made changes -
          Assignee Igor Babaev [ igor ] Sergei Petrunia [ psergey ]
          Status In Progress [ 3 ] In Review [ 10002 ]
          igor Igor Babaev (Inactive) made changes -
          Assignee Sergei Petrunia [ psergey ] Igor Babaev [ igor ]
          igor Igor Babaev (Inactive) made changes -
          Assignee Igor Babaev [ igor ] Oleksandr Byelkin [ sanja ]

          OK to push.

          sanja Oleksandr Byelkin added a comment - OK to push.
          sanja Oleksandr Byelkin made changes -
          Assignee Oleksandr Byelkin [ sanja ] Igor Babaev [ igor ]
          Status In Review [ 10002 ] Stalled [ 10000 ]
          serg Sergei Golubchik made changes -
          Component/s Optimizer [ 10200 ]
          Fix Version/s 5.5.59 [ 22612 ]
          Fix Version/s 5.5 [ 15800 ]
          Resolution Fixed [ 1 ]
          Status Stalled [ 10000 ] Closed [ 6 ]
          igor Igor Babaev (Inactive) made changes -
          Resolution Fixed [ 1 ]
          Status Closed [ 6 ] Stalled [ 10000 ]

          A fix for this bug was pushed into the 5.5 tree.

          igor Igor Babaev (Inactive) added a comment - A fix for this bug was pushed into the 5.5 tree.
          igor Igor Babaev (Inactive) made changes -
          Resolution Fixed [ 1 ]
          Status Stalled [ 10000 ] Closed [ 6 ]
          serg Sergei Golubchik made changes -
          Workflow MariaDB v3 [ 64799 ] MariaDB v4 [ 148205 ]

          People

            igor Igor Babaev (Inactive)
            elenst Elena Stepanova
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.