Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-30975

Wrong result with cross Join given join order

Details

    Description

      DROP TABLE t1;
      DROP TABLE t2;
       
      CREATE TABLE `t1` (
      	`t1_seq` INT NOT NULL, 
      	`c1` VARCHAR(10) NOT NULL ,
      	PRIMARY KEY (`t1_seq`) USING BTREE
      ) ENGINE=INNODB;
       
      CREATE TABLE `t2` (
      	`t2_seq` INT NOT NULL,
      	`t1_seq` INT NOT NULL,
      	`c2` VARCHAR(10) NOT NULL ,
      	PRIMARY KEY (`t2_seq`, `t1_seq`) USING BTREE
      ) ENGINE=INNODB;
       
      INSERT INTO t1 VALUES(1, 'A');
      INSERT INTO t2 VALUES(1, 1, 'T2-1-1');
      INSERT INTO t2 VALUES(2, 1, 'T2-1-2');
      INSERT INTO t2 VALUES(3, 1, 'T2-1-3');
       
      MariaDB [testdb]> SELECT @@version;
      +-----------------+
      | @@version       |
      +-----------------+
      | 10.11.2-MariaDB |
      +-----------------+
      1 ROW IN SET (0.000 sec)
       
      SELECT LPAD(@rownum := @rownum + 1, 8, 0) AS str_num
           , t1.t1_seq
           , t2.t2_seq
           , t1.c1
           , t2.c2
        FROM t1 
             INNER JOIN t2 ON (t1.t1_seq = t2.t1_seq)
             CROSS JOIN ( SELECT @rownum := 0 ) X
      ; 
       
      +------+-------------+------------+--------+---------------+------+---------+------+------+-------------------------------------------------+
      | id   | select_type | TABLE      | TYPE   | possible_keys | KEY  | key_len | ref  | ROWS | Extra                                           |
      +------+-------------+------------+--------+---------------+------+---------+------+------+-------------------------------------------------+
      |    1 | PRIMARY     | <derived2> | system | NULL          | NULL | NULL    | NULL | 1    |                                                 |
      |    1 | PRIMARY     | t1         | ALL    | PRIMARY       | NULL | NULL    | NULL | 1    |                                                 |
      |    1 | PRIMARY     | t2         | ALL    | NULL          | NULL | NULL    | NULL | 3    | USING WHERE; USING JOIN buffer (flat, BNL JOIN) |
      |    2 | DERIVED     | NULL       | NULL   | NULL          | NULL | NULL    | NULL | NULL | NO TABLES used                                  |
      +------+-------------+------------+--------+---------------+------+---------+------+------+-------------------------------------------------+
      4 ROWS IN SET (0.000 sec)
       
      +----------+--------+--------+----+--------+
      | str_num  | t1_seq | t2_seq | c1 | c2     |
      +----------+--------+--------+----+--------+
      | 00000001 |      1 |      1 | A  | T2-1-1 |
      | 00000002 |      1 |      2 | A  | T2-1-2 |
      | 00000003 |      1 |      3 | A  | T2-1-3 |
      +----------+--------+--------+----+--------+
      3 ROWS IN SET (0.002 sec)
      

      STRAIGHT_JOIN:

      SELECT STRAIGHT_JOIN LPAD(@rownum := @rownum + 1, 8, 0) AS str_num
           , t1.t1_seq
           , t2.t2_seq
           , t1.c1
           , t2.c2
        FROM t1 
             INNER JOIN t2 ON (t1.t1_seq = t2.t1_seq)
             CROSS JOIN ( SELECT @rownum := 0 ) X
      ; 
       
      +------+-------------+------------+------+---------------+------+---------+------+------+-------------------------------------------------+
      | id   | select_type | TABLE      | TYPE | possible_keys | KEY  | key_len | ref  | ROWS | Extra                                           |
      +------+-------------+------------+------+---------------+------+---------+------+------+-------------------------------------------------+
      |    1 | PRIMARY     | t1         | ALL  | PRIMARY       | NULL | NULL    | NULL | 1    |                                                 |
      |    1 | PRIMARY     | t2         | ALL  | NULL          | NULL | NULL    | NULL | 3    | USING WHERE; USING JOIN buffer (flat, BNL JOIN) |
      |    1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL | NULL    | NULL | 1    | USING JOIN buffer (incremental, BNL JOIN)       |
      |    2 | DERIVED     | NULL       | NULL | NULL          | NULL | NULL    | NULL | NULL | NO TABLES used                                  |
      +------+-------------+------------+------+---------------+------+---------+------+------+-------------------------------------------------+
      4 ROWS IN SET (0.000 sec)
       
      MariaDB [lololdb]> 
       
      +----------+--------+--------+----+--------+
      | str_num  | t1_seq | t2_seq | c1 | c2     |
      +----------+--------+--------+----+--------+
      | 00000001 |      1 |      1 | A  | T2-1-1 |
      | 00000002 |      1 |      2 | A  | T2-1-2 |
      | 00000003 |      1 |      3 | A  | T2-1-3 |
      | 00000004 |      1 |      1 | A  | T2-1-1 |
      | 00000005 |      1 |      2 | A  | T2-1-2 |
      | 00000006 |      1 |      3 | A  | T2-1-3 |
      +----------+--------+--------+----+--------+
      6 ROWS IN SET (0.001 sec)
       
      MariaDB [testdb]> 
      

      Attachments

        Activity

          alice Alice Sherepa added a comment -

          Thank you for the report! I repeated as described on 10.4-10.11, Myisam/InnoDB:

          MariaDB [test]> SELECT STRAIGHT_JOIN * FROM t1 JOIN t2 ON (t1.t1_seq = t2.t1_seq) JOIN  (SELECT @a := 0) x;
          +--------+----+--------+--------+--------+---------+
          | t1_seq | c1 | t2_seq | t1_seq | c2     | @a := 0 |
          +--------+----+--------+--------+--------+---------+
          |      1 | A  |      1 |      1 | T2-1-1 |       0 |
          |      1 | A  |      2 |      1 | T2-1-2 |       0 |
          |      1 | A  |      3 |      1 | T2-1-3 |       0 |
          |      1 | A  |      1 |      1 | T2-1-1 |       0 |
          |      1 | A  |      2 |      1 | T2-1-2 |       0 |
          |      1 | A  |      3 |      1 | T2-1-3 |       0 |
          +--------+----+--------+--------+--------+---------+
          6 rows in set (0.001 sec)
           
          MariaDB [test]> SELECT  * FROM t1 JOIN t2 ON (t1.t1_seq = t2.t1_seq) JOIN  (SELECT @a := 0) x;
          +--------+----+--------+--------+--------+---------+
          | t1_seq | c1 | t2_seq | t1_seq | c2     | @a := 0 |
          +--------+----+--------+--------+--------+---------+
          |      1 | A  |      1 |      1 | T2-1-1 |       0 |
          |      1 | A  |      2 |      1 | T2-1-2 |       0 |
          |      1 | A  |      3 |      1 | T2-1-3 |       0 |
          +--------+----+--------+--------+--------+---------+
          3 rows in set (0.001 sec)
          

          alice Alice Sherepa added a comment - Thank you for the report! I repeated as described on 10.4-10.11, Myisam/InnoDB: MariaDB [test]> SELECT STRAIGHT_JOIN * FROM t1 JOIN t2 ON (t1.t1_seq = t2.t1_seq) JOIN (SELECT @a := 0) x; +--------+----+--------+--------+--------+---------+ | t1_seq | c1 | t2_seq | t1_seq | c2 | @a := 0 | +--------+----+--------+--------+--------+---------+ | 1 | A | 1 | 1 | T2-1-1 | 0 | | 1 | A | 2 | 1 | T2-1-2 | 0 | | 1 | A | 3 | 1 | T2-1-3 | 0 | | 1 | A | 1 | 1 | T2-1-1 | 0 | | 1 | A | 2 | 1 | T2-1-2 | 0 | | 1 | A | 3 | 1 | T2-1-3 | 0 | +--------+----+--------+--------+--------+---------+ 6 rows in set (0.001 sec)   MariaDB [test]> SELECT * FROM t1 JOIN t2 ON (t1.t1_seq = t2.t1_seq) JOIN (SELECT @a := 0) x; +--------+----+--------+--------+--------+---------+ | t1_seq | c1 | t2_seq | t1_seq | c2 | @a := 0 | +--------+----+--------+--------+--------+---------+ | 1 | A | 1 | 1 | T2-1-1 | 0 | | 1 | A | 2 | 1 | T2-1-2 | 0 | | 1 | A | 3 | 1 | T2-1-3 | 0 | +--------+----+--------+--------+--------+---------+ 3 rows in set (0.001 sec)
          Gosselin Dave Gosselin added a comment - - edited

          Any line number references below are with respect to 10.4 at git sha 88c46aba753c00.

          The apparent root of this issue is that we attempt to fill a derived table more than once, but without clearing its rows, so it consequently has duplicate rows. It doesn't matter if we enable join cache or not, or if we enable condition pushdown for derived optimization or not; the results are the same. The queries on this ticket may be simplified to "SELECT STRAIGHT_JOIN c1 FROM t1 JOIN (SELECT @a := 0) x;" to observe the result of this problem. This query has a derived table for "(SELECT @a := 0) x" which is backed by a temporary table and populated during mysql_derived_fill. The fact that this derived table sets a session variable causes us to mark the corresponding LEX as UNCACHEABLE_SIDEEFFECT. This is important later. Below I've included two stack traces from the execution of "SELECT STRAIGHT_JOIN c1 FROM t1 JOIN (SELECT @a := 0) x;" to show that we enter mysql_derived_fill twice (for simplicity of stack traces, both join cache and pushdown condition are disabled). During the "First Time", unit->executed is false and unit->uncacheable is 4 (sideeffect) and all is OK: we populate the derived table. But during the "Second Time", unit->executed is true and unit->uncacheable is still 4, so we again populate the table. During today's SQL Processor call meeting, we discussed several possible fixes. Since mysql_derived_fill will, for UNCACHEABLE_DEPENDENT tables, drop all rows and repopulate, we thought it reasonable to relax the condition at line 1204. Rather than !unit->uncacheable, we consider !(unit->uncacheable & UNCACHEABLE_DEPENDENT) to be more correct because we only populate derived tables once unless they're dependent tables (subquery has fields from outer query). Indeed such a change passes all of our mtr tests, but would like an opinion from igor

          First Time
          #0 mysql_derived_fill (thd=0x7fff68000d90, lex=0x7fff68004bf8, derived=0x7fff680175d8) at /nvme/1/server/sql/sql_derived.cc:1204
          #1 0x0000555555d7c216 in mysql_derived_optimize (thd=0x7fff68000d90, lex=0x7fff68004bf8, derived=0x7fff680175d8) at /nvme/1/server/sql/sql_derived.cc:1049
          #2 0x0000555555d79bd1 in mysql_handle_single_derived (lex=0x7fff68004bf8, derived=0x7fff680175d8, phases=4) at /nvme/1/server/sql/sql_derived.cc:200
          #3 0x0000555555f26bd0 in TABLE_LIST::handle_derived (this=0x7fff680175d8, lex=0x7fff68004bf8, phases=4) at /nvme/1/server/sql/table.cc:9090
          #4 0x0000555555d96039 in LEX::handle_list_of_derived (this=0x7fff68004bf8, table_list=0x7fff68015f50, phases=4) at /nvme/1/server/sql/sql_lex.h:4455
          #5 0x0000555555da2c96 in st_select_lex::handle_derived (this=0x7fff68015920, lex=0x7fff68004bf8, phases=4) at /nvme/1/server/sql/sql_lex.cc:4417
          #6 0x0000555555e268e4 in JOIN::optimize_inner (this=0x7fff68018db0) at /nvme/1/server/sql/sql_select.cc:2229
          #7 0x0000555555e24ad4 in JOIN::optimize (this=0x7fff68018db0) at /nvme/1/server/sql/sql_select.cc:1731
          #8 0x0000555555e30459 in mysql_select (thd=0x7fff68000d90, tables=0x7fff68015f50, wild_num=0, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748610, result=0x7fff68018d88, unit=0x7fff68004cb8, select_lex=0x7fff68015920) at /nvme/1/server/sql/sql_select.cc:4832

          Second Time
          #0 mysql_derived_fill (thd=0x7fff68000d90, lex=0x7fff68004bf8, derived=0x7fff680175d8) at /nvme/1/server/sql/sql_derived.cc:1204
          #1 0x0000555555d79bd1 in mysql_handle_single_derived (lex=0x7fff68004bf8, derived=0x7fff680175d8, phases=96) at /nvme/1/server/sql/sql_derived.cc:200
          #2 0x0000555555e4a152 in st_join_table::preread_init (this=0x7fff68096918) at /nvme/1/server/sql/sql_select.cc:13886
          #3 0x0000555555e5cb27 in sub_select (join=0x7fff68018db0, join_tab=0x7fff68096918, end_of_records=false) at /nvme/1/server/sql/sql_select.cc:20884
          #4 0x0000555555e5d56e in evaluate_join_record (join=0x7fff68018db0, join_tab=0x7fff68096568, error=0) at /nvme/1/server/sql/sql_select.cc:21149
          #5 0x0000555555e5cd79 in sub_select (join=0x7fff68018db0, join_tab=0x7fff68096568, end_of_records=false) at /nvme/1/server/sql/sql_select.cc:20922
          #6 0x0000555555e5c051 in do_select (join=0x7fff68018db0, procedure=0x0) at /nvme/1/server/sql/sql_select.cc:20443
          #7 0x0000555555e2fc4a in JOIN::exec_inner (this=0x7fff68018db0) at /nvme/1/server/sql/sql_select.cc:4625
          #8 0x0000555555e2ed1e in JOIN::exec (this=0x7fff68018db0) at /nvme/1/server/sql/sql_select.cc:4407
          #9 0x0000555555e304f2 in mysql_select (thd=0x7fff68000d90, tables=0x7fff68015f50, wild_num=0, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748610, result=0x7fff68018d88, unit=0x7fff68004cb8, select_lex=0x7fff68015920) at /nvme/1/server/sql/sql_select.cc:4846

          Gosselin Dave Gosselin added a comment - - edited Any line number references below are with respect to 10.4 at git sha 88c46aba753c00. The apparent root of this issue is that we attempt to fill a derived table more than once, but without clearing its rows, so it consequently has duplicate rows. It doesn't matter if we enable join cache or not, or if we enable condition pushdown for derived optimization or not; the results are the same. The queries on this ticket may be simplified to "SELECT STRAIGHT_JOIN c1 FROM t1 JOIN (SELECT @a := 0) x;" to observe the result of this problem. This query has a derived table for "(SELECT @a := 0) x" which is backed by a temporary table and populated during mysql_derived_fill. The fact that this derived table sets a session variable causes us to mark the corresponding LEX as UNCACHEABLE_SIDEEFFECT. This is important later. Below I've included two stack traces from the execution of "SELECT STRAIGHT_JOIN c1 FROM t1 JOIN (SELECT @a := 0) x;" to show that we enter mysql_derived_fill twice (for simplicity of stack traces, both join cache and pushdown condition are disabled). During the "First Time", unit->executed is false and unit->uncacheable is 4 (sideeffect) and all is OK: we populate the derived table. But during the "Second Time", unit->executed is true and unit->uncacheable is still 4, so we again populate the table. During today's SQL Processor call meeting, we discussed several possible fixes. Since mysql_derived_fill will, for UNCACHEABLE_DEPENDENT tables, drop all rows and repopulate, we thought it reasonable to relax the condition at line 1204. Rather than !unit->uncacheable, we consider !(unit->uncacheable & UNCACHEABLE_DEPENDENT) to be more correct because we only populate derived tables once unless they're dependent tables (subquery has fields from outer query). Indeed such a change passes all of our mtr tests, but would like an opinion from igor First Time #0 mysql_derived_fill (thd=0x7fff68000d90, lex=0x7fff68004bf8, derived=0x7fff680175d8) at /nvme/1/server/sql/sql_derived.cc:1204 #1 0x0000555555d7c216 in mysql_derived_optimize (thd=0x7fff68000d90, lex=0x7fff68004bf8, derived=0x7fff680175d8) at /nvme/1/server/sql/sql_derived.cc:1049 #2 0x0000555555d79bd1 in mysql_handle_single_derived (lex=0x7fff68004bf8, derived=0x7fff680175d8, phases=4) at /nvme/1/server/sql/sql_derived.cc:200 #3 0x0000555555f26bd0 in TABLE_LIST::handle_derived (this=0x7fff680175d8, lex=0x7fff68004bf8, phases=4) at /nvme/1/server/sql/table.cc:9090 #4 0x0000555555d96039 in LEX::handle_list_of_derived (this=0x7fff68004bf8, table_list=0x7fff68015f50, phases=4) at /nvme/1/server/sql/sql_lex.h:4455 #5 0x0000555555da2c96 in st_select_lex::handle_derived (this=0x7fff68015920, lex=0x7fff68004bf8, phases=4) at /nvme/1/server/sql/sql_lex.cc:4417 #6 0x0000555555e268e4 in JOIN::optimize_inner (this=0x7fff68018db0) at /nvme/1/server/sql/sql_select.cc:2229 #7 0x0000555555e24ad4 in JOIN::optimize (this=0x7fff68018db0) at /nvme/1/server/sql/sql_select.cc:1731 #8 0x0000555555e30459 in mysql_select (thd=0x7fff68000d90, tables=0x7fff68015f50, wild_num=0, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748610, result=0x7fff68018d88, unit=0x7fff68004cb8, select_lex=0x7fff68015920) at /nvme/1/server/sql/sql_select.cc:4832 Second Time #0 mysql_derived_fill (thd=0x7fff68000d90, lex=0x7fff68004bf8, derived=0x7fff680175d8) at /nvme/1/server/sql/sql_derived.cc:1204 #1 0x0000555555d79bd1 in mysql_handle_single_derived (lex=0x7fff68004bf8, derived=0x7fff680175d8, phases=96) at /nvme/1/server/sql/sql_derived.cc:200 #2 0x0000555555e4a152 in st_join_table::preread_init (this=0x7fff68096918) at /nvme/1/server/sql/sql_select.cc:13886 #3 0x0000555555e5cb27 in sub_select (join=0x7fff68018db0, join_tab=0x7fff68096918, end_of_records=false) at /nvme/1/server/sql/sql_select.cc:20884 #4 0x0000555555e5d56e in evaluate_join_record (join=0x7fff68018db0, join_tab=0x7fff68096568, error=0) at /nvme/1/server/sql/sql_select.cc:21149 #5 0x0000555555e5cd79 in sub_select (join=0x7fff68018db0, join_tab=0x7fff68096568, end_of_records=false) at /nvme/1/server/sql/sql_select.cc:20922 #6 0x0000555555e5c051 in do_select (join=0x7fff68018db0, procedure=0x0) at /nvme/1/server/sql/sql_select.cc:20443 #7 0x0000555555e2fc4a in JOIN::exec_inner (this=0x7fff68018db0) at /nvme/1/server/sql/sql_select.cc:4625 #8 0x0000555555e2ed1e in JOIN::exec (this=0x7fff68018db0) at /nvme/1/server/sql/sql_select.cc:4407 #9 0x0000555555e304f2 in mysql_select (thd=0x7fff68000d90, tables=0x7fff68015f50, wild_num=0, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748610, result=0x7fff68018d88, unit=0x7fff68004cb8, select_lex=0x7fff68015920) at /nvme/1/server/sql/sql_select.cc:4846

          Gosselin: yes, your suggested change would be appropriate.

          igor Igor Babaev (Inactive) added a comment - Gosselin : yes, your suggested change would be appropriate.
          Gosselin Dave Gosselin added a comment - PR: https://github.com/MariaDB/server/pull/2999

          Ok to push after input in the PR is addressed.

          psergei Sergei Petrunia added a comment - Ok to push after input in the PR is addressed.

          ... and the fix should go into 10.5 since 10.4 is locked already.

          psergei Sergei Petrunia added a comment - ... and the fix should go into 10.5 since 10.4 is locked already.

          Note for the changelog:

          A query that uses a derived table that employs constructs with side-effects (like (SELECT @var:=... ) as derived_tbl could produce wrong results.

          psergei Sergei Petrunia added a comment - Note for the changelog: A query that uses a derived table that employs constructs with side-effects (like (SELECT @var:=... ) as derived_tbl could produce wrong results.

          People

            Gosselin Dave Gosselin
            sanedl NAMHEE KIM
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.