[MDEV-30975] Wrong result with cross Join given join order Created: 2023-03-31  Updated: 2024-01-15

Status: In Review
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.11.2, 10.3.38, 10.4.28, 10.6.12
Fix Version/s: 10.4

Type: Bug Priority: Critical
Reporter: NAMHEE KIM Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: wrong_result
Environment:

CentOS 7.9, MariaDB 10.3 ~ 10.11



 Description   

DROP TABLE t1;
DROP TABLE t2;
 
CREATE TABLE `t1` (
	`t1_seq` INT NOT NULL, 
	`c1` VARCHAR(10) NOT NULL ,
	PRIMARY KEY (`t1_seq`) USING BTREE
) ENGINE=INNODB;
 
CREATE TABLE `t2` (
	`t2_seq` INT NOT NULL,
	`t1_seq` INT NOT NULL,
	`c2` VARCHAR(10) NOT NULL ,
	PRIMARY KEY (`t2_seq`, `t1_seq`) USING BTREE
) ENGINE=INNODB;
 
INSERT INTO t1 VALUES(1, 'A');
INSERT INTO t2 VALUES(1, 1, 'T2-1-1');
INSERT INTO t2 VALUES(2, 1, 'T2-1-2');
INSERT INTO t2 VALUES(3, 1, 'T2-1-3');
 
MariaDB [testdb]> SELECT @@version;
+-----------------+
| @@version       |
+-----------------+
| 10.11.2-MariaDB |
+-----------------+
1 ROW IN SET (0.000 sec)
 
SELECT LPAD(@rownum := @rownum + 1, 8, 0) AS str_num
     , t1.t1_seq
     , t2.t2_seq
     , t1.c1
     , t2.c2
  FROM t1 
       INNER JOIN t2 ON (t1.t1_seq = t2.t1_seq)
       CROSS JOIN ( SELECT @rownum := 0 ) X
; 
 
+------+-------------+------------+--------+---------------+------+---------+------+------+-------------------------------------------------+
| id   | select_type | TABLE      | TYPE   | possible_keys | KEY  | key_len | ref  | ROWS | Extra                                           |
+------+-------------+------------+--------+---------------+------+---------+------+------+-------------------------------------------------+
|    1 | PRIMARY     | <derived2> | system | NULL          | NULL | NULL    | NULL | 1    |                                                 |
|    1 | PRIMARY     | t1         | ALL    | PRIMARY       | NULL | NULL    | NULL | 1    |                                                 |
|    1 | PRIMARY     | t2         | ALL    | NULL          | NULL | NULL    | NULL | 3    | USING WHERE; USING JOIN buffer (flat, BNL JOIN) |
|    2 | DERIVED     | NULL       | NULL   | NULL          | NULL | NULL    | NULL | NULL | NO TABLES used                                  |
+------+-------------+------------+--------+---------------+------+---------+------+------+-------------------------------------------------+
4 ROWS IN SET (0.000 sec)
 
+----------+--------+--------+----+--------+
| str_num  | t1_seq | t2_seq | c1 | c2     |
+----------+--------+--------+----+--------+
| 00000001 |      1 |      1 | A  | T2-1-1 |
| 00000002 |      1 |      2 | A  | T2-1-2 |
| 00000003 |      1 |      3 | A  | T2-1-3 |
+----------+--------+--------+----+--------+
3 ROWS IN SET (0.002 sec)

STRAIGHT_JOIN:

SELECT STRAIGHT_JOIN LPAD(@rownum := @rownum + 1, 8, 0) AS str_num
     , t1.t1_seq
     , t2.t2_seq
     , t1.c1
     , t2.c2
  FROM t1 
       INNER JOIN t2 ON (t1.t1_seq = t2.t1_seq)
       CROSS JOIN ( SELECT @rownum := 0 ) X
; 
 
+------+-------------+------------+------+---------------+------+---------+------+------+-------------------------------------------------+
| id   | select_type | TABLE      | TYPE | possible_keys | KEY  | key_len | ref  | ROWS | Extra                                           |
+------+-------------+------------+------+---------------+------+---------+------+------+-------------------------------------------------+
|    1 | PRIMARY     | t1         | ALL  | PRIMARY       | NULL | NULL    | NULL | 1    |                                                 |
|    1 | PRIMARY     | t2         | ALL  | NULL          | NULL | NULL    | NULL | 3    | USING WHERE; USING JOIN buffer (flat, BNL JOIN) |
|    1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL | NULL    | NULL | 1    | USING JOIN buffer (incremental, BNL JOIN)       |
|    2 | DERIVED     | NULL       | NULL | NULL          | NULL | NULL    | NULL | NULL | NO TABLES used                                  |
+------+-------------+------------+------+---------------+------+---------+------+------+-------------------------------------------------+
4 ROWS IN SET (0.000 sec)
 
MariaDB [lololdb]> 
 
+----------+--------+--------+----+--------+
| str_num  | t1_seq | t2_seq | c1 | c2     |
+----------+--------+--------+----+--------+
| 00000001 |      1 |      1 | A  | T2-1-1 |
| 00000002 |      1 |      2 | A  | T2-1-2 |
| 00000003 |      1 |      3 | A  | T2-1-3 |
| 00000004 |      1 |      1 | A  | T2-1-1 |
| 00000005 |      1 |      2 | A  | T2-1-2 |
| 00000006 |      1 |      3 | A  | T2-1-3 |
+----------+--------+--------+----+--------+
6 ROWS IN SET (0.001 sec)
 
MariaDB [testdb]> 



 Comments   
Comment by Alice Sherepa [ 2023-04-03 ]

Thank you for the report! I repeated as described on 10.4-10.11, Myisam/InnoDB:

MariaDB [test]> SELECT STRAIGHT_JOIN * FROM t1 JOIN t2 ON (t1.t1_seq = t2.t1_seq) JOIN  (SELECT @a := 0) x;
+--------+----+--------+--------+--------+---------+
| t1_seq | c1 | t2_seq | t1_seq | c2     | @a := 0 |
+--------+----+--------+--------+--------+---------+
|      1 | A  |      1 |      1 | T2-1-1 |       0 |
|      1 | A  |      2 |      1 | T2-1-2 |       0 |
|      1 | A  |      3 |      1 | T2-1-3 |       0 |
|      1 | A  |      1 |      1 | T2-1-1 |       0 |
|      1 | A  |      2 |      1 | T2-1-2 |       0 |
|      1 | A  |      3 |      1 | T2-1-3 |       0 |
+--------+----+--------+--------+--------+---------+
6 rows in set (0.001 sec)
 
MariaDB [test]> SELECT  * FROM t1 JOIN t2 ON (t1.t1_seq = t2.t1_seq) JOIN  (SELECT @a := 0) x;
+--------+----+--------+--------+--------+---------+
| t1_seq | c1 | t2_seq | t1_seq | c2     | @a := 0 |
+--------+----+--------+--------+--------+---------+
|      1 | A  |      1 |      1 | T2-1-1 |       0 |
|      1 | A  |      2 |      1 | T2-1-2 |       0 |
|      1 | A  |      3 |      1 | T2-1-3 |       0 |
+--------+----+--------+--------+--------+---------+
3 rows in set (0.001 sec)

Comment by Dave Gosselin [ 2024-01-11 ]

Any line number references below are with respect to 10.4 at git sha 88c46aba753c00.

The apparent root of this issue is that we attempt to fill a derived table more than once, but without clearing its rows, so it consequently has duplicate rows. It doesn't matter if we enable join cache or not, or if we enable condition pushdown for derived optimization or not; the results are the same. The queries on this ticket may be simplified to "SELECT STRAIGHT_JOIN c1 FROM t1 JOIN (SELECT @a := 0) x;" to observe the result of this problem. This query has a derived table for "(SELECT @a := 0) x" which is backed by a temporary table and populated during mysql_derived_fill. The fact that this derived table sets a session variable causes us to mark the corresponding LEX as UNCACHEABLE_SIDEEFFECT. This is important later. Below I've included two stack traces from the execution of "SELECT STRAIGHT_JOIN c1 FROM t1 JOIN (SELECT @a := 0) x;" to show that we enter mysql_derived_fill twice (for simplicity of stack traces, both join cache and pushdown condition are disabled). During the "First Time", unit->executed is false and unit->uncacheable is 4 (sideeffect) and all is OK: we populate the derived table. But during the "Second Time", unit->executed is true and unit->uncacheable is still 4, so we again populate the table. During today's SQL Processor call meeting, we discussed several possible fixes. Since mysql_derived_fill will, for UNCACHEABLE_DEPENDENT tables, drop all rows and repopulate, we thought it reasonable to relax the condition at line 1204. Rather than !unit->uncacheable, we consider !(unit->uncacheable & UNCACHEABLE_DEPENDENT) to be more correct because we only populate derived tables once unless they're dependent tables (subquery has fields from outer query). Indeed such a change passes all of our mtr tests, but would like an opinion from igor

First Time
#0 mysql_derived_fill (thd=0x7fff68000d90, lex=0x7fff68004bf8, derived=0x7fff680175d8) at /nvme/1/server/sql/sql_derived.cc:1204
#1 0x0000555555d7c216 in mysql_derived_optimize (thd=0x7fff68000d90, lex=0x7fff68004bf8, derived=0x7fff680175d8) at /nvme/1/server/sql/sql_derived.cc:1049
#2 0x0000555555d79bd1 in mysql_handle_single_derived (lex=0x7fff68004bf8, derived=0x7fff680175d8, phases=4) at /nvme/1/server/sql/sql_derived.cc:200
#3 0x0000555555f26bd0 in TABLE_LIST::handle_derived (this=0x7fff680175d8, lex=0x7fff68004bf8, phases=4) at /nvme/1/server/sql/table.cc:9090
#4 0x0000555555d96039 in LEX::handle_list_of_derived (this=0x7fff68004bf8, table_list=0x7fff68015f50, phases=4) at /nvme/1/server/sql/sql_lex.h:4455
#5 0x0000555555da2c96 in st_select_lex::handle_derived (this=0x7fff68015920, lex=0x7fff68004bf8, phases=4) at /nvme/1/server/sql/sql_lex.cc:4417
#6 0x0000555555e268e4 in JOIN::optimize_inner (this=0x7fff68018db0) at /nvme/1/server/sql/sql_select.cc:2229
#7 0x0000555555e24ad4 in JOIN::optimize (this=0x7fff68018db0) at /nvme/1/server/sql/sql_select.cc:1731
#8 0x0000555555e30459 in mysql_select (thd=0x7fff68000d90, tables=0x7fff68015f50, wild_num=0, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748610, result=0x7fff68018d88, unit=0x7fff68004cb8, select_lex=0x7fff68015920) at /nvme/1/server/sql/sql_select.cc:4832

Second Time
#0 mysql_derived_fill (thd=0x7fff68000d90, lex=0x7fff68004bf8, derived=0x7fff680175d8) at /nvme/1/server/sql/sql_derived.cc:1204
#1 0x0000555555d79bd1 in mysql_handle_single_derived (lex=0x7fff68004bf8, derived=0x7fff680175d8, phases=96) at /nvme/1/server/sql/sql_derived.cc:200
#2 0x0000555555e4a152 in st_join_table::preread_init (this=0x7fff68096918) at /nvme/1/server/sql/sql_select.cc:13886
#3 0x0000555555e5cb27 in sub_select (join=0x7fff68018db0, join_tab=0x7fff68096918, end_of_records=false) at /nvme/1/server/sql/sql_select.cc:20884
#4 0x0000555555e5d56e in evaluate_join_record (join=0x7fff68018db0, join_tab=0x7fff68096568, error=0) at /nvme/1/server/sql/sql_select.cc:21149
#5 0x0000555555e5cd79 in sub_select (join=0x7fff68018db0, join_tab=0x7fff68096568, end_of_records=false) at /nvme/1/server/sql/sql_select.cc:20922
#6 0x0000555555e5c051 in do_select (join=0x7fff68018db0, procedure=0x0) at /nvme/1/server/sql/sql_select.cc:20443
#7 0x0000555555e2fc4a in JOIN::exec_inner (this=0x7fff68018db0) at /nvme/1/server/sql/sql_select.cc:4625
#8 0x0000555555e2ed1e in JOIN::exec (this=0x7fff68018db0) at /nvme/1/server/sql/sql_select.cc:4407
#9 0x0000555555e304f2 in mysql_select (thd=0x7fff68000d90, tables=0x7fff68015f50, wild_num=0, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748610, result=0x7fff68018d88, unit=0x7fff68004cb8, select_lex=0x7fff68015920) at /nvme/1/server/sql/sql_select.cc:4846

Comment by Igor Babaev [ 2024-01-11 ]

Gosselin: yes, your suggested change would be appropriate.

Comment by Dave Gosselin [ 2024-01-11 ]

PR: https://github.com/MariaDB/server/pull/2999

Generated at Thu Feb 08 10:20:18 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.