[MDEV-3446] LP:901399 - Wrong result (extra row) with semijoin=ON, materialization=OFF, optimizer_prune_level=0 Created: 2011-12-07  Updated: 2015-02-02  Resolved: 2012-10-04

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: Launchpad

Attachments: XML File LPexportBug901399.xml    

 Description   

The following query

SELECT *
FROM t1 AS alias1, t1 AS alias2
WHERE ( alias1.c, alias2.c )
  IN (
      SELECT alias3.a, alias3.a
        FROM t2 AS alias3, t2 alias4
        WHERE alias3.b = alias4.b
     );

produces an extra row on test data if it's run with materialization=OFF and otherwise default optimizer_switch values, while it returns the correct result set with materialization=ON (or with semijoin=OFF, or with loosescan=OFF).

EXPLAIN with materialization=OFF (wrong result):

1 PRIMARY alias2 ALL NULL NULL NULL NULL 13 100.00
1 PRIMARY alias3 ALL a NULL NULL NULL 3 66.67 Using where; LooseScan
1 PRIMARY alias4 ALL NULL NULL NULL NULL 3 100.00 Using where; FirstMatch(alias3)
1 PRIMARY alias1 ALL NULL NULL NULL NULL 13 100.00 Using where; Using join buffer (flat, BNL join)

select `test`.`alias1`.`c` AS `c`,`test`.`alias2`.`c` AS `c` from `test`.`t1` `alias1` semi join (`test`.`t2` `alias3` join `test`.`t2` `alias4`) join `test`.`t1` `alias2` where ((`test`.`alias3`.`a` = `test`.`alias2`.`c`) and (`test`.`alias1`.`c` = `test`.`alias2`.`c`) and (`test`.`alias4`.`b` = `test`.`alias3`.`b`))

EXPLAIN with materialization=ON (correct result):

1 PRIMARY alias1 ALL NULL NULL NULL NULL 13 100.00
1 PRIMARY alias2 ALL NULL NULL NULL NULL 13 100.00 Using where; Using join buffer (flat, BNL join)
1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 10 func,func 1 100.00
2 MATERIALIZED alias3 ALL a NULL NULL NULL 3 100.00
2 MATERIALIZED alias4 ALL NULL NULL NULL NULL 3 100.00 Using where

select `test`.`alias1`.`c` AS `c`,`test`.`alias2`.`c` AS `c` from `test`.`t1` `alias1` semi join (`test`.`t2` `alias3` join `test`.`t2` `alias4`) join `test`.`t1` `alias2` where ((`test`.`alias2`.`c` = `test`.`alias1`.`c`) and (`test`.`alias4`.`b` = `test`.`alias3`.`b`))

Minimal optimizer_switch: materialization=off

Full optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=off,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

Reproducible with MyISAM and InnoDB, but not Aria.
Further reducing the test data eliminates the problem as the plan is not triggered anymore.

Test case:

SET optimizer_prune_level=0;
SET optimizer_switch = 'materialization=off';

CREATE TABLE t1 ( c INT ) ENGINE=MyISAM;
INSERT INTO t1 VALUES
  (0),(1),(2),(3),(4),(5),
  (6),(7),(8),(9),(10),(11),(12);
CREATE TABLE t2 ( a INT, b INT, KEY(a)) ENGINE=MyISAM;
INSERT INTO t2 VALUES (3,20),(2,21),(3,22);

SELECT *
FROM t1 AS alias1, t1 AS alias2
WHERE ( alias1.c, alias2.c )
  IN (
      SELECT alias3.a, alias3.a
        FROM t2 AS alias3, t2 alias4
        WHERE alias3.b = alias4.b
     );

  1. End of test case
  1. Notes on result:
  1. The correct result is 2 rows
  2. c c
  3. 2 2
  4. 3 3
  1. The test produces 3 rows
  2. c c
  3. 2 2
  4. 3 3
  5. 3 3

bzr version-info
revision-id: <email address hidden>
date: 2011-12-06 13:42:18 -0800
build-date: 2011-12-08 00:34:55 +0300
revno: 3334
branch-nick: maria-5.3

Also reproducible on revno 3315



 Comments   
Comment by Sergei Petrunia [ 2011-12-07 ]

Re: Wrong result (extra row) with semijoin=ON, materialization=OFF, optimizer_prune_level=0
The EXPLAIN for wrong result mentions use of join buffer, but the bug can be observed without join buffer, too:
MariaDB [j2]> set join_cache_level=0;
MariaDB [j2]> SELECT * FROM t1 AS alias1, t1 AS alias2 WHERE ( alias1.c, alias2.c ) IN ( SELECT alias3.a, alias3.a FROM t2 AS alias3, t2 alias4 WHERE alias3.b = alias4.b );
----------+

c c

----------+

2 2
3 3
3 3

----------+
3 rows in set (0.00 sec)

Comment by Sergei Petrunia [ 2011-12-08 ]

Re: Wrong result (extra row) with semijoin=ON, materialization=OFF, optimizer_prune_level=0
> EXPLAIN with materialization=OFF (wrong result):
>
> 1 PRIMARY alias2 ALL NULL NULL NULL NULL 13 100.00
> 1 PRIMARY alias3 ALL a NULL NULL NULL 3 66.67 Using where; LooseScan
> 1 PRIMARY alias4 ALL NULL NULL NULL NULL 3 100.00 Using where; FirstMatch(alias3)
> 1 PRIMARY alias1 ALL NULL NULL NULL NULL 13 100.00 Using where; Using join buffer (flat, BNL join)

Here one can see the problem: table alias3 uses 'ALL' scan and LooseScan. This
is incorrect: LooseScan needs index-ordered read.

Comment by Sergei Petrunia [ 2011-12-08 ]

Re: Wrong result (extra row) with semijoin=ON, materialization=OFF, optimizer_prune_level=0
> EXPLAIN with materialization=OFF (wrong result):
>
> 1 PRIMARY alias2 ALL NULL NULL NULL NULL 13 100.00
> 1 PRIMARY alias3 ALL a NULL NULL NULL 3 66.67 Using where; LooseScan
> 1 PRIMARY alias4 ALL NULL NULL NULL NULL 3 100.00 Using where; FirstMatch(alias3)
> 1 PRIMARY alias1 ALL NULL NULL NULL NULL 13 100.00 Using where; Using join buffer (flat, BNL join)

Here one can see the problem: table alias3 uses 'ALL' scan and LooseScan. This
is incorrect: LooseScan needs index-ordered read.

Comment by Elena Stepanova [ 2012-03-17 ]

Re: Wrong result (extra row) with semijoin=ON, materialization=OFF, optimizer_prune_level=0
Fix released with 5.3.3-rc.

Comment by Rasmus Johansson (Inactive) [ 2012-03-18 ]

Launchpad bug id: 901399

Generated at Thu Feb 08 06:48:41 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.