[MDEV-3335] LP:849776 - Wrong result with semijoin + "Impossible where" Created: 2011-09-14  Updated: 2015-02-02  Resolved: 2012-10-04

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Philip Stoev (Inactive) Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: Launchpad

Attachments: XML File LPexportBug849776.xml    

 Description   

The following query:

select * from t5 where (a) in (
SELECT t1.a FROM t1
LEFT JOIN t2 ON ( t1.a = t2.a ) WHERE t2.b NOT IN ( SELECT t4.b FROM t4 WHERE t4.b < t1.b ));

returns no rows when executed with semijoin, even though the correct result must be "8", since table t5 contains the result from the subquery.

mysql 5.5 also wrongly returns an empty result, though the plan there is different. Postgresql confirms that the correct result is "8".

explain:

1 PRIMARY NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables
3 DEPENDENT SUBQUERY NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables

minimal optimizer_switch: semijoin=on
full optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=off,derived_merge=off,derived_with_keys=off,firstmatch=off,loosescan=off,materialization=off,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=off,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

bzr version-info:

revision-id: <email address hidden>
date: 2011-09-10 18:01:27 +0300
build-date: 2011-09-14 11:28:19 +0300
revno: 3183
branch-nick: maria-5.3

test case:

CREATE TABLE t1 ( b varchar(1), a integer) ;
INSERT INTO t1 VALUES ('z',8);

CREATE TABLE t2 ( a integer, b varchar(1)) ;

CREATE TABLE t4 ( a integer, b varchar(1)) ;

CREATE TABLE t5 ( a integer) ;
INSERT INTO t5 VALUES (8);

set session optimizer_switch='semijoin=on';
select * from t5 where (a) in (
SELECT t1.a FROM t1
LEFT JOIN t2 ON ( t1.a = t2.a ) WHERE t2.b NOT IN ( SELECT t4.b FROM t4 WHERE t4.b < t1.b ));



 Comments   
Comment by Sergei Petrunia [ 2011-09-20 ]

Re: Wrong result with semijoin + "Impossible where"
The wrong result is produced because outer join is converted into inner, which is wrong. The rewritten subquery looks like this:

Message: select 8 AS `a` from `bug849776`.`t5` semi join (`bug849776`.`t1` join `bug849776`.`t2`) where 0

the outer-to-inner conversion is made by simplify_joins(). I don't yet understand why this happens.

Comment by Sergei Petrunia [ 2011-09-21 ]

Re: Wrong result with semijoin + "Impossible where"
(gdb) wher 4
#0 simplify_joins (join=0xa679d00, join_list=0xa65f188, conds=0xa679000, top=true, in_sj=true) at sql_select.cc:11892
#1 0x0835d5e8 in simplify_joins (join=0xa679d00, join_list=0xa678ce8, conds=0xa679000, top=true, in_sj=true) at sql_select.cc:11857
#2 0x0835d5e8 in simplify_joins (join=0xa679d00, join_list=0xa5fdd2c, conds=0xa679000, top=true, in_sj=false) at sql_select.cc:11857
#3 0x08371073 in JOIN::optimize (this=0xa679d00) at sql_select.cc:960
(More stack frames follow...)

(gdb) p (Item*)conds
$63 = (Item_cond_and *) 0xa679000
(gdb) p dbug_print_item(((Item*)conds)->list.nth_element(0))
$64 = 0x8b78000 "1"
(gdb) p dbug_print_item(((Item*)conds)->list.nth_element(1))
$65 = 0x8b78000 "(not(<in_optimizer>(`bug849776`.`t2`.`b`,<exists>(select `bug849776`.`t4`.`b` from `bug849776`.`t4` where (`bug849776`.`t4`.`b` < `bug849776`.`t1`.`b`)))))"
(gdb) p dbug_print_item(((Item*)conds)->list.nth_element(2))
$66 = 0x8b78000 "(`bug849776`.`t5`.`a` = `bug849776`.`t1`.`a`)"
(gdb) p dbug_print_item(((Item*)conds)->list.nth_element(3))
$67 = 0x88d9fc4 "(Item*)NULL"
(gdb) p/t conds->used_tables()
$68 = 111

That is, the condition conds has form:

1 AND t2.b NOT IN ( SELECT t4.b FROM t4 WHERE t4.b < t1.b ) AND t5.a=t1.a

not_null_tables() attribute seems to be wrong:

(gdb) p/t conds->not_null_tables()
$69 = 111

    1. All three tables are NOT-NULL, this is what causes the left join be
    2. converted into inner join

(gdb) p ((Item*)conds)>list.nth_element(0)>not_null_tables()
$71 = 0
(gdb) p ((Item*)conds)>list.nth_element(1)>not_null_tables()
$72 = 0
(gdb) p ((Item*)conds)>list.nth_element(2)>not_null_tables()
$73 = 3

  1. attributes of branches are correct. The problem is with Item_cond_and's
  2. attributes.
Comment by Sergei Petrunia [ 2011-09-21 ]

Re: Wrong result with semijoin + "Impossible where"
This is where the problem lies:

(gdb) wher 1
#0 Item_func::fix_after_pullout (this=0xa678838, new_parent=0xa5fdc68, ref=0xa65fc90) at item_func.cc:262
(More stack frames follow...)
(gdb) p this
$188 = (Item_in_optimizer *) 0xa678838
(gdb) list
257 (*arg)->fix_after_pullout(new_parent, arg);
258 Item *item= *arg;
259
260 used_tables_cache|= item->used_tables();
261 not_null_tables_cache|= item->not_null_tables();
262 const_item_cache&= item->const_item();
263 }
264 }
265 }
266
(gdb) p *arg
$189 = (Item_field *) 0xa65f200

Item_in_optimizer inherits fix_after_pullout() from Item_func::fix_after_pullout. The latter function assumes that NOT-null attributes of its arguments should be spread to itself, i.e.

argument->not_null_tables() & BIT ==> this->not_null_tables() & BIT

which is not true, particularly for IN that's inside NOT.

Comment by Rasmus Johansson (Inactive) [ 2011-12-13 ]

Launchpad bug id: 849776

Generated at Thu Feb 08 06:47:54 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.