Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-3873

Wrong result (extra rows) with NOT IN and a subquery from a MERGE view

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.0.0, 5.5.28, 5.3.10
    • 10.0.1, 5.5.29, 5.3.11
    • None
    • None

    Description

      The test case below returns different results when a query is run with a MERGE view comparing to a TEMPTABLE view or a base table. I believe that the TEMPTABLE/base table result is correct, even although the result with the MERGE view might look more "natural".

      The difference seems fresh, it appeared in maria/5.3 tree since revno 3592.

      Test case:

      CREATE TABLE t1 (a INT NOT NULL) ENGINE=MyISAM;
      INSERT INTO t1 VALUES (4),(7),(0);
       
      CREATE TABLE t2 (b INT NOT NULL) ENGINE=MyISAM;
      INSERT INTO t2 VALUES (1),(2);
       
      CREATE TABLE t3 (c INT NOT NULL) ENGINE=MyISAM;
      INSERT INTO t3 VALUES (4),(6),(3);
       
      CREATE TABLE t4 (d INT NOT NULL) ENGINE=MyISAM;
      INSERT INTO t4 VALUES (4),(5),(3);
       
      CREATE TABLE tv (e INT NOT NULL) ENGINE=MyISAM;
      INSERT INTO tv VALUES (1),(3);
       
      CREATE ALGORITHM=TEMPTABLE VIEW v_temptable AS SELECT * FROM tv;
      CREATE ALGORITHM=MERGE VIEW v_merge AS SELECT * FROM tv;
       
      SELECT * FROM t1, t2 
      WHERE a NOT IN ( SELECT e FROM t3 LEFT JOIN v_temptable ON (c = e) WHERE c <> b ) AND a < b;
       
      SELECT * FROM t1, t2 
      WHERE a NOT IN ( SELECT e FROM t3 LEFT JOIN v_merge ON (c = e) WHERE c <> b ) AND a < b;

      Results from maria/5.3 revno 3599:

      WHERE a NOT IN ( SELECT e FROM t3 LEFT JOIN v_temptable ON (c = e) WHERE c <> b ) AND a < b;
      a	b
      SELECT * FROM t1, t2 
      WHERE a NOT IN ( SELECT e FROM t3 LEFT JOIN v_merge ON (c = e) WHERE c <> b ) AND a < b;
      a	b
      0	1
      0	2

      Default optimizer_switch:

      index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

      EXPLAINs (with the default optimizer switch):

      EXPLAIN EXTENDED
      SELECT * FROM t1, t2 
      WHERE a NOT IN ( SELECT e FROM t3 LEFT JOIN v_temptable ON (c = e) WHERE c <> b ) AND a < b;
      id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
      1	PRIMARY	t2	ALL	NULL	NULL	NULL	NULL	2	100.00	
      1	PRIMARY	t1	ALL	NULL	NULL	NULL	NULL	3	100.00	Using where; Using join buffer (flat, BNL join)
      2	DEPENDENT SUBQUERY	t3	ALL	NULL	NULL	NULL	NULL	3100.00	Using where
      2	DEPENDENT SUBQUERY	<derived3>	ref	key0	key0	4	test.t3.c	2	100.00	Using where
      3	DERIVED	tv	ALL	NULL	NULL	NULL	NULL	2	100.00	
      Warnings:
      Note	1276	Field or reference 'test.t2.b' of SELECT #2 was resolved in SELECT #1
      Note	1003	select `test`.`t1`.`a` AS `a`,`test`.`t2`.`b` AS `b` from `test`.`t1` join `test`.`t2` where ((not(<expr_cache><`test`.`t1`.`a`,`test`.`t2`.`b`>(<in_optimizer>(`test`.`t1`.`a`,<exists>(select `v_temptable`.`e` from `test`.`t3` left join `test`.`v_temptable` on((`v_temptable`.`e` = `test`.`t3`.`c`)) where ((`test`.`t3`.`c` <> `test`.`t2`.`b`) and ((<cache>(`test`.`t1`.`a`) = `v_temptable`.`e`) or isnull(`v_temptable`.`e`))) having <is_not_null_test>(`v_temptable`.`e`)))))) and (`test`.`t1`.`a` < `test`.`t2`.`b`))

      EXPLAIN EXTENDED
      SELECT * FROM t1, t2 
      WHERE a NOT IN ( SELECT e FROM t3 LEFT JOIN v_merge ON (c = e) WHERE c <> b ) AND a < b;
      id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
      1	PRIMARY	t2	ALL	NULL	NULL	NULL	NULL	2	100.00	
      1	PRIMARY	t1	ALL	NULL	NULL	NULL	NULL	3	100.00	Using where; Using join buffer (flat, BNL join)
      2	DEPENDENT SUBQUERY	t3	ALL	NULL	NULL	NULL	NULL	3100.00	Using where
      2	DEPENDENT SUBQUERY	tv	ALL	NULL	NULL	NULL	NULL	2100.00	Using where; Using join buffer (flat, BNL join)
      Warnings:
      Note	1276	Field or reference 'test.t2.b' of SELECT #2 was resolved in SELECT #1
      Note	1003	select `test`.`t1`.`a` AS `a`,`test`.`t2`.`b` AS `b` from `test`.`t1` join `test`.`t2` where ((not(<expr_cache><`test`.`t1`.`a`,`test`.`t2`.`b`>(<in_optimizer>(`test`.`t1`.`a`,<exists>(select `test`.`tv`.`e` from `test`.`t3` left join (`test`.`tv`) on((`test`.`tv`.`e` = `test`.`t3`.`c`)) where ((`test`.`t3`.`c` <> `test`.`t2`.`b`) and (<cache>(`test`.`t1`.`a`) = `test`.`tv`.`e`))))))) and (`test`.`t1`.`a` < `test`.`t2`.`b`))

      Attachments

        Issue Links

          Activity

            While the problem is not related to the change introduced in MDEV-3862, but it's going to affect testing, so it would be good to get it fixed.

            elenst Elena Stepanova added a comment - While the problem is not related to the change introduced in MDEV-3862 , but it's going to affect testing, so it would be good to get it fixed.

            Correct explain for substituted view should be like this:
            explain extended
            SELECT * FROM t1, t2
            WHERE a NOT IN ( SELECT e FROM t3 LEFT JOIN tv ON (c = e) WHERE c <> b ) AND a < b;
            id select_type table type possible_keys key key_len ref rows filtered Extra
            1 PRIMARY t2 ALL NULL NULL NULL NULL 2 100.00
            1 PRIMARY t1 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer (flat, BNL join)
            2 DEPENDENT SUBQUERY t3 ALL NULL NULL NULL NULL 3100.00 Using where
            2 DEPENDENT SUBQUERY tv ALL NULL NULL NULL NULL 2100.00 Using where; Using join buffer (flat, BNL join)
            Warnings:
            Note 1276 Field or reference 'test.t2.b' of SELECT #2 was resolved in SELECT #1
            Note 1003 select `test`.`t1`.`a` AS `a`,`test`.`t2`.`b` AS `b` from `test`.`t1` join `test`.`t2` where ((not(<expr_cache><`test`.`t1`.`a`,`test`.`t2`.`b`>(<in_optimizer>(`test`.`t1`.`a`,<exists>(select `test`.`tv`.`e` from `test`.`t3` left join `test`.`tv` on((`test`.`tv`.`e` = `test`.`t3`.`c`)) where ((`test`.`t3`.`c` <> `test`.`t2`.`b`) and ((<cache>(`test`.`t1`.`a`) = `test`.`tv`.`e`) or isnull(`test`.`tv`.`e`))) having <is_not_null_test>(`test`.`tv`.`e`)))))) and (`test`.`t1`.`a` < `test`.`t2`.`b`))

            i.e. possibility of NULLs in left join processed incorrectly (was not detected).

            sanja Oleksandr Byelkin added a comment - Correct explain for substituted view should be like this: explain extended SELECT * FROM t1, t2 WHERE a NOT IN ( SELECT e FROM t3 LEFT JOIN tv ON (c = e) WHERE c <> b ) AND a < b; id select_type table type possible_keys key key_len ref rows filtered Extra 1 PRIMARY t2 ALL NULL NULL NULL NULL 2 100.00 1 PRIMARY t1 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer (flat, BNL join) 2 DEPENDENT SUBQUERY t3 ALL NULL NULL NULL NULL 3100.00 Using where 2 DEPENDENT SUBQUERY tv ALL NULL NULL NULL NULL 2100.00 Using where; Using join buffer (flat, BNL join) Warnings: Note 1276 Field or reference 'test.t2.b' of SELECT #2 was resolved in SELECT #1 Note 1003 select `test`.`t1`.`a` AS `a`,`test`.`t2`.`b` AS `b` from `test`.`t1` join `test`.`t2` where ((not(<expr_cache><`test`.`t1`.`a`,`test`.`t2`.`b`>(<in_optimizer>(`test`.`t1`.`a`,<exists>(select `test`.`tv`.`e` from `test`.`t3` left join `test`.`tv` on((`test`.`tv`.`e` = `test`.`t3`.`c`)) where ((`test`.`t3`.`c` <> `test`.`t2`.`b`) and ((<cache>(`test`.`t1`.`a`) = `test`.`tv`.`e`) or isnull(`test`.`tv`.`e`))) having <is_not_null_test>(`test`.`tv`.`e`)))))) and (`test`.`t1`.`a` < `test`.`t2`.`b`)) i.e. possibility of NULLs in left join processed incorrectly (was not detected).

            In case of view TABLE_LIST::outer_join of merged table (tv in the example) is not set, so TABLE::maybe_null al;so is not set in setup_table_map()

            sanja Oleksandr Byelkin added a comment - In case of view TABLE_LIST::outer_join of merged table (tv in the example) is not set, so TABLE::maybe_null al;so is not set in setup_table_map()

            The question is: "Why merge algorithm for view/derived does not set embedding?"

            sanja Oleksandr Byelkin added a comment - The question is: "Why merge algorithm for view/derived does not set embedding?"

            Apparently the merge algorithm should have set embedding. I would say this is a bug if it does not do it.

            igor Igor Babaev (Inactive) added a comment - Apparently the merge algorithm should have set embedding. I would say this is a bug if it does not do it.

            the problem is that TABLE_LIST::wrap_into_nested_join sets embedding after setup_table_map where embedding is checked

            sanja Oleksandr Byelkin added a comment - the problem is that TABLE_LIST::wrap_into_nested_join sets embedding after setup_table_map where embedding is checked

            I have to talk With Igor about it.

            sanja Oleksandr Byelkin added a comment - I have to talk With Igor about it.

            According to the advice I'll try to re-cqalculate TABLE::maybe_null after TABLE_LIST::wrap_into_nested_join calls...

            sanja Oleksandr Byelkin added a comment - According to the advice I'll try to re-cqalculate TABLE::maybe_null after TABLE_LIST::wrap_into_nested_join calls...

            mysql_derived_merge() sets table->maybe_null but it is too late...

            sanja Oleksandr Byelkin added a comment - mysql_derived_merge() sets table->maybe_null but it is too late...

            just for note, derived works also incorrectly:
            SELECT * FROM t1, t2
            WHERE a NOT IN ( SELECT e FROM t3 LEFT JOIN (SELECT * FROM tv) as derived ON (c = e) WHERE c <> b ) AND a < b;
            a b
            0 1
            0 2

            sanja Oleksandr Byelkin added a comment - just for note, derived works also incorrectly: SELECT * FROM t1, t2 WHERE a NOT IN ( SELECT e FROM t3 LEFT JOIN (SELECT * FROM tv) as derived ON (c = e) WHERE c <> b ) AND a < b; a b 0 1 0 2

            Commited for review.

            sanja Oleksandr Byelkin added a comment - Commited for review.

            Pushed fix for 5.5.

            sanja Oleksandr Byelkin added a comment - Pushed fix for 5.5.

            Fixed in 5.5 (typo & test suite)

            sanja Oleksandr Byelkin added a comment - Fixed in 5.5 (typo & test suite)

            only 10.0 left waiting to merge

            sanja Oleksandr Byelkin added a comment - only 10.0 left waiting to merge

            People

              sanja Oleksandr Byelkin
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.