[MDEV-389] Wrong result (missing row) with semijoin, join_cache_level>4, LEFT JOIN, GROUP BY, ORDER BY, LIMIT Created: 2012-07-11  Updated: 2013-05-04  Resolved: 2013-05-04

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 5.5.25, 5.3.7
Fix Version/s: 5.5.31

Type: Bug Priority: Minor
Reporter: Elena Stepanova Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: None


 Description   

The following query

SELECT a, COUNT(*) AS cnt 
FROM t1 LEFT JOIN t2 ON (d = b) 
WHERE a IN ( SELECT c FROM t2 WHERE b > 'k' ) 
GROUP BY a ORDER BY a, cnt LIMIT 2

on the test data returns 1 row if it's run with semijoin and join_cache_level > 4, and 2 rows otherwise. 2 rows is the correct result.

bzr version-info

revision-id: psergey@askmonty.org-20120705103901-9ero4x0eifk0wtks
date: 2012-07-05 14:39:01 +0400
build-date: 2012-07-11 01:54:42 +0400
revno: 3462

Also reproducible on maria/5.3 revno 3551, and on release builds 5.3.7 and 5.5.25.

Minimal optimizer_switch:

semijoin=on

Full optimizer_switch (default):

index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=off

EXPLAIN (with the default optimizer_switch):

id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	PRIMARY	t1	index	a	a	4	NULL	3	100.00	Using where; Using temporary; Using filesort
1	PRIMARY	t2	ref	c	c	5	test.t1.a	2	100.00	Using index; Start temporary; End temporary
1	PRIMARY	t2	ref	d	d	2	test.t1.b	2	100.00	Using where; Using index
Warnings:
Note	1276	Field or reference 'test.t1.b' of SELECT #2 was resolved in SELECT #1
Note	1003	select `test`.`t1`.`a` AS `a`,count(0) AS `cnt` from `test`.`t1` semi join (`test`.`t2`) left join `test`.`t2` on(((`test`.`t2`.`d` = `test`.`t1`.`b`) and (`test`.`t1`.`b` is not null))) where ((`test`.`t2`.`c` = `test`.`t1`.`a`) and (`test`.`t1`.`b` > 'k')) group by `test`.`t1`.`a` order by `test`.`t1`.`a`,count(0) limit 2

Test case:

 
SET join_cache_level = 5;
SET optimizer_switch = 'semijoin=on';
 
CREATE TABLE t1 (a INT NOT NULL, b CHAR(1), KEY(a)) ENGINE=MyISAM;
INSERT INTO t1 VALUES (4,'p'),(1,'q'),(8,'e');
 
CREATE TABLE t2 (c INT, d CHAR(1), KEY(c), KEY(d)) ENGINE=MyISAM;
INSERT INTO t2 VALUES (4,'f'),(2,'i'),(5,'h'),(3,'q'),(1,'g');
 
SELECT a, COUNT(*) AS cnt 
FROM t1 LEFT JOIN t2 ON (d = b) 
WHERE a IN ( SELECT c FROM t2 WHERE b > 'k' ) 
GROUP BY a ORDER BY a, cnt LIMIT 2;

Expected result:

a	cnt
1	1
4	1

Actual result:

a	cnt
1	1

Please note that the aggregate function is not required, it just makes the test somewhat more sensible. The query can be modified like this (double a in ORDER BY is not a typo, it is needed for this version of the query):

SELECT a 
FROM t1 LEFT JOIN t2 ON (d = b) 
WHERE a IN ( SELECT c FROM t2 WHERE b > 'k' ) 
GROUP BY a ORDER BY a, a LIMIT 2;

It still returns 1 row instead of 2.

Note: I've set it to Minor, because join_cache_level > 4 is not default



 Comments   
Comment by Patryk Pomykalski [ 2013-02-26 ]

Can't repeat with latest 5.5 (rev 3667)

Comment by Sergei Petrunia [ 2013-03-09 ]

In 5.3, the bug is removed by this revision:

revno: 3592 [merge]
revision-id: timour@askmonty.org-20121109081120-vdtnonh1u7wpl8ms
parent: igor@askmonty.org-20121102225916-g7s6hod36vdqckne
parent: timour@askmonty.org-20121108202605-r3z07g8atnaqkw9v
committer: timour@askmonty.org
branch nick: 5.3-merge-new
timestamp: Fri 2012-11-09 10:11:20 +0200
message:
Merge MariaDB 5.1.66 -> 5.2 -> 5.3

I am unable to tell which particular patch caused it to be fixed.

Comment by Sergei Petrunia [ 2013-03-09 ]

In 5.3, the change is because of a different query plan:

rev-3592:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY t1 ALL a NULL NULL NULL 3 Using where; Using temporary; Using filesort
1 PRIMARY t2 ref c c 5 test.t1.a 2 Using index; Start temporary; End temporary
1 PRIMARY t2 ref d d 2 test.t1.b 2 Using where; Using index

rev-3591:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY t1 index a a 4 NULL 3 Using where; Using temporary; Using filesort
1 PRIMARY t2 ref c c 5 test.t1.a 2 Using index; Start temporary; End temporary
1 PRIMARY t2 ref d d 2 test.t1.b 2 Using where; Using index

maybe, the bug is still there.

Comment by Sergei Petrunia [ 2013-05-04 ]

Not repeatable in the latest 5.5 (revno: 3742).

Comment by Sergei Petrunia [ 2013-05-04 ]

For now, added testcase. Feel free to re-open if the problem repeats again

Generated at Thu Feb 08 06:28:23 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.