[MDEV-5056] Wrong result (extra rows) with materialization+semijoin, IN subqueries Created: 2013-09-23  Updated: 2013-11-13  Resolved: 2013-11-13

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.4, 5.3.12, 5.5.33a
Fix Version/s: 5.5.34, 10.0.6

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates

 Description   

The problem appeared on 5.3 tree with the following revision (long ago):

revno: 3491 [merge]
revision-id: psergey@askmonty.org-20120404083746-oq412y6dys2yepw8
committer: Sergey Petrunya <psergey@askmonty.org>
timestamp: Wed 2012-04-04 12:37:46 +0400
message:
  Merge
    ------------------------------------------------------------
    revno: 3488.1.2
    revision-id: psergey@askmonty.org-20120404082636-pscsfp2vak8h0ekc
    committer: Sergey Petrunya <psergey@askmonty.org>
    timestamp: Wed 2012-04-04 12:26:36 +0400
    message:
      BUG#913030: better comments and function names.
    ------------------------------------------------------------
    revno: 3488.1.1 [merge]
    revision-id: psergey@askmonty.org-20120402174731-hzq5kw3strd0e8rv
    committer: Sergey Petrunya <psergey@askmonty.org>
    timestamp: Mon 2012-04-02 21:47:31 +0400
    message:
      Merge
        ------------------------------------------------------------
        revno: 3479.1.1
        revision-id: psergey@askmonty.org-20120402174154-8y0lzcwc0qycoj3n
        committer: Sergey Petrunya <psergey@askmonty.org>
        timestamp: Mon 2012-04-02 21:41:54 +0400
        message:
          BUG#913030: Optimizer chooses a suboptimal excution plan for Q18 from DBT-3
          - When doing join optimization, pre-sort the tables so that they mimic the execution
            order we've had with 'semijoin=off'.
          - That way, we will not get regressions when there are two query plans (the old and the
            new) that have indentical costs but different execution times (because of factors that
            the optimizer was not able to take into account).

Test case:

SET optimizer_switch = 'materialization=on,semijoin=on';
SET join_cache_level = 2;
 
CREATE TABLE t1 ( c1 VARCHAR(2), c2 VARCHAR(2), INDEX(c1) ) ENGINE=MyISAM;
INSERT INTO t1 VALUES 
('JP','OM'),('VA','JP'),('CA','ML'),('ML','EG'),('DK','CA'),
('DK','QA'),('YE','PL'),('TR','ZW'),('DK','SK'),('SK','DK'),
('RO','ML'),('ML','BG'),('BG','ZW'),('ZW','GE'),('GE','JP'),
('PL','EG'),('QA','YE'),('WF','DK'),('DK','JP'),('EG','OM');
 
CREATE TABLE t2 ( c3 VARCHAR(2), c4 VARCHAR(2) ) ENGINE=MyISAM;
INSERT INTO t2 VALUES ('CA','ML'),('IN','HU'),('HU','IN');
 
SELECT * FROM t1 AS alias1, t1 AS alias2 
WHERE ( alias2.c2, alias1.c1 ) IN ( SELECT c4, c3 FROM t2 ) AND alias1.c1 IN ( SELECT c2 FROM t1 );

Actual result:

c1	c2	c1	c2
JP	OM	CA	ML
CA	ML	CA	ML
ML	EG	CA	ML
DK	CA	CA	ML
DK	QA	CA	ML
YE	PL	CA	ML
DK	SK	CA	ML
SK	DK	CA	ML
ML	BG	CA	ML
BG	ZW	CA	ML
ZW	GE	CA	ML
GE	JP	CA	ML
PL	EG	CA	ML
QA	YE	CA	ML
DK	JP	CA	ML
EG	OM	CA	ML
JP	OM	RO	ML
CA	ML	RO	ML
ML	EG	RO	ML
DK	CA	RO	ML
DK	QA	RO	ML
YE	PL	RO	ML
DK	SK	RO	ML
SK	DK	RO	ML
ML	BG	RO	ML
BG	ZW	RO	ML
ZW	GE	RO	ML
GE	JP	RO	ML
PL	EG	RO	ML
QA	YE	RO	ML
DK	JP	RO	ML
EG	OM	RO	ML

Expected result:

c1	c2	c1	c2
CA	ML	CA	ML
CA	ML	RO	ML



 Comments   
Comment by Elena Stepanova [ 2013-09-23 ]

Set to Major because it's a not-very-complicated test case with a wrong result, but it's not as urgent as other matters, it's been there for long time.

Comment by Igor Babaev [ 2013-10-19 ]

The following patch fixes the reporting test case without breaking any other tests from the mysql test suite:

=== modified file 'sql/sql_select.cc'
— sql/sql_select.cc 2013-10-11 12:57:19 +0000
+++ sql/sql_select.cc 2013-10-19 03:28:19 +0000
@@ -11969,7 +11969,9 @@ Item *eliminate_item_equal(COND *cond, C
In other cases, get the "head" item, which is either first of the
equals on top level, or the constant.
*/

  • Item *head_item= (!item_const && current_sjm)? current_sjm_head: head;
    + Item *head_item= (!item_const && current_sjm &&
    + current_sjm_head != field_item) ?
    + current_sjm_head: head;
    Item *head_real_item= head_item->real_item();
    if (head_real_item->type() == Item::FIELD_ITEM)
    head_item= head_real_item;

Sergey,
What do you think about this patch?

Comment by Sergei Petrunia [ 2013-11-13 ]

The patch is correct. I've made the fix from it.

Generated at Thu Feb 08 07:01:19 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.