[MDEV-2608] LP:795530 - Wrong result with subquery semijoin materialization and outer join Created: 2011-06-10  Updated: 2015-04-16  Resolved: 2012-10-04

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 5.3.12
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Timour Katchaounov (Inactive) Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: Launchpad

Attachments: XML File LPexportBug795530.xml    

 Description   

The following query over the 'world' database incorrectly produces wrong result
of 1 row instead of an empty result:

create database world;
use world;
source mysql-test/include/world_schema.inc
source mysql-test/include/world.inc

set @@optimizer_switch='materialization=on,semijoin=on';

MariaDB [world]> select * from CountryLanguage where (Language, Country) IN (SELECT City.Name, Country.Code FROM City LEFT JOIN Country ON (Country = Code and Code > 'LLL'));
---------------------------

Country Language Percentage

---------------------------

KEN Meru 5.5

---------------------------
1 row in set (0.22 sec)

set @@optimizer_switch='materialization=on,semijoin=off';
=> empty result;



 Comments   
Comment by Timour Katchaounov (Inactive) [ 2011-06-10 ]

Re: Wrong result with subquery semijoin materialization and outer join
In addition, running the above query with semijoin materialization is twice slower
than with non-semijoin materialization.

set @@optimizer_switch='materialization=on,semijoin=off';

MariaDB [world]> select * from CountryLanguage where (Language, Country) IN (SELECT City.Name, Country.Code FROM City LEFT JOIN Country ON (Country = Code and Code > 'LLL'));

Empty set (0.10 sec)

Notice 0.1 sec vs 0.22 sec, averaged over several executions.

Comment by Sergei Petrunia [ 2011-06-11 ]

Re: Wrong result with subquery semijoin materialization and outer join
A smaller testcase:

create table t1 (a int);
create table t2 (a int, b char(10));

insert into t1 values (1),(2);
insert into t2 values (1, 'one'), (3, 'three');

create table t3 (b char(10));
insert into t3 values('three'),( 'four');
insert into t3 values('three'),( 'four');
insert into t3 values('three'),( 'four');
insert into t3 values('three'),( 'four');

MariaDB [j48]> set optimizer_switch='semijoin=off';
Query OK, 0 rows affected (0.00 sec)

MariaDB [j48]> select * from t3 where t3.b in (select t2.b from t1 left join t2 on t1.a=t2.a);
Empty set (0.01 sec)

MariaDB [j48]> set optimizer_switch='semijoin=on';
Query OK, 0 rows affected (0.00 sec)

MariaDB [j48]> select * from t3 where t3.b in (select t2.b from t1 left join t2 on t1.a=t2.a);
-------

b

-------

three
three
three
three

-------
4 rows in set (0.00 sec)

Comment by Sergei Petrunia [ 2011-06-11 ]

Re: Wrong result with subquery semijoin materialization and outer join
The problem seems to be in SJ-Merged materialization. For some reason, end_sj_materialize() fails to see that it is writing columns of NULL-comlemented rows, and writes some (previous?) non-NULL data instead.

Comment by Sergei Petrunia [ 2011-07-06 ]

Re: Wrong result with subquery semijoin materialization and outer join
The wrong query result problem has been fixed, and the fix has been pushed.

Comment by Sergei Petrunia [ 2011-07-06 ]

Re: Wrong result with subquery semijoin materialization and outer join
As for performance:

The difference between performance of semi-join and non-semi-join materialization plans is because
non-semijoin materialization does materialization+lookup (its only strategy):

----------------------------------------------------------------------------------------------------------+

id select_type table type possible_keys key key_len ref rows Extra

----------------------------------------------------------------------------------------------------------+

1 PRIMARY CountryLanguage ALL NULL NULL NULL NULL 984 Using where
2 SUBQUERY City ALL NULL NULL NULL NULL 4079  
2 SUBQUERY Country eq_ref PRIMARY PRIMARY 3 j4.City.Country 1 Using where; Using index

----------------------------------------------------------------------------------------------------------+

while semi-join materialization does materialization+scan

-----------------------------------------------------------------------------------------------------------------------+

id select_type table type possible_keys key key_len ref rows Extra

-----------------------------------------------------------------------------------------------------------------------+

1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 4079  
1 PRIMARY CountryLanguage eq_ref PRIMARY PRIMARY 33 j4.Country.Code,j4.City.Name 1 Using index condition
2 SUBQUERY City ALL NULL NULL NULL NULL 4079  
2 SUBQUERY Country eq_ref PRIMARY PRIMARY 3 j4.City.Country 1 Using where; Using index

-----------------------------------------------------------------------------------------------------------------------+

Comment by Sergei Petrunia [ 2011-07-07 ]

Re: Wrong result with subquery semijoin materialization and outer join
I'm getting the following average query times (in seconds):
semi-join materialization, mean: 0.1042105
non-semi-join materialzation, mean: 0.1852632

Comment by Sergei Petrunia [ 2011-07-07 ]

Re: Wrong result with subquery semijoin materialization and outer join
The slowdown will be addressed separately here: https://bugs.launchpad.net/maria/+bug/806894

Comment by Rasmus Johansson (Inactive) [ 2011-12-13 ]

Launchpad bug id: 795530

Generated at Thu Feb 08 06:43:00 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.