[MDEV-6127] Incorrect Right Join execution Created: 2014-04-17  Updated: 2017-11-02  Resolved: 2017-11-02

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 5.1.67, 5.2.14, 5.3.12, 5.5.36, 5.5.37, 10.0.10
Fix Version/s: 5.5.56, 10.0.34, 10.1.29

Type: Bug Priority: Major
Reporter: Konstantin Assignee: Oleksandr Byelkin
Resolution: Duplicate Votes: 0
Labels: upstream
Environment:

Does not matter


Sprint: 5.5.44, 5.5.58, 10.1.29

 Description   

Hello all.
I've got a strange behavior of the queries with outer right join.
First let's create some tables, fill them with data and create a function.

CREATE TABLE `test1` (
  `id` INT NOT NULL,
  PRIMARY KEY (`id`));
 
insert into test1 values(1);
insert into test1 values(2);
insert into test1 values(3);
insert into test1 values(9);

CREATE TABLE `test2` (
  `id` INT NOT NULL,
  PRIMARY KEY (`id`));
 
insert into test2 values(1);
insert into test2 values(7);
insert into test2 values(8);
insert into test2 values(9);

CREATE TABLE `test3` (
  `id` INT NOT NULL,
  PRIMARY KEY (`id`));
 
insert into test3 values(1);
insert into test3 values(2);

CREATE TABLE `test4` (
  `id` INT NOT NULL,
  PRIMARY KEY (`id`));
 
 
insert into test4 values(2);
insert into test4 values(88);

DELIMITER $$
 
CREATE FUNCTION `foo` (var1 int )
RETURNS INTEGER
BEGIN
	
RETURN coalesce((select min(id) from test4 where id = var1), 1);
END$$
 
DELIMITER ;

Now let's try the query:

select test1.*, test2.*, test3.*
from test1
right join test2 ON test1.id = test2.id
right join test3 ON test3.id = foo(test1.id);

What do I get:

'1', '1', '1'
'9', '9', '1'
NULL, NULL, '2'

What do I expect:

1,	1,	1
Null,	7,	1
Null,	8,	1
9,	9,	1
Null,	Null,	2

Now look at this query:

select test1.*, test2.*, test3.*, foo(coalesce(test1.id)),  foo(test1.id)
from test1
right join test2 ON test1.id = test2.id
right join test3 ON test3.id = foo(coalesce(test1.id));

What do I expect (without 2 last columns):

1,	1,	1
Null,	7,	1
Null,	8,	1
9,	9,	1
Null,	Null,	2

I think that "test1.id" and "coalesce(test1.id)" are similar, yes? This means that both of queries must return a similar resultset. But the second query returns exactly what I expect.

I think the reason is in query optimizer. It converts outer join to inner join because of using "test1.id" in ON clause. In this case it is not a correct behavior. It does not assume that the function used in this expression may work with nulls well.
But coalesce (and case expression also) hints to optimizer dont do this conversion.



 Comments   
Comment by Elena Stepanova [ 2014-04-21 ]

Hi Konstantin,
Thank you for the bug report.


Reproducible on all of MySQL 5.1 - MySQL 5.7, MariaDB 5.1 - MariaDB 10.0 that I tried.
Does not depend on the optimizer_switch.
PostgreSQL returns identical results for both queries.

Comment by Oleksandr Byelkin [ 2014-05-30 ]

Expanding the function to an expression also returns correct resunt.

EXPLAIN of both queries looks identical.

Comment by Oleksandr Byelkin [ 2015-04-16 ]

test1.id+0 does not work so it is not "protection of transformation by the function.

Comment by Oleksandr Byelkin [ 2015-04-16 ]

it looks like ability to be NULL fix the result:

select test1.*, test2.*, test3.*
from test1
right join test2 ON test1.id = test2.id
right join test3 ON test3.id = foo(if(test1.id,test1.id,NULL));

works

Comment by Oleksandr Byelkin [ 2015-04-17 ]

Difference between correct and incorrect execution for sure is inside SP, fere how result of execution of function differ (first is incorrect, second is correct):

T@5    : | | query: select test1.*, test2.*, test3.*
T@5    : | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | ZZZ: val 2
T@5    : | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | ZZZ: val 2
T@5    : | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | query: select test1.*, test2.*, test3.*
T@5    : | | | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | | | | | | | | | | | | | | ZZZ: val 1
T@5    : | | query: drop function foo

Comment by Oleksandr Byelkin [ 2015-04-17 ]

Probably the fact that test1.id can be NULL due to outer join is not taken into account

Comment by Oleksandr Byelkin [ 2015-04-24 ]

To avoid recreating test suite every time trees changed and cleaned:

CREATE TABLE `t1` (
  `id` INT NOT NULL,
  PRIMARY KEY (`id`));
 
insert into t1 values(1);
insert into t1 values(2);
insert into t1 values(3);
insert into t1 values(9);
CREATE TABLE `t2` (
  `id` INT NOT NULL,
  PRIMARY KEY (`id`));
 
insert into t2 values(1);
insert into t2 values(7);
insert into t2 values(8);
insert into t2 values(9);
CREATE TABLE `t3` (
  `id` INT NOT NULL,
  PRIMARY KEY (`id`));
 
insert into t3 values(1);
insert into t3 values(2);
CREATE TABLE `t4` (
  `id` INT NOT NULL,
  PRIMARY KEY (`id`));
 
 
insert into t4 values(2);
insert into t4 values(88);
DELIMITER |;
 
CREATE FUNCTION `f1` (var1 int )
RETURNS INTEGER
BEGIN
	
RETURN coalesce((select min(id) from t4 where id = var1), 1);
END|
 
DELIMITER ;|
 
select t1.*, t2.*, t3.*
from t1
right join t2 ON t1.id = t2.id
right join t3 ON t3.id = f1(t1.id);
select t1.*, t2.*, t3.*
from t1
right join t2 ON t1.id = t2.id
right join t3 ON t3.id = f1(coalesce(t1.id));
 
drop function f1;
drop tables t1,t2,t3,t4;

Comment by Oleksandr Byelkin [ 2015-06-09 ]

in SP select list turned some how to this
(gdb) p dbug_print_item(val_item)
$3 = 0x1569cc0 <dbug_item_print_buf> "min(NULL)"
(gdb)

was:
(gdb) p dbug_print_item(val_item)
$2 = 0x1569cc0 <dbug_item_print_buf> "min(`test`.`t4`.`id`)"

probably equality this equality worked in such strange way somehow:
WHERE:(after remove) 0x7fffe00369d8 multiple equal(var1@0, `test`.`t4`.`id`)

Comment by Oleksandr Byelkin [ 2015-06-09 ]

This looks suspicious taking into account that t4 is not constant:
T@5 : | | | | | | | | | | | | | | | | | | | | | error: Error: no matching row in const table
T@5 : | | | | | | | | | | | | | | | | | | | | <JOIN::optimize

Comment by Oleksandr Byelkin [ 2015-06-09 ]

"correct" execution somehow avoid make_join_statistics() where one table referenced by a key become "const table" (it looks like quite correct, incorrect is the variable (key) value).

Comment by Oleksandr Byelkin [ 2015-06-09 ]

opt_sum_query() is for blame

Comment by Oleksandr Byelkin [ 2015-06-09 ]

In first statement (3rd call in it) we somehow do not have NULL values for the key lookup

Comment by Oleksandr Byelkin [ 2017-11-02 ]

The problem was fixed by one of this commits:

There are only 'skip'ped commits left to test.
The first bad commit could be any of:
46977e0a01ae2118df08c3cccaf2f0eaaabd2715
cba2ac6ef10304047f7a7941e6c8912573ebc3c6
59fca5806af65c8379a993f9e604cb0b20a76e2b
Keine binäre Suche mehr möglich!

Generated at Thu Feb 08 07:09:33 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.