[MDEV-8913] Derived queries with same column names as final projection causes issues when using Order By Created: 2015-10-07  Updated: 2015-10-30  Resolved: 2015-10-30

Status: Closed
Project: MariaDB Server
Component/s: Data Manipulation - Subquery, Optimizer
Affects Version/s: 10.0.21, 10.1.7, 5.3.13, 5.5, 10.0, 10.1
Fix Version/s: 10.0.23, 10.1.9

Type: Bug Priority: Major
Reporter: Ryan Griffith Assignee: Oleksandr Byelkin
Resolution: Fixed Votes: 0
Labels: derived, wrong_result

Attachments: JPEG File issue_example.jpg    
Sprint: 10.0.22, 10.1.9-1

 Description   

DDL

CREATE TABLE `fact_web_session_summary` (
	`id` INT(11) NOT NULL AUTO_INCREMENT,
	`date_id` INT(11) NOT NULL,
	`web_source_id` INT(11) NOT NULL,
	`page_views` INT(11) NOT NULL,
	`hits` INT(11) NOT NULL,
	`users` INT(11) NOT NULL,
	`new_users` INT(11) NOT NULL,
	`sessions` INT(11) NOT NULL,
	`bounces` INT(11) NOT NULL,
	`duration` INT(11) NOT NULL
	PRIMARY KEY (`id`)
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB;
 
CREATE TABLE `dim_web_source` (
	`id` INT(11) NOT NULL AUTO_INCREMENT,
	`source` VARCHAR(200) NOT NULL,
	`medium` VARCHAR(200) NOT NULL,
	PRIMARY KEY (`id`)
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
;
 
CREATE TABLE `dim_date` (
  `date_id` int(11) NOT NULL,
  `date` date DEFAULT NULL,
  PRIMARY KEY (`date_id`),
  UNIQUE KEY `date` (`date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
 

SQL Statement to Issue

SELECT
	`Project1`.`C7` AS `C1`, 
	`Project1`.`source`, 
	`Project1`.`C1` AS `C2`, 
	`Project1`.`C2` AS `C3`, 
	`Project1`.`C3` AS `C4`, 
	`Project1`.`C4` AS `C5`, 
	`Project1`.`C5` AS `C6`, 
	`Project1`.`C6` AS `C7`
FROM (
	SELECT
		`GroupBy1`.`A1` AS `C1`, 
		`GroupBy1`.`A2` AS `C2`, 
		`GroupBy1`.`A3` AS `C3`, 
		`GroupBy1`.`A4` AS `C4`, 
		`GroupBy1`.`A5` AS `C5`, 
		`GroupBy1`.`A6` AS `C6`, 
		`GroupBy1`.`K1` AS `source`, 
		1 AS `C7`
	FROM (
		SELECT
			`Extent3`.`source` AS `K1`, 
			SUM(`Extent1`.`sessions`) AS `A1`, 
			SUM(`Extent1`.`new_users`) AS `A2`, 
			SUM(`Extent1`.`page_views`) AS `A3`, 
			SUM(`Extent1`.`bounces`) AS `A4`, 
			SUM(`Extent1`.`duration`) AS `A5`, 
			SUM(`Extent1`.`hits`) AS `A6`
		FROM `fact_web_session_summary` AS `Extent1` 
		INNER JOIN `dim_date` AS `Extent2` ON `Extent1`.`date_id` = `Extent2`.`date_id` 
		INNER JOIN `dim_web_source` AS `Extent3` ON `Extent1`.`web_source_id` = `Extent3`.`id`
		WHERE
			`Extent2`.`date` >= '2015-01-01' AND
            `Extent2`.`date` <= '2015-10-01'
		GROUP BY `Extent3`.`source`
		) AS `GroupBy1`
        
    ) AS `Project1`
 ORDER BY `Project1`.`C1` DESC
 LIMIT 10

Synopsis

The final result ends up Ordering by the ALIASED `C1`, _NOT _`Project1`.`C1` due to both the derived table and the final alias having the exact same name despite the ORDER BY being fully qualified.

If you change the final alias in the outermost query from `C1` to anything else, such as `ABC`, there is no longer a shared column name between the derived table and the outer queries alias, and so the Order By then takes affect as it should



 Comments   
Comment by Elena Stepanova [ 2015-10-19 ]

rgriffith, thanks for the report.
As a workaround, you can set 'optimizer_switch=derived_merge=off' (or just use unique column aliases, as you said – it will make the query much more readable anyway).


Smaller test case:

create table t1 (field int);
insert into t1 values (10),(5),(3),(8),(20);
 
SELECT
  sq.f2 AS f1,
  sq.f1 AS f2
FROM (
  SELECT
    field AS f1, 
    1 AS f2
  FROM t1
) AS sq
ORDER BY sq.f1;
drop table t1;

Actual result

f1	f2
1	10
1	5
1	3
1	8
1	20

Expected result

f1	f2
1	3
1	5
1	8
1	10
1	20

Reproducible on MariaDB 5.3 and higher, and also on MySQL 5.7.
Not reproducible on MariaDB 5.2 and MySQL 5.5/5.6.

Comment by Ryan Griffith [ 2015-10-19 ]

Hi Elena,

Thank you for your comments. Unfortunately, I'm using an ORM (.Net - Entity Framework / MySQL Connector) which generates these statements. I do not have any control over the output. Additionally, the sole reason I moved to MariaDB is for better performance due of derived tables due to the significant use of them in the generated SQL from the Entity Framework / MySQL connector. Turning off derived_merge would invalidate the very reason I moved to MariaDB. I'm stuck at this point unless MariaDB is updated or the MySQL connector is updated to use different column names.

Comment by Elena Stepanova [ 2015-10-19 ]

I see. Hopefully it will be fixed soon. I assume you are on 10.0 now?

Comment by Ryan Griffith [ 2015-10-19 ]

I am on 10.1.7 right now.

Thank you!

Comment by Ryan Griffith [ 2015-10-19 ]

I am setup as a Oracle contributor. Given that this is exhibited in MySQL 5.7, should I report the bug there as well?

Comment by Elena Stepanova [ 2015-10-19 ]

Sure, please do report it, it's our normal practice to inform upstream about bugs that affect MySQL as well, even though in case of optimizer bugs we rarely use upstream fixes.

Comment by Oleksandr Byelkin [ 2015-10-20 ]

The problem is that in the SELECT list we overwrite Item name according to AS clause but tablename is left as it was, so ORDER BY really thinks it resolve correctly (and it does according to new names) but of course it is wrong...

Comment by Oleksandr Byelkin [ 2015-10-20 ]

TODO: check VIEW and normal tables

Comment by Oleksandr Byelkin [ 2015-10-27 ]

It is perfectly repeatable even with tables (and views):

create table t1 (field int);
insert into t1 values (10),(5),(3),(8),(20);
 
SELECT
  sq.f2 AS f1,
  sq.f1 AS f2
FROM (
  SELECT
    field AS f1, 
    1 AS f2
  FROM t1
) AS sq
ORDER BY sq.f1;
 
 
create view v1 as SELECT field AS f1, 1 AS f2 FROM t1;
 
SELECT
  sq.f2 AS f1,
  sq.f1 AS f2
FROM v1 AS sq
ORDER BY sq.f1;
 
drop view v1;
 
create table t2 SELECT field AS f1, 1 AS f2 FROM t1;
 
SELECT
  sq.f2 AS f1,
  sq.f1 AS f2
FROM t2 AS sq
ORDER BY sq.f1;
 
drop table t1, t2;

Comment by Oleksandr Byelkin [ 2015-10-27 ]

5.7 has the same problem.

Comment by Oleksandr Byelkin [ 2015-10-27 ]

revision-id: 772f7914db4fc6b293e438ad1c09dac59a239c5a (mariadb-10.0.21-41-g772f791)
parent(s): 9a3ff0789fbf6c0ea627415c90ae5487449f433b
committer: Oleksandr Byelkin
timestamp: 2015-10-27 11:17:52 +0100
message:

MDEV-8913 Derived queries with same column names as final projection causes issues when using Order By

find_item_in_list() now recognize view fields like a fields even if they rever to an expression.

Comment by Ryan Griffith [ 2015-10-27 ]

Thank you very much for your work on this. Where can I see the commit? I checked Github but did not see a branch that seemed like it would be the one I should look at. I'm curious as to the fix.

Thanks!

Comment by Oleksandr Byelkin [ 2015-10-27 ]

For now you can see it only in e-mail list ( http://lists.askmonty.org/pipermail/commits/2015-October/008536.html ). it will be pushed on github after code review.

Comment by Alexander Barkov [ 2015-10-29 ]

As discussed on IRC Sanja will additionally check why this query does not return errors:

SELECT 1 FROM (SELECT 1 as a) AS b HAVING (SELECT `SOME_GARBAGE`.b.a)=1; 

Comment by Oleksandr Byelkin [ 2015-10-29 ]

revision-id: 34f126ee9b7da4f277da4ddd39d5d927c4bce9db (mariadb-10.0.21-41-g34f126e)
parent(s): 9a3ff0789fbf6c0ea627415c90ae5487449f433b
committer: Oleksandr Byelkin
timestamp: 2015-10-29 15:56:51 +0100
message:

MDEV-8913 Derived queries with same column names as final projection causes issues when using Order By

find_item_in_list() now recognize view fields like a fields even if they rever to an expression.
The problem of schema name do not taken into account for field with it and
derived table fixed.
Duplicating code removed

Comment by Alexander Barkov [ 2015-10-30 ]

This patch looks Ok to push:
http://lists.askmonty.org/pipermail/commits/2015-October/008556.html
Thanks.

Generated at Thu Feb 08 07:30:44 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.