[MDEV-4119] Remove duplicate ORDER BY/GROUP BY column references - Jira

Details

Type: Task
Status: Stalled (View Workflow)
Priority: Minor
Resolution: Unresolved
Fix Version/s: 10.1(EOL)
Component/s: None
Labels:
- optimizer

Sprint:
10.1.6-2, 10.1.8-1

Description

Currently the optimizer doesn't cannot take into account duplicate column references in the GROUP/ORDER BY clauses. As a result, queries like:

select a col1, a col2 from t1 group by col1, col2;

will not use an index on t1(a), even though the duplicate columns
do not change the meaning of the query.

This is a simple task to remove duplicates in the order/group lists.
Care should be taken of queries with the ROLLUP modifier.

There is already an implementation attached to this task. The implementation has to be verified/changed so that:

it takes care of queries with ROLLUP,
it is verified that the place where duplicate removal is done covers all cases when it can and should be done,
there is sufficient coverage with test cases.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

md641.diff
10 kB
2013-01-31 11:26

Activity

Ascending order - Click to sort in descending order

Michael Widenius added a comment - 2015-06-29 12:44 - edited

I added deleting of duplicate fields in GROUP BY and ORDER BY in 2014, so this part of the patch is irrelevant.

I did however notice that removing duplicates and constants was done too late for some optimizations to have effects, which resulted in not optimal query plans.

Those that was affected was:

SELECT SQL_BIG_RESULT col1 AS field1, col1 AS field2 FROM t1 GROUP BY field1, field2

This did an extra filesort

select col1 f1, col1 f2 from t1 group by f1, f2
select col1 f1, col1 f2 from t1 group by f1, 1+1;

These did not do 'Using index for group by'

Will now work on what can be easily fixed

Michael Widenius added a comment - 2015-06-29 12:44 - edited I added deleting of duplicate fields in GROUP BY and ORDER BY in 2014, so this part of the patch is irrelevant. I did however notice that removing duplicates and constants was done too late for some optimizations to have effects, which resulted in not optimal query plans. Those that was affected was: SELECT SQL_BIG_RESULT col1 AS field1, col1 AS field2 FROM t1 GROUP BY field1, field2 This did an extra filesort select col1 f1, col1 f2 from t1 group by f1, f2 select col1 f1, col1 f2 from t1 group by f1, 1+1; These did not do 'Using index for group by' Will now work on what can be easily fixed

Michael Widenius added a comment - 2015-07-02 17:18

Have now fixed the following problems:

SELECT on one table with group by will now remove duplicated fields and constants from GROUP BY before doing other optimizations.
Added better test if we are using filesort or not for EXPLAIN

At the same time I also fixed a bug when we where referring to a calculated field in HAVING:

Example:
select a, round(rand(100)*10) r2, sum(1) r1 from t1 group by a having r1>1 and r2<=2;

In some case, in the old code, r2 in having was calculated independent of r2 in select, which cause the select to display rows with r2 > 2.

Michael Widenius added a comment - 2015-07-02 17:18 Have now fixed the following problems: SELECT on one table with group by will now remove duplicated fields and constants from GROUP BY before doing other optimizations. Added better test if we are using filesort or not for EXPLAIN At the same time I also fixed a bug when we where referring to a calculated field in HAVING: Example: select a, round(rand(100)*10) r2, sum(1) r1 from t1 group by a having r1>1 and r2<=2; In some case, in the old code, r2 in having was calculated independent of r2 in select, which cause the select to display rows with r2 > 2.

Julien Fritsch added a comment - 2023-12-05 09:26

Automated message:
----------------------------
Since this issue has not been updated since 6 weeks, it's time to move it back to Stalled.

Julien Fritsch added a comment - 2023-12-05 09:26 Automated message: ---------------------------- Since this issue has not been updated since 6 weeks, it's time to move it back to Stalled.

People

Assignee:: Michael Widenius

Reporter:: Timour Katchaounov (Inactive)

Votes:: 2 Vote for this issue

Watchers:: 5 Start watching this issue

Dates

Created:: 2013-01-31 11:26

Updated:: 2023-12-05 20:42

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server

Details

Description

Attachments

Attachments

Activity

People

Dates

Git Integration