Details
-
Task
-
Status: Stalled (View Workflow)
-
Minor
-
Resolution: Unresolved
-
None
-
10.1.6-2, 10.1.8-1
Description
Currently the optimizer doesn't cannot take into account duplicate column references in the GROUP/ORDER BY clauses. As a result, queries like:
select a col1, a col2 from t1 group by col1, col2; |
will not use an index on t1(a), even though the duplicate columns
do not change the meaning of the query.
This is a simple task to remove duplicates in the order/group lists.
Care should be taken of queries with the ROLLUP modifier.
There is already an implementation attached to this task. The implementation has to be verified/changed so that:
- it takes care of queries with ROLLUP,
- it is verified that the place where duplicate removal is done covers all cases when it can and should be done,
- there is sufficient coverage with test cases.
I added deleting of duplicate fields in GROUP BY and ORDER BY in 2014, so this part of the patch is irrelevant.
I did however notice that removing duplicates and constants was done too late for some optimizations to have effects, which resulted in not optimal query plans.
Those that was affected was:
SELECT SQL_BIG_RESULT col1 AS field1, col1 AS field2 FROM t1 GROUP BY field1, field2
select col1 f1, col1 f2 from t1 group by f1, f2
select col1 f1, col1 f2 from t1 group by f1, 1+1;
Will now work on what can be easily fixed