[MDEV-25926] Avoid full table scan in subqueries with GROUP BY and window functions on indexed columns Created: 2021-06-15  Updated: 2021-06-15

Status: Open
Project: MariaDB Server
Component/s: Data Manipulation - Subquery, Optimizer - Window functions
Fix Version/s: None

Type: Task Priority: Major
Reporter: Andreas Becker Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None


 Description   

Putting conditions on the result of window functions requires nested queries. This unfortunately always causes a full table scan. But when the window function is applied to a properly indexed or ordered data set, it would be theoretically possible abort the scan under some conditions. The same applies to GROUP BY in subqueries.

CREATE TABLE `test` (
 `uid` int(11) NOT NULL AUTO_INCREMENT,
 `wert` int(11) NOT NULL,
 PRIMARY KEY (`uid`)
);

INSERT INTO test (wert) VALUES (5),(4),(3),(2),(1)

ANALYZE SELECT uid FROM test GROUP BY uid LIMIT 3 

Here we obtain r_rows=3 because the GROUP is applied to an indexed column. The table scan can be aborted as soon as the LIMIT of 3 is reached.

But such optimizations do not occur on subqueries.

ANALYZE SELECT * FROM
(
    SELECT uid FROM test GROUP BY uid
) tabelle LIMIT 3

Result: r_rows=5.

The same hold true for window functions. In case of window functions, such nested queries are highly relevant, because conditions on the result of window functions can only be applied in an outer query.

ANALYZE SELECT * FROM
(
    SELECT uid,DENSE_RANK() OVER (ORDER BY uid) AS rank FROM test
) tabelle WHERE rank<=3

Result: r_rows=5. This result is the same when we replace "rank<=3" by "LIMIT 3".

The lack of optimization can cause low performance in applications such as pagination where the ranked/paginated field is not unique. E.g. I have a list of values 1,1,2,2,...,2,3,3,...,3,3,3,4,5,... and want to fetch the top rows until having n unique numbers and then abort the table scan.


Generated at Thu Feb 08 09:41:27 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.