[MDEV-25926] Avoid full table scan in subqueries with GROUP BY and window functions on indexed columns - Jira

XML

Word

Printable

Details

Type: Task
Status: Open (View Workflow)
Priority: Major
Resolution: Unresolved
Fix Version/s: None
Component/s: Data Manipulation - Subquery, Optimizer - Window functions
Labels:
None

Description

Putting conditions on the result of window functions requires nested queries. This unfortunately always causes a full table scan. But when the window function is applied to a properly indexed or ordered data set, it would be theoretically possible abort the scan under some conditions. The same applies to GROUP BY in subqueries.

CREATE TABLE `test` (

 `uid` int(11) NOT NULL AUTO_INCREMENT,

 `wert` int(11) NOT NULL,

 PRIMARY KEY (`uid`)

);

INSERT INTO test (wert) VALUES (5),(4),(3),(2),(1)

ANALYZE SELECT uid FROM test GROUP BY uid LIMIT 3

Here we obtain r_rows=3 because the GROUP is applied to an indexed column. The table scan can be aborted as soon as the LIMIT of 3 is reached.

But such optimizations do not occur on subqueries.

ANALYZE SELECT * FROM

    SELECT uid FROM test GROUP BY uid

) tabelle LIMIT 3

Result: r_rows=5.

The same hold true for window functions. In case of window functions, such nested queries are highly relevant, because conditions on the result of window functions can only be applied in an outer query.

ANALYZE SELECT * FROM

    SELECT uid,DENSE_RANK() OVER (ORDER BY uid) AS rank FROM test

) tabelle WHERE rank<=3

Result: r_rows=5. This result is the same when we replace "rank<=3" by "LIMIT 3".

The lack of optimization can cause low performance in applications such as pagination where the ranked/paginated field is not unique. E.g. I have a list of values 1,1,2,2,...,2,3,3,...,3,3,3,4,5,... and want to fetch the top rows until having n unique numbers and then abort the table scan.

Attachments

Activity

People

Assignee:: Unassigned

Reporter:: Andreas Becker

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Dates

Created:: 2021-06-15 10:27

Updated:: 2021-06-15 10:27

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.