[MDEV-25926] Avoid full table scan in subqueries with GROUP BY and window functions on indexed columns Created: 2021-06-15 Updated: 2021-06-15 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Data Manipulation - Subquery, Optimizer - Window functions |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major |
| Reporter: | Andreas Becker | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Description |
|
Putting conditions on the result of window functions requires nested queries. This unfortunately always causes a full table scan. But when the window function is applied to a properly indexed or ordered data set, it would be theoretically possible abort the scan under some conditions. The same applies to GROUP BY in subqueries.
Here we obtain r_rows=3 because the GROUP is applied to an indexed column. The table scan can be aborted as soon as the LIMIT of 3 is reached. But such optimizations do not occur on subqueries.
Result: r_rows=5. The same hold true for window functions. In case of window functions, such nested queries are highly relevant, because conditions on the result of window functions can only be applied in an outer query.
Result: r_rows=5. This result is the same when we replace "rank<=3" by "LIMIT 3". The lack of optimization can cause low performance in applications such as pagination where the ranked/paginated field is not unique. E.g. I have a list of values 1,1,2,2,...,2,3,3,...,3,3,3,4,5,... and want to fetch the top rows until having n unique numbers and then abort the table scan. |