Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25926

Avoid full table scan in subqueries with GROUP BY and window functions on indexed columns

    XMLWordPrintable

Details

    Description

      Putting conditions on the result of window functions requires nested queries. This unfortunately always causes a full table scan. But when the window function is applied to a properly indexed or ordered data set, it would be theoretically possible abort the scan under some conditions. The same applies to GROUP BY in subqueries.

      CREATE TABLE `test` (
       `uid` int(11) NOT NULL AUTO_INCREMENT,
       `wert` int(11) NOT NULL,
       PRIMARY KEY (`uid`)
      );
      

      INSERT INTO test (wert) VALUES (5),(4),(3),(2),(1)
      

      ANALYZE SELECT uid FROM test GROUP BY uid LIMIT 3 
      

      Here we obtain r_rows=3 because the GROUP is applied to an indexed column. The table scan can be aborted as soon as the LIMIT of 3 is reached.

      But such optimizations do not occur on subqueries.

      ANALYZE SELECT * FROM
      (
          SELECT uid FROM test GROUP BY uid
      ) tabelle LIMIT 3
      

      Result: r_rows=5.

      The same hold true for window functions. In case of window functions, such nested queries are highly relevant, because conditions on the result of window functions can only be applied in an outer query.

      ANALYZE SELECT * FROM
      (
          SELECT uid,DENSE_RANK() OVER (ORDER BY uid) AS rank FROM test
      ) tabelle WHERE rank<=3
      

      Result: r_rows=5. This result is the same when we replace "rank<=3" by "LIMIT 3".

      The lack of optimization can cause low performance in applications such as pagination where the ranked/paginated field is not unique. E.g. I have a list of values 1,1,2,2,...,2,3,3,...,3,3,3,4,5,... and want to fetch the top rows until having n unique numbers and then abort the table scan.

      Attachments

        Activity

          People

            Unassigned Unassigned
            abdab Andreas Becker
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.