Uploaded image for project: 'MariaDB ColumnStore'
  1. MariaDB ColumnStore
  2. MCOL-6123

Improve performance of rewritten DISTINCT queries

    XMLWordPrintable

Details

    • New Feature
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • None
    • None
    • MDB Plugin
    • None

    Description

      As for now, DISTINCT queries are rewritten using rule-based optimizer in the following way:

      SELECT DISTINCT col1, col2, ..., colN
      FROM <from>
      WHERE <where clause>
      HAVING <having clause>
      ORDER BY ordCol1, ordCol2, ..., ordColN
      

      becomes

      SELECT col1, col2, ..., colN
      FROM
        (
          SELECT DISTINCT col1, col2, ..., colN, ordCol1, ordCol2, ..., ordColN
          FROM <from>
          WHERE <where clause>
          HAVING <having clause>
        ) $added_sub_0
      GROUP BY 1, 2, ..., N
      ORDER BY `$added_sub_o`.ordCol1, ..., `$added_sub_0`.ordColN
      

      All ORDER BY columns of the original query that are not in the result set are added to resultset, so we can to use it in the ORDER BY of the outer aggregate query.
      Unfortunately, this means that every ORDER BY column will be calculated for every row in the inner query, which is suboptimal.

      Looks like a proper way to handle ORDER BY columns is to put all simple columns involved in the ORDER BY expressions into the result set of the inner query, and to use a full functional/arithmetic expression in the ORDER BY of the outer query. In this way, expressions will be calculated only once per aggregated row. This also applies to the HAVING clause that should be moved to the outer query.

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              alexey.antipovsky Alexey Antipovsky
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.