  1. MariaDB Server
  2. MDEV-4166

Hash based aggregation for large input, small output GROUP BY

Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix

    Description

      The current sort-based GROUP BY implementation performs badly when the source data is large (much larger than available RAM) and the output is small, because the whole input has to be sorted even though only a few groups result. An alternative method, supported by Oracle, PostgreSQL, SQL Server and others, is hash aggregation, as described here: http://blogs.msdn.com/b/craigfr/archive/2006/09/20/hash-aggregate.aspx
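
The difference between the two strategies can be sketched in a few lines of Python. This is only an illustration of the general technique, not MariaDB's (or any other server's) implementation; the function names and the sample rows are invented for the example, and the aggregate is assumed to be a simple SUM.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def hash_aggregate(rows):
    """Single pass over the input; memory grows with the number of groups."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    return dict(totals)


def sort_aggregate(rows):
    """Sort the whole input by key first, then sum each run of equal keys."""
    ordered = sorted(rows, key=itemgetter(0))
    return {
        key: sum(value for _, value in run)
        for key, run in groupby(ordered, key=itemgetter(0))
    }


# Hypothetical input: many rows, but only three distinct group keys.
rows = [(i % 3, 1) for i in range(9_000)]
assert hash_aggregate(rows) == sort_aggregate(rows)
```

In the hash variant only one running total per group is kept, so a large input with few groups never has to be sorted or spilled to disk. The sort variant must order every input row before it can emit the first group.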

      Because the optimizer may not have sufficient statistics to choose between hash and sort aggregation, an optimizer hint should be available for use in SELECT statements.


          People

            Assignee: Unassigned
            Reporter: Mathew Johnston (mjohnston)
            Votes: 0
            Watchers: 3
