[MDEV-4166] Hash based aggregation for large input, small output GROUP BY - Jira

Details

Type: Task
Status: Closed
Priority: Major
Resolution: Won't Fix
Fix Version/s: None
Component/s: None
Labels: None

Description

The current sort-based GROUP BY implementation performs poorly when the source data is large (much larger than available RAM) and the output is small. An alternative method, supported by Oracle, PostgreSQL, SQL Server, and others, is hash aggregation, as described here: http://blogs.msdn.com/b/craigfr/archive/2006/09/20/hash-aggregate.aspx
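To illustrate the difference, here is a minimal Python sketch of the two strategies for a query of the form SELECT key, SUM(value) ... GROUP BY key. It is not MariaDB code; the function names `hash_aggregate` and `sort_aggregate` and the sample rows are hypothetical, and the in-memory `sorted()` call only stands in for the external (on-disk) sort a server would need when the input exceeds RAM.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def hash_aggregate(rows):
    """Sum values per key in a single pass over the input.

    Memory use grows with the number of distinct keys (the output size),
    not with the number of input rows, so `rows` can be a streaming iterator.
    """
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    return dict(totals)


def sort_aggregate(rows):
    """Sum values per key by sorting the whole input first.

    Every input row has to be materialized and ordered before the first
    group can be emitted; this is the cost the ticket describes.
    """
    ordered = sorted(rows, key=itemgetter(0))
    return {
        key: sum(value for _, value in group)
        for key, group in groupby(ordered, key=itemgetter(0))
    }


# Hypothetical sample data: (group key, value) pairs.
rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
```

Both functions return the same totals per key; only the hash variant can consume the input as a stream, for example `hash_aggregate(iter(rows))`, while keeping just one entry per distinct key in memory.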

Because the optimizer may not have sufficient statistics to choose between hash and sort aggregation, an optimizer hint should be available for use in SELECT statements.

People

Assignee: Unassigned

Reporter: Mathew Johnston

Votes: 0

Watchers: 3

Dates

Due: 2013-02-26

Created: 2013-02-12 03:35

Updated: 2013-09-29 17:45

Resolved: 2013-09-29 17:45

MariaDB Server