[MDEV-29865] Optimizer chooses wrong path for big results - Jira

XML

Word

Printable

Details

Type: Bug
Status: Open (View Workflow)
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.6.9, 10.10.1
Fix Version/s: None
Component/s: Optimizer
Labels:
Environment:
Linux, n.a.

Description

Optimizer chooses wrong way to execute GROUP BY for big results. The difference is dramatic.

The documentation states:

"The optimizer automatically knows if the result is too big, but you can force the optimizer to use a temporary table with SQL_SMALL_RESULT, or avoid the temporary table using SQL_BIG_RESULT."

SQL> SELECT SQL_BIG_RESULT data, id, ts, COUNT(*) FROM test GROUP BY data, id, ts;

SQL> SELECT SQL_SMALL_RESULT /* which is the default */ data, id, ts, COUNT(*) FROM test GROUP BY data, id, ts;

10.6.9

# Query_time: 1.503094  Lock_time: 0.000179  Rows_sent: 1048578  Rows_examined: 2097156

# Rows_affected: 0  Bytes_sent: 64202110

SELECT SQL_BIG_RESULT data, id, ts, COUNT(*) FROM test GROUP BY data, id, ts;

# Query_time: 14.653313  Lock_time: 0.000131  Rows_sent: 1048578  Rows_examined: 3145734

# Rows_affected: 0  Bytes_sent: 64202110

SELECT data, id, ts, COUNT(*) FROM test GROUP BY data, id, ts;

10.10.1

# Query_time: 1.535291  Lock_time: 0.000150  Rows_sent: 1048578  Rows_examined: 2097156

# Rows_affected: 0  Bytes_sent: 64202110

# Full_scan: Yes  Full_join: No  Tmp_table: No  Tmp_table_on_disk: No

# Filesort: Yes  Filesort_on_disk: No  Merge_passes: 7  Priority_queue: No

select sql_big_result data, id, ts, count(*) from test group by data, id, ts;

# Query_time: 14.631351  Lock_time: 0.000150  Rows_sent: 1048578  Rows_examined: 3145734

# Rows_affected: 0  Bytes_sent: 64202110

# Tmp_tables: 2  Tmp_disk_tables: 1  Tmp_table_sizes: 106225664

# Full_scan: Yes  Full_join: No  Tmp_table: Yes  Tmp_table_on_disk: Yes

# Filesort: Yes  Filesort_on_disk: No  Merge_passes: 8  Priority_queue: No

select data, id, ts, count(*) from test group by data, id, ts;

It looks like the Optimizer is doing 50% more work in the default case...

Test set is attached.

For small results it seems OK:

SQL> select sql_small_result data, ts, count(*) from test group by data, ts;

12 rows in set (0.483 sec)

SQL> select sql_big_result data, ts, count(*) from test group by data, ts;

12 rows in set (1.159 sec)

I see some huge potential here for customers with huge GROUP BY...

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

test_dump.sql.gz
2.52 MB
2022-10-24 15:38

Activity

People

Assignee:: Unassigned

Reporter:: Oli Sennhauser

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 2022-10-24 15:35

Updated:: 2022-10-26 19:56

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.