Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-29865

Optimizer chooses wrong path for big results

    XMLWordPrintable

Details

    Description

      Optimizer chooses wrong way to execute GROUP BY for big results. The difference is dramatic.

      The documentation states:

      "The optimizer automatically knows if the result is too big, but you can force the optimizer to use a temporary table with SQL_SMALL_RESULT, or avoid the temporary table using SQL_BIG_RESULT."

      SQL> SELECT SQL_BIG_RESULT data, id, ts, COUNT(*) FROM test GROUP BY data, id, ts;
      SQL> SELECT SQL_SMALL_RESULT /* which is the default */ data, id, ts, COUNT(*) FROM test GROUP BY data, id, ts;
      

      10.6.9

      # Query_time: 1.503094  Lock_time: 0.000179  Rows_sent: 1048578  Rows_examined: 2097156
      # Rows_affected: 0  Bytes_sent: 64202110
      SELECT SQL_BIG_RESULT data, id, ts, COUNT(*) FROM test GROUP BY data, id, ts;
       
      # Query_time: 14.653313  Lock_time: 0.000131  Rows_sent: 1048578  Rows_examined: 3145734
      # Rows_affected: 0  Bytes_sent: 64202110
      SELECT data, id, ts, COUNT(*) FROM test GROUP BY data, id, ts;
      

      10.10.1

      # Query_time: 1.535291  Lock_time: 0.000150  Rows_sent: 1048578  Rows_examined: 2097156
      # Rows_affected: 0  Bytes_sent: 64202110
      # Full_scan: Yes  Full_join: No  Tmp_table: No  Tmp_table_on_disk: No
      # Filesort: Yes  Filesort_on_disk: No  Merge_passes: 7  Priority_queue: No
      select sql_big_result data, id, ts, count(*) from test group by data, id, ts;
       
      # Query_time: 14.631351  Lock_time: 0.000150  Rows_sent: 1048578  Rows_examined: 3145734
      # Rows_affected: 0  Bytes_sent: 64202110
      # Tmp_tables: 2  Tmp_disk_tables: 1  Tmp_table_sizes: 106225664
      # Full_scan: Yes  Full_join: No  Tmp_table: Yes  Tmp_table_on_disk: Yes
      # Filesort: Yes  Filesort_on_disk: No  Merge_passes: 8  Priority_queue: No
      select data, id, ts, count(*) from test group by data, id, ts;
      

      It looks like the Optimizer is doing 50% more work in the default case...

      Test set is attached.

      For small results it seems OK:

      SQL> select sql_small_result data, ts, count(*) from test group by data, ts;
      12 rows in set (0.483 sec)
       
      SQL> select sql_big_result data, ts, count(*) from test group by data, ts;
      12 rows in set (1.159 sec)
      

      I see some huge potential here for customers with huge GROUP BY...

      Attachments

        Activity

          People

            Unassigned Unassigned
            oli Oli Sennhauser
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.