Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-38877

Unnecessary filesort on derived table materialization

    XMLWordPrintable

Details

    Description

      This was observed on TPC-H query Q18.

      An inherently grouped table gains an unrequired filesort, slowing down the query.

      setup

      create table t1 (
        groups_20 int NOT NULL,
        groups_20_2 int NOT NULL,
        b int,
        PRIMARY KEY (groups_20, groups_20_2)
      );
      insert into t1 select seq/1000, seq+1, seq from seq_1_to_20000;
       
      create table t2 (a int, b int, index(a));
      insert into t2 select seq, seq from seq_0_to_1000;
      analyze table t1, t2;
      

      MariaDB [test]> analyze select a, sum(b) from    (     select groups_20 from t1       group by groups_20       having count(*)  != 1000   ) DT    join t2 on a = groups_20 group by a;
      +------+-------------+------------+------+---------------+------+---------+-----------+-------+----------+----------+------------+----------------------------------------------+
      | id   | select_type | table      | type | possible_keys | key  | key_len | ref       | rows  | r_rows   | filtered | r_filtered | Extra                                        |
      +------+-------------+------------+------+---------------+------+---------+-----------+-------+----------+----------+------------+----------------------------------------------+
      |    1 | PRIMARY     | t2         | ALL  | a             | NULL | NULL    | NULL      | 1001  | 1001.00  |   100.00 |     100.00 | Using where; Using temporary; Using filesort |
      |    1 | PRIMARY     | <derived2> | ref  | key0          | key0 | 4       | test.t2.a | 1     | 0.00     |   100.00 |     100.00 |                                              |
      |    2 | DERIVED     | t1         | ALL  | PRIMARY       | NULL | NULL    | NULL      | 19735 | 20000.00 |   100.00 |     100.00 | Using temporary; Using filesort              |
      +------+-------------+------------+------+---------------+------+---------+-----------+-------+----------+----------+------------+----------------------------------------------+
      3 rows in set (0.917 sec)
      

      We see 2 filesorts, the 2nd one in our derived table.

      select groups_20 from t1       group by groups_20       having count(*)  != 1000
      

      but our derived table is inherently grouped by like this, we do not need to wrap the result set in a filesort.

      Attachments

        Activity

          People

            Unassigned Unassigned
            Johnston Rex Johnston
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.