Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13744

Force index for group by is not always honored

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Not a Bug
    • 10.1(EOL), 10.2(EOL), 10.3(EOL)
    • N/A
    • Optimizer
    • None
    • tested on docker container

    Description

      MariaDB optimizer does not seem to honor the "force index for group by" index hint :

      To reproduce:

       
      drop table if exists t;
      create table t(id int auto_increment, ts timestamp not null, primary key(id,ts)) engine=InnoDB, stats_sample_pages = 20;
      insert into t(ts) values ('2017-01-01');
      insert into t(ts) values ('2017-01-01');
      replace into t(ts) select t1.ts from t t1, t t2, t t3, t t4, t t5, t t6, t t7, t t8, t t9, t t10, t t11, t t12, t t13, t t14, t t15;
      insert into t(id,ts) select distinct id, '2017-01-02' from t;
      select count(distinct id) from t;
      analyze table t;
      show indexes from t;
      explain select id, max(ts) from t group by id;
      explain select id, max(ts) from t force index for group by (primary) group by id;
      
      

      analyze format = json select id, max(ts) from t force index for group by (primary) group by

       
      {
        "query_block": {
      {
        "query_block": {
          "select_id": 1,
          "r_loops": 1,
          "r_total_time_ms": 28.534,
          "table": {
            "table_name": "t",
            "access_type": "index",
            "key": "PRIMARY",
            "key_length": "8",
            "used_key_parts": ["id", "ts"],
            "r_loops": 1,
            "rows": 66792,
            "r_rows": 65540,
            "r_total_time_ms": 9.2646,
            "filtered": 100,
            "r_filtered": 100,
            "using_index": true
          }
        }
      }
      
      

      Cardinality in show indexes is 66343 for column id and for column ts.

      select count(distinct id) from t; # --> 32770

      Related Oracle bug reports

      https://bugs.mysql.com/bug.php?id=60023
      https://bugs.mysql.com/bug.php?id=87670

      Attachments

        Activity

          MySQL 5.7 shows Using index for group-by with or without the hint:

          MySQL 5.7

          MySQL [test]> explain select id, max(ts) from t group by id;
          +----+-------------+-------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+
          | id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows  | filtered | Extra                    |
          +----+-------------+-------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+
          |  1 | SIMPLE      | t     | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 32178 |   100.00 | Using index for group-by |
          +----+-------------+-------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+
          1 row in set, 1 warning (0.00 sec)
          

          elenst Elena Stepanova added a comment - MySQL 5.7 shows Using index for group-by with or without the hint: MySQL 5.7 MySQL [test]> explain select id, max (ts) from t group by id; + ----+-------------+-------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | + ----+-------------+-------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+ | 1 | SIMPLE | t | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 32178 | 100.00 | Using index for group - by | + ----+-------------+-------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+ 1 row in set , 1 warning (0.00 sec)

          It actually does use index for GROUP BY. Reading data through index "PRIMARY" allows to read one GROUP BY group after another, and the optimizer is taking advantage of that.

          Compare:

          mysql> explain select id, max(ts) from t group by id\G
          *************************** 1. row ***************************
                     id: 1
            select_type: SIMPLE
                  table: t
                   type: index
          possible_keys: NULL
                    key: PRIMARY
                key_len: 8
                    ref: NULL
                   rows: 63681
                  Extra: Using index
          1 row in set (0.00 sec)
          

          Changing the GROUP BY clause so the optimizer doesn't know the index is usable:

          mysql> explain select id, max(ts) from t group by id+1\G
          *************************** 1. row ***************************
                     id: 1
            select_type: SIMPLE
                  table: t
                   type: index
          possible_keys: NULL
                    key: PRIMARY
                key_len: 8
                    ref: NULL
                   rows: 63681
                  Extra: Using index; Using temporary; Using filesort
          1 row in set (0.00 sec)
          

          psergei Sergei Petrunia added a comment - It actually does use index for GROUP BY. Reading data through index " PRIMARY " allows to read one GROUP BY group after another, and the optimizer is taking advantage of that. Compare: mysql> explain select id, max(ts) from t group by id\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: t type: index possible_keys: NULL key: PRIMARY key_len: 8 ref: NULL rows: 63681 Extra: Using index 1 row in set (0.00 sec) Changing the GROUP BY clause so the optimizer doesn't know the index is usable: mysql> explain select id, max(ts) from t group by id+1\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: t type: index possible_keys: NULL key: PRIMARY key_len: 8 ref: NULL rows: 63681 Extra: Using index; Using temporary; Using filesort 1 row in set (0.00 sec)

          Maybe, the actual question of this bug report was, why is this query not using Loose Scan to resolve the GROUP BY. This is a valid question. I suspect the cost calculations, but let me check that.

          psergei Sergei Petrunia added a comment - Maybe, the actual question of this bug report was, why is this query not using Loose Scan to resolve the GROUP BY. This is a valid question. I suspect the cost calculations, but let me check that.

          The report says

          Cardinality in show indexes is 66343 for column id and for column ts.

          select count(distinct id) from t; # --> 32770

          Well, if there are 32K GROUP BY groups with an average of 2 (TWO) rows per group, Loose Scan will not provide any speedup.

          psergei Sergei Petrunia added a comment - The report says Cardinality in show indexes is 66343 for column id and for column ts. select count(distinct id) from t; # --> 32770 Well, if there are 32K GROUP BY groups with an average of 2 (TWO) rows per group, Loose Scan will not provide any speedup.

          And btw if I get the GROUP BY group larger, it will pick Loose Scan:

          create table t2 (id int auto_increment, ts timestamp not null, primary key(id,ts)) engine=innodb;
          set @a=0;
           
          mysql> insert into t2 select (@a:=@a+1), '2010-01-01' from t limit 1000;
          Query OK, 1000 rows affected (0.01 sec)
          Records: 1000  Duplicates: 0  Warnings: 0
           
          mysql> insert into t2 select t2.id, date_add('2012-01-01', interval SIXTY_FOUR.id day) from t2, (select id from t2 order by id limit 64) as SIXTY_FOUR;
          Query OK, 64000 rows affected (0.28 sec)
          Records: 64000  Duplicates: 0  Warnings: 0
           
          analyze table t2;
          

          mysql>  explain select id, max(ts) from t2 group by id;
          +------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
          | id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                    |
          +------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
          |    1 | SIMPLE      | t2    | range | NULL          | PRIMARY | 4       | NULL | 1732 | Using index for group-by |
          +------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
          

          psergei Sergei Petrunia added a comment - And btw if I get the GROUP BY group larger, it will pick Loose Scan: create table t2 (id int auto_increment, ts timestamp not null, primary key(id,ts)) engine=innodb; set @a=0;   mysql> insert into t2 select (@a:=@a+1), '2010-01-01' from t limit 1000; Query OK, 1000 rows affected (0.01 sec) Records: 1000 Duplicates: 0 Warnings: 0   mysql> insert into t2 select t2.id, date_add('2012-01-01', interval SIXTY_FOUR.id day) from t2, (select id from t2 order by id limit 64) as SIXTY_FOUR; Query OK, 64000 rows affected (0.28 sec) Records: 64000 Duplicates: 0 Warnings: 0   analyze table t2; mysql> explain select id, max(ts) from t2 group by id; +------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+ | 1 | SIMPLE | t2 | range | NULL | PRIMARY | 4 | NULL | 1732 | Using index for group-by | +------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+

          Closing as Not a bug.
          Feel free to open if you still think something doesn't work right.
          (If you think we should have a hint to force Loose Scan, and actually have a use for it - please file a new MDEV with a feature request)

          psergei Sergei Petrunia added a comment - Closing as Not a bug. Feel free to open if you still think something doesn't work right. (If you think we should have a hint to force Loose Scan, and actually have a use for it - please file a new MDEV with a feature request)

          People

            psergei Sergei Petrunia
            Richard Richard Stracke
            Votes:
            2 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.