Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5149

using index for group-by slower than using index scan if many unique values in index

    XMLWordPrintable

Details

    • Bug
    • Status: Open (View Workflow)
    • Minor
    • Resolution: Unresolved
    • 5.5.33
    • 5.5(EOL)
    • None
    • CentOS 6.4

    Description

      MariaDB [_]> SHOW CREATE TABLE t1;
      +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | Table | Create Table                                                                                                                                                     |
      +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | t1    | CREATE TABLE `t1` (
        `a` bigint(20) unsigned NOT NULL,
        `b` bigint(20) DEFAULT NULL,
        PRIMARY KEY (`a`),
        KEY `b` (`b`)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
      +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      1 row in set (0.00 sec)
       
      MariaDB [_]> SELECT COUNT(*), COUNT(DISTINCT b) FROM t1;
      +----------+-------------------+
      | COUNT(*) | COUNT(DISTINCT b) |
      +----------+-------------------+
      |  2267612 |           1974979 |
      +----------+-------------------+
      1 row in set (1.64 sec)
       
      MariaDB [_]> SELECT COUNT(DISTINCT b) FROM t1;
      +-------------------+
      | COUNT(DISTINCT b) |
      +-------------------+
      |           1974979 |
      +-------------------+
      1 row in set (3.73 sec)
       
      MariaDB [_]> EXPLAIN SELECT COUNT(DISTINCT b) FROM t1;
      +------+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra                    |
      +------+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
      |    1 | SIMPLE      | t1    | range | NULL          | b    | 9       | NULL | 2267951 | Using index for group-by |
      +------+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
      1 row in set (0.00 sec)
       
      MariaDB [_]> SELECT COUNT(DISTINCT b+1) FROM t1;
      +---------------------+
      | COUNT(DISTINCT b+1) |
      +---------------------+
      |             1974979 |
      +---------------------+
      1 row in set (1.68 sec)
       
      MariaDB [_]> EXPLAIN SELECT COUNT(DISTINCT b+1) FROM t1;
      +------+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra       |
      +------+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
      |    1 | SIMPLE      | t1    | index | NULL          | b    | 9       | NULL | 2267950 | Using index |
      +------+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
      1 row in set (0.00 sec)

      If the number of unique values is high, it seems a full index scan is faster than one using group-by.
      Maybe if the cardinality is high, using index for group-by shouldn't be used if the cpu cost is high to do the looser scan.

      Attachments

        Activity

          People

            Unassigned Unassigned
            flupps Tobias Asplund
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.