[MDEV-5149] using index for group-by slower than using index scan if many unique values in index Created: 2013-10-17  Updated: 2014-06-06

Status: Open
Project: MariaDB Server
Component/s: None
Affects Version/s: 5.5.33
Fix Version/s: 5.5

Type: Bug Priority: Minor
Reporter: Tobias Asplund Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: optimizer
Environment:

CentOS 6.4



 Description   

MariaDB [_]> SHOW CREATE TABLE t1;
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                     |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t1    | CREATE TABLE `t1` (
  `a` bigint(20) unsigned NOT NULL,
  `b` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`a`),
  KEY `b` (`b`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
 
MariaDB [_]> SELECT COUNT(*), COUNT(DISTINCT b) FROM t1;
+----------+-------------------+
| COUNT(*) | COUNT(DISTINCT b) |
+----------+-------------------+
|  2267612 |           1974979 |
+----------+-------------------+
1 row in set (1.64 sec)
 
MariaDB [_]> SELECT COUNT(DISTINCT b) FROM t1;
+-------------------+
| COUNT(DISTINCT b) |
+-------------------+
|           1974979 |
+-------------------+
1 row in set (3.73 sec)
 
MariaDB [_]> EXPLAIN SELECT COUNT(DISTINCT b) FROM t1;
+------+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra                    |
+------+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
|    1 | SIMPLE      | t1    | range | NULL          | b    | 9       | NULL | 2267951 | Using index for group-by |
+------+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
1 row in set (0.00 sec)
 
MariaDB [_]> SELECT COUNT(DISTINCT b+1) FROM t1;
+---------------------+
| COUNT(DISTINCT b+1) |
+---------------------+
|             1974979 |
+---------------------+
1 row in set (1.68 sec)
 
MariaDB [_]> EXPLAIN SELECT COUNT(DISTINCT b+1) FROM t1;
+------+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra       |
+------+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
|    1 | SIMPLE      | t1    | index | NULL          | b    | 9       | NULL | 2267950 | Using index |
+------+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)

If the number of unique values is high, it seems a full index scan is faster than one using group-by.
Maybe if the cardinality is high, using index for group-by shouldn't be used if the cpu cost is high to do the looser scan.



 Comments   
Comment by Elena Stepanova [ 2013-10-18 ]

Assigned to Timour (for analysis and commenting) arbitrarily, please reassign if needed.

Generated at Thu Feb 08 07:02:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.