Details
-
Bug
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
11.4.0, 11.0(EOL), 11.1(EOL), 11.2(EOL), 11.3(EOL)
-
Linux
Description
I compared the "explain" output of two select count(*) from table, identical tables, one mariadb and the other "mysql 8.3", and they are different. The tables have one single field, varchar(10) and it's the primary key. I need to understand the difference:
Mariadb:
explain extended select count(*) from goodnumber force index(primary);
|
|
+------+-------------+------------+------+---------------+------+---------+------+-----------+----------+-------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
|
+------+-------------+------------+------+---------------+------+---------+------+-----------+----------+-------+
|
| 1 | SIMPLE | goodnumber | ALL | NULL | NULL | NULL | NULL | 315011080 | 100.00 | |
|
+------+-------------+------------+------+---------------+------+---------+------+-----------+----------+-------+,
|
Mysql, explain select count(*) from goodnumber;
|
+----+-------------+------------+------------+-------+---------------+---------+---------+------+-----------+----------+-------------+
|
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
|
+----+-------------+------------+------------+-------+---------------+---------+---------+------+-----------+----------+-------------+
|
| 1 | SIMPLE | goodnumber | NULL | index | NULL | PRIMARY | 42 | NULL | 315054340 | 100.00 | Using index |
|
+----+-------------+------------+------------+-------+---------------+---------+---------+------+-----------+----------+-------------+.
|
The fastest result is the latter, like 10 times. Does it mean that the commercial version of MySQL by Oracle has a better optimizer and it counts the records on a table about 10 times faster?
Attachments
Issue Links
- relates to
-
MDEV-5004 Support parallel read transactions on the same snapshot
-
- Open
-
-
MDEV-6096 Ideas about parallel query execution
-
- Open
-
-
MDEV-34125 ANALYZE FORMAT=JSON: r_engine_stats.pages_read_time_ms has wrong scale
-
- Closed
-
I wonder if we should default to using the PRIMARY KEY or clustered index for COUNT( * ) queries on InnoDB tables. If a secondary index is chosen where there is a lot of history available (say, a column that is covered by that index has been frequently updated, or there have been massive DELETE in the table), then the MVCC access to the secondary index records could be extremely slow until MDEV-17598 has been implemented. I covered this in a presentation some time ago: https://mariadb.org/fest2022/how-innodb-multi-version-concurrency-control-mvcc-works/
Evaluating COUNT by a table scan can be more costly if the records are wide (only a few clustered index records per leaf page), but the worst-case performance should be rather limited, unless the read view is very old and many old versions of records need to be retrieved. In secondary indexes, the MVCC overhead may be paid for any read view. It is a kind of death spiral to performance, because as noted in
MDEV-33213, resolving the purge lag will require the MVCC checks to be executed in secondary indexes.