[MDEV-20209] Using different index with same range gives different number of records Created: 2019-07-30  Updated: 2020-06-03  Resolved: 2020-06-03

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.1, 10.2, 10.3, 10.4, 10.5
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Varun Gupta (Inactive) Assignee: Varun Gupta (Inactive)
Resolution: Not a Bug Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-8306 Complete cost-based optimization for ... Stalled

 Description   

Dataset:

create table t0 (a int);
INSERT INTO t0 VALUES (0),(0),(0),(0),(2),(0),(0),(1),(1),(0);
 
CREATE TABLE t1 (
a int(11) DEFAULT NULL,
b int(11) DEFAULT NULL,
c int(11) DEFAULT NULL,
KEY a (a),
KEY a_b (a,b)
)engine=MYISAM;
insert into t1 select A.a , B.a, C.a,D.a from t0 A, t0 B, t0 C, t0 D;

The query is:
MariaDB [test]> explain select a,b from t1 where a=1 and c> 0 order by b;
+------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra       |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
|    1 | SIMPLE      | t1    | ref  | a,a_b         | a_b  | 5       | const | 1654 | Using where |
+------+-------------+-------+------+---------------+------+---------+-------+------+-------------+
1 row in set (0.002 sec)

So we have 2 indexes here on which we can use ref-access,
1) key a
2) key a_b
both support ref access on condition a=1
Now when i see the output of the estimates for both the index with the range optimizer

"range_scan_alternatives": [
                      {
                        "index": "a",
                        "ranges": ["(1) <= (a) <= (1)"],
                        "rowid_ordered": true,
                        "using_mrr": false,
                        "index_only": false,
                        "rows": 2363,
                        "cost": 3015.4,
                        "chosen": false,
                        "cause": "cost"
                      },
                      {
                        "index": "a_b",
                        "ranges": ["(1) <= (a) <= (1)"],
                        "rowid_ordered": false,
                        "using_mrr": false,
                        "index_only": false,
                        "rows": 1654,
                        "cost": 2128.1,
                        "chosen": false,
                        "cause": "cost"
                      }
                    ]

The estimates for key a and a_b are different while they have the same range.
The key a has expected rows 2363 while a_b has expected rows as 1654.
Isn't both the indexes expected to have the same estimates?

Actual count of records with a=1;

MariaDB [test]> select count(*) from t1 where a=1;
+----------+
| count(*) |
+----------+
|     2000 |
+----------+
1 row in set (0.018 sec)



 Comments   
Comment by Sergei Petrunia [ 2020-06-03 ]
  • This should not have category=optimizer, as the number is obtained from records_in_range() call.
  • records_in_range() is an estimate, so it is not necessarily precise. I think one can expect that error (i.e abs(estimate-real_value)) is different for different indexes.
  • In the above example, one estimate is 0.8*real_value, the other 1.18*real_value. The error is within reason.
Generated at Thu Feb 08 08:57:40 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.