A testcase showing good query plan after this fix and bad query plan before:
Based on the MTR testcase for this MDEV, but with table t2 added:
create table t1 (
  pk int,
  key1 int,
  key2 int,
  filler char(100),
  index (key1, pk),
  index (key2),
  primary key (pk)
);

insert into t1
select
  seq, FLOOR(seq/100), seq, 'filler'
from
  seq_1_to_1000;
analyze table t1;

create table t2 (a int, b int, key(a));
insert into t2 select mod(seq, 1000), seq from seq_1_to_100000;
analyze table t2;
|
After the fix:
MariaDB [test2]> explain select * from t1, t2 where pk in (1,2,3,4,5) and key1 <= 4 and t2.a=t1.key2;
+------+-------------+-------+-------+-------------------+---------+---------+---------------+------+-------------+
| id   | select_type | table | type  | possible_keys     | key     | key_len | ref           | rows | Extra       |
+------+-------------+-------+-------+-------------------+---------+---------+---------------+------+-------------+
|    1 | SIMPLE      | t1    | range | PRIMARY,key1,key2 | PRIMARY | 4       | NULL          |    5 | Using where |
|    1 | SIMPLE      | t2    | ref   | a                 | a       | 5       | test2.t1.key2 |   51 |             |
+------+-------------+-------+-------+-------------------+---------+---------+---------------+------+-------------+
2 rows in set (0.002 sec)
|
This plan reads 5 + 5*51 = 260 rows.
Try with the fix disabled:
MariaDB [test2]> set optimizer_adjust_secondary_key_costs='';
Query OK, 0 rows affected (0.001 sec)

MariaDB [test2]> explain select * from t1, t2 where pk in (1,2,3,4,5) and key1 <= 4 and t2.a=t1.key2;
+------+-------------+-------+------------+-------------------+-----------+---------+------------+---------+--------------------------------------------------------+
| id   | select_type | table | type       | possible_keys     | key       | key_len | ref        | rows    | Extra                                                  |
+------+-------------+-------+------------+-------------------+-----------+---------+------------+---------+--------------------------------------------------------+
|    1 | SIMPLE      | t2    | ALL        | a                 | NULL      | NULL    | NULL       |  100113 | Using where                                            |
|    1 | SIMPLE      | t1    | ref|filter | PRIMARY,key1,key2 | key2|key1 | 5|9     | test2.t2.a | 1 (40%) | Using index condition; Using where; Using rowid filter |
+------+-------------+-------+------------+-------------------+-----------+---------+------------+---------+--------------------------------------------------------+
2 rows in set (0.003 sec)
|
100K rows for table t2 + 100K lookups into t1 gives ~ 200K rows read.
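The rows-examined figures above can be reproduced with a back-of-the-envelope nested-loop-join calculation (a sketch; the 5, 51, and 100113 row counts are taken from the EXPLAIN outputs above, and the helper name is made up for illustration):

```python
# Rows examined by a nested-loop join of two tables:
# scan the outer table, then do one inner lookup per outer row.
def nested_loop_rows(outer_rows, inner_rows_per_lookup):
    return outer_rows + outer_rows * inner_rows_per_lookup

# After the fix: t1 via PRIMARY yields 5 rows, each probing t2.a (~51 rows/key).
good_plan = nested_loop_rows(5, 51)       # 5 + 5*51 = 260 rows

# Before the fix: full scan of t2 (100113 rows), one t1 lookup per row.
bad_plan = nested_loop_rows(100113, 1)    # ~200K rows

print(good_plan, bad_plan)
```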
commit ccf1510ae6aa369fd2e6e588f2a946f8085cd482 (HEAD -> bb-10.5-mdev34993, origin/bb-10.5-mdev34993)
Author: Sergei Petrunia <sergey@mariadb.com>
Date:   Tue Sep 24 10:23:34 2024 +0300

MDEV-34993, part2: backport optimizer_adjust_secondary_key_costs
And make the fix for MDEV-34993 switchable.

commit 19e654dec144bc00cc5b4d60892e209688437036
Author: Sergei Petrunia <sergey@mariadb.com>
Date:   Tue Sep 24 09:44:39 2024 +0300

MDEV-34993: Incorrect cardinality estimation causes poor query plan
When calculate_cond_selectivity_for_table() takes into account multi-
column selectivities from range access, it tries to account for the fact
that the selectivity of some columns may have already been counted.
For example, for range access on IDX1 using {kp1, kp2}, the selectivity
of restrictions on "kp2" might have already been taken into account
to some extent.
So, the code tries to "discount" that using rec_per_key[] estimates.
This seems to be wrong and unreliable: the "discounting" may produce a
selectivity_multiplier value implying that the overall selectivity
of range access on IDX1 was greater than 1.
Do a conservative fix: if we arrive at the conclusion that the selectivity
of range access on the condition in IDX1 is greater than 1.0, clip it down to 1.