Details
Description
The reason behind this change is that MariaDB 10.x does not take into
account that for engines like InnoDB, that scanning a primary key can be up
to 7x faster than scanning a secondary key + read the row data trough the
primary key.
optimizer-adjust_secondary_key_costs is added to provide 2 small
adjustments to the 10.x optimizer cost model. This can be used in the
case where the optimizer wrongly uses a secondary key instead of a
clustered primary key.
The different values for the new variable:
optimizer_adjust_secondary_key_costs=0 (default)
- No changes to current model
optimizer_adjust_secondary_key_costs=1
- Ensure that the cost of of secondary indexes has a cost of at
least 5x times the cost of a primary key.
This disables part of the worst_seek optimization described below.
optimizer_adjust_secondary_key_costs=2
- Disable "worst_seek optimization and adjust filter cost slightly
(add cost of 1 if filter is used).
Attachments
Issue Links
- relates to
-
MDEV-33306 Optimizer choosing incorrect index in 10.6, 10.5 but not in 10.4
-
- Closed
-
-
MDEV-33471 Optimizer: add sys var optimizer_adjust_secondary_key_costs
-
- Closed
-
-
MDEV-29226 Optimizer does not use the best index for range access
-
- Closed
-
Activity
Field | Original Value | New Value |
---|---|---|
Fix Version/s | 10.6 [ 24028 ] | |
Fix Version/s | 10.6.17 [ 29518 ] |
Assignee | Michael Widenius [ monty ] |
Status | Open [ 1 ] | In Progress [ 3 ] |
Description |
The reason behind this change is that MariaDB 10.x does not take into
account that for engines like InnoDB, that scanning a primary key can be up to 7x faster than scanning a secondary key + read the row data trough the primary key. optimizer-adjust_secondary_key_costs is added to provide 2 small adjustments to the 10.x optimizer cost model. This can be used in the case where the optimizer wrongly uses a secondary key instead of a clustered primary key. The different values for the new variable: optimizer_adjust_secondary_key_costs=0 - No changes to current model optimizer_adjust_secondary_key_costs=1 - Ensure that the cost of of secondary indexes has a cost of at least 5x times the cost of a primary key. This disables part of the worst_seek optimization described below. optimizer_adjust_secondary_key_costs=2 - Disable "worst_seek optimization and adjust filter cost slightly (add cost of 1 if filter is used). |
The reason behind this change is that MariaDB 10.x does not take into
account that for engines like InnoDB, that scanning a primary key can be up to 7x faster than scanning a secondary key + read the row data trough the primary key. optimizer-adjust_secondary_key_costs is added to provide 2 small adjustments to the 10.x optimizer cost model. This can be used in the case where the optimizer wrongly uses a secondary key instead of a clustered primary key. The different values for the new variable: optimizer_adjust_secondary_key_costs=0 (default) - No changes to current model optimizer_adjust_secondary_key_costs=1 - Ensure that the cost of of secondary indexes has a cost of at least 5x times the cost of a primary key. This disables part of the worst_seek optimization described below. optimizer_adjust_secondary_key_costs=2 - Disable "worst_seek optimization and adjust filter cost slightly (add cost of 1 if filter is used). |
Labels | not |
Fix Version/s | 10.6.17 [ 29518 ] | |
Fix Version/s | 10.6 [ 24028 ] | |
Resolution | Fixed [ 1 ] | |
Status | In Progress [ 3 ] | Closed [ 6 ] |
Fix Version/s | 10.11.7 [ 29519 ] | |
Fix Version/s | 11.0.5 [ 29520 ] | |
Fix Version/s | 11.1.4 [ 29024 ] | |
Fix Version/s | 11.2.3 [ 29521 ] |
Link |
This issue relates to |
Link |
This issue relates to |
Link |
This issue relates to |
Commit message:
MDEV-33118 optimizer_adjust_secondary_key_costs variable
optimizer-adjust_secondary_key_costs is added to provide 2 small
adjustments to the 10.x optimizer cost model. This can be used in the
case where the optimizer wrongly uses a secondary key instead of a
clustered primary key.
The reason behind this change is that MariaDB 10.x does not take into
account that for engines like InnoDB, that scanning a primary key can be up
to 7x faster than scanning a secondary key + read the row data trough the
primary key.
The different values for optimizer_adjust_secondary_key_costs are:
optimizer_adjust_secondary_key_costs=0
- No changes to current model
optimizer_adjust_secondary_key_costs=1
- Ensure that the cost of of secondary indexes has a cost of at
least 5x times the cost of a primary key.
This disables part of the worst_seek optimization described below.
optimizer_adjust_secondary_key_costs=2
- Disable "worst_seek optimization and adjust filter cost slightly
(add cost of 1 if filter is used).
The idea behind 'worst_seek optimization' is that we limit the
cost for all non clustered ref access to the least of:
- best-rows-by-range (or all rows in no range found) / 10
- scan-time-table (roughly number of file blocks to scan table) * 3
In addition we also do not try to use rowid_filter if number of rows
estimated for 'ref' access is less than the worst_seek limitation.
The idea is that worst_seek is trying to take into account that if
we do a lot of accesses through a key, this is likely to be cached.
However it only does this for secondary keys, and not for clustered
keys or index only reads.
The effect of the worst_seek are:
- In some cases 'ref' will have a much lower cost than range or using
a clustered key.
- Some possible rowid filters for secondary keys will be ignored.
When implementing optimizer_adjust_secondary_key_costs=2, I noticed
that there is a slightly different costs for how ref+filter and
range+filter are calculated. This caused a lot of range and
range+filter to change to ref+filter, which is not good as
range+filter provides the optimizer a better estimate of how many
accepted rows there will be in the result set.
Adding a extra small cost (1 seek) when using filter mitigated the
above problems in almost all cases.
This patch should not be applied to MariaDB 11.0 as worst_seeks is
removed in 11.0 and the cost calculation for clustered keys, secondary
keys, index scan and filter is more exact.
Test case changes for --optimizer-adjust_secondary_key_costs=1
(Fix secondary key costs to be 5 x of primary key):
- stat_tables_innodb:
- Complex change (probably ok as number of rows are really small)
- ref over 1 row changed to range over 10 rows with join buffer
- ref over 5 rows changed to eq_ref
- secondary ref over 1 row changed to ref of primary key over 4 rows
- Change of key to use longer key with index pushdown (a little
bit worse but not significant).
- Change to use secondary (1 row) -> primary (4 rows)
- rowid_filter_innodb:
- index_merge (2 rows) & ref (1) -> all (23 rows) -> primary eq_ref.
Test case changes for --optimizer-adjust_secondary_key_costs=2
(remove of worst_seeks & adjust filter cost):
- stat_tables_innodb:
- Join order change (probably ok as number of rows are really small)
- ref (5 rows) & ref(1 row) changed to range (10 rows & join buffer)
& eq_ref.
- selectivity_innodb:
- ref -> ref|filter (ok)
- rowid_filter_innodb:
- ref -> ref|filter (ok)
- range|filter (64 rows) changed to ref|filter (128 rows).
ok as ref|filter outputs wrong number of rows in explain.
- range, range_mrr_icp:
-ref (500 rows -> ALL (1000 rows) (ok)
- select_pkeycache, select, select_jcl6:
- ref|filter (2 rows) -> ref (2 rows) (ok)
- selectivity:
- ref -> ref_filter (ok)
- range:
- Change of 'filtered' but no stat or plan change (ok)
- selectivity:
- ref -> ref+filter (ok)
- Change of filtered but no plan change (ok)
- join_nested_jcl6:
- range -> ref|filter (ok as only 2 rows)
- subselect3, subselect3_jcl6:
- ref_or_null (4 rows) -> ALL (10 rows) (ok)
- Index_subquery (4 rows) -> ALL (10 rows) (ok)
- partition_mrr_myisam, partition_mrr_aria and partition_mrr_innodb:
- Uses ALL instead of REF for a key value that is the same for > 50%
of rows. (good)
order_by_innodb:
- range (200 rows) -> ref (20 rows)+filesort (ok)
- subselect_sj2_mat:
- One test changed. One ALL removed and replaced with eq_ref. Likely
to be better.
- join_cache:
- Changed ref over 60% of the rows to use hash join (ok)
- opt_tvc:
- Changed to use eq_ref instead of ref with plan change (probably ok)
- opt_trace:
- No worst/max seeks clipping (good).
- Almost double range_scan_time and index_scan_time (ok).
- rowid_filter:
- ref -> ref|filtered (ok)
- range|filter (77 rows) changed to ref|filter (151 rows). Proably
ok as ref|filter outputs wrong number of rows in explain.
Reviewer: Sergei Petrunia <sergey@mariadb.com>