[MDEV-38231] Cost model: secondary key index reads are too expensive? - Jira

XML

Word

Printable

The problem looked like this result change when the fix for ~~MDEV-38164~~ was enabled:

--- /optane/dev-git2/12.2-mdev-38164/mysql-test/main/order_by_optimizer_innodb.result   2025-11-24 14:07:50.125255016 +0200

+++ /optane/dev-git2/12.2-mdev-38164/mysql-test/main/order_by_optimizer_innodb.reject   2025-11-26 14:57:17.278181912 +0200

@@ -46,7 +46,7 @@

 # This also must use range, not ref. key_len must be 13

 EXPLAIN SELECT * FROM t2                       WHERE pk1=9 AND fd5 < 500 ORDER BY fd5 DESC LIMIT 10;

 id     select_type     table   type    possible_keys   key     key_len ref     rows    Extra

-1      SIMPLE  t2      range   PRIMARY,ux_pk1_fd5      ux_pk1_fd5      13      NULL    138     Using index condition

+1      SIMPLE  t2      range   PRIMARY,ux_pk1_fd5      ux_pk1_fd5      13      NULL    138     Using where

 drop table t2;

 # MDEV-6814: Server crashes in calculate_key_len on query with ORDER BY

The real issue was (quoting this comment ) as follows:

After the patch, the optimizer produces "cost": 0.15856389 for PK:

 index: "PRIMARY", rows=722,   "ranges": ["(9) <= (pk1) <= (9)"],

and a higher "cost": 0.21899416 for scanning fewer rows (actually a proper subset of the above) through a secondary key:

"index": "ux_pk1_fd5",  "rows": 138,  "ranges": ["(9,NULL) < (pk1,fd5) < (9,500)"]

trace:

 "range_scan_alternatives": [

      "index": "PRIMARY",

      "ranges": ["(9) <= (pk1) <= (9)"],

      "rowid_ordered": true,

      "using_mrr": false,

      "index_only": false,

      "rows": 722,

      "cost": 0.15856389,

      "chosen": true

},

      "index": "ux_pk1_fd5",

      "ranges": ["(9,NULL) < (pk1,fd5) < (9,500)"],

      "rowid_ordered": false,

      "using_mrr": false,

      "index_only": false,

      "rows": 138,

      "cost": 0.21899416,

      "chosen": false,

      "cause": "cost"

The problem was investigated in comments starting from this comment.

There are two issues:

ISSUE-1: The optimizer uses a wrong formula to count the number of datafile blocks that will be hit by an index scan.
ISSUE-2: The optimizer doesn't take into account that the ranges being scanned include a restriction on the primary key (which limits the range of datafile we will hit even further).

relates to

MDEV-38164 Poor cost calculations for index access cause bad query plans for big VARCHARs in 11.x

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.