[MDEV-35300] EXPLAIN shows inconsistent #rows for LIMIT queries - Jira

XML

Word

Printable

Details

Type: Bug
Status: Stalled (View Workflow)
Priority: Critical
Resolution: Unresolved
Affects Version/s: 10.6, 11.6(EOL)
Fix Version/s: 10.6
Component/s: Optimizer
Labels:
- order-by-optimization

Description

In EXPLAIN output, rows is "#rows to be examined". What if the optimizer does not plan to run the scan until all rows are read?

create table t1 (

  a int,

  b int,

  key(a)

);

insert into t1 select seq, seq from seq_1_to_10000;

analyze table t1;

explain select * from t1 order by a limit 10;

+------+-------------+-------+-------+---------------+------+---------+------+------+-------+

| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra |

+------+-------------+-------+-------+---------------+------+---------+------+------+-------+

|    1 | SIMPLE      | t1    | index | NULL          | a    | 5       | NULL | 10   |       |

+------+-------------+-------+-------+---------------+------+---------+------+------+-------+

rows=10 which is how many we expect to read before producing #LIMIT rows. Ok.

Now, with a quick select:

explain select * from t1

where a between 1 and 1000

order by a limit 10;

+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+

| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                 |

+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+

|    1 | SIMPLE      | t1    | range | a             | a    | 5       | NULL | 1000 | Using index condition |

+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+

rows=1000. It seems, LIMIT is not taken into account here.

This query might be somewhat exceptional as we first decide to use range on index a and then test_if_skip_sort_order() figures that this also produces rows in the desired order.

Another example with type=index:

alter table t1 add key(b);

explain select * from t1

where a between 1 and 1000 order by b limit 20;

+------+-------------+-------+-------+---------------+------+---------+------+------+-------------+

| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |

+------+-------------+-------+-------+---------------+------+---------+------+------+-------------+

|    1 | SIMPLE      | t1    | index | a             | b    | 5       | NULL | 203  | Using where |

+------+-------------+-------+-------+---------------+------+---------+------+------+-------------+

Here, the number 203 is the # of rows to scan computed by test_if_skip_sort_order().

(We need to get LIMIT 20 rows. range access provides selectivity of WHERE clause: (quick->rows=1000)/(table_rows=10000 )=0.1. To get 20 output rows, we scan 20/0.1=200 rows).

explain

select *from t1

where a between 1 and 1000 and b between 1 and 5000

order by b limit 20;

+------+-------------+-------+-------+---------------+------+---------+------+------+-------------+

| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |

+------+-------------+-------+-------+---------------+------+---------+------+------+-------------+

|    1 | SIMPLE      | t1    | range | a,b           | b    | 5       | NULL | 5000 | Using where |

+------+-------------+-------+-------+---------------+------+---------+------+------+-------------+

Note that we see rows=5000. This is the estimate of #rows in range scan over index b, without taking LIMIT into account.
Internally, the optimizer has a different idea:

                "reconsidering_access_paths_for_index_ordering": {

                  "clause": "ORDER BY",

                  "fanout": 1,

                  "read_time": 1232.379546,

                  "table": "t1",

                  "rows_estimation": 1000,

                  "possible_keys": [

                      "index": "a",

                      "can_resolve_order": false,

                      "cause": "not usable index for the query"

},

                      "index": "b",

                      "can_resolve_order": true,

                      "updated_limit": 412,

                      "range_scan_time": 243.5337535,

                      "index_scan_time": 243.5337535,

                      "records": 5000,

                      "chosen": true

},

note the updated_limit=412 .
This comes from 20 * (table_records / refkey_rows_estimate)

refkey_rows_estimate= 492 which comes from table->cond_selectivity=0.048 which is a product of quick select's selectivities.

Attachments

Issue Links

relates to

MDEV-25480 Optimizer uses wrong index

Closed

Activity

People

Assignee:: Sergei Petrunia

Reporter:: Sergei Petrunia

Votes:: 1 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 2024-10-31 10:58

Updated:: 2025-03-11 08:45

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.