[MDEV-18478] ANALYZE for statement should show selectivity of pushed index condition - Jira

XML

Word

Printable

Details

Type: New Feature
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Fix Version/s: 11.5.1
Component/s: Optimizer
Labels:

Description

(filing his is a followup for the optimizer call on January, 29th)

The EXPLAIN output has:

rows - expected # of total rows enumerated
filtered - expected condition selectivity

Additional members from ANALYZE:

r_rows - shows how many rows were returned to the SQL layer (that is, after pushed index condition was applied (the value will be lower))
r_filtered - shows the selectivity of the condition that is applied at the SQL layer (that is, the selectivity of pushed index condition is not counted, the value will be higher)

Considerations for fixing this:

ANALYZE FORMAT=JSON could show #rows enumerated, and both selectivities
For tabular ANALYZE, we are limited to two numbers.
rows vs r_rows and filtered vs r_filtered should be apples-to-apples comparisons.

Implementation

Current handler statistics/counters

There are multiple ways different statistics about actions inside handler make it to ANALYZE output:

Option 1: ha_handler_stats.h:class ha_handler_stats

This was introduced recently to produce r_engine_stats.
It seems, there is no issue if this structure had another member which handler_index_cond_check would increment...

Option 2: handler->tracker approach

This approach is used by a lot of ANALYZE FORMAT=JSON counters.

Read access time is counted using this member of class handler:

  /* ANALYZE time tracker, if present */

  Exec_time_tracker *tracker;

It points to a member of class Explain_table_access:

  Table_access_tracker tracker;

This works, but do we want to add another member like 'handler::icp_tracker' ?

Option 3: Rowid Filter's approach

Storage engine code calls handler_rowid_filter_check() which calls Range_rowid_filter::check() through

h->pushed_rowid_filter->check(...)

call. Range_rowid_filter::check increments Rowid_filter::tracker.

The key part is that class handler has

  Rowid_filter *pushed_rowid_filter;

which is the context and has the counter.

For ICP, we have instead:

  Item *pushed_idx_cond;

  uint pushed_idx_cond_keyno;  /* The index which the above condition is for */

messing with cleaning/restoring pushed_idx_cond might be hard.

Which approach to use

*Option1* seems the best choice...

Output

TODO.
Considerations:

One should be able to compare rows and r_rows and it should be apples-to-apples comparison.
rows is number-of-rows-to-enumerate: before rowid filtering, before ICP.
so, r_rows should be the same.

There is also filtered which is... not as clearly defined? Should it be:

%of rows left after any kind of filtering?
%of rows left after checking the attached_condition?
(TODO: check what happens when we have an extra pushed-index-condition which selectivity is known to the optimizer).
(TODO: check what happens when using a rowid filter. Is its selectivity included in filtered% or not?)

In any case, we need something like r_icp_filtered.
The question is, should r_filtered include r_icp_filtered / 100% as a multiplier or not.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

mdev18478-show-analyze.diff
6 kB
2024-04-24 08:22

Issue Links

relates to

MDEV-33054 ANALYZE FORMAT=JSON: rowid_filter.r_lookups should be average, not total

Stalled

Activity

People

Assignee:: Sergei Petrunia

Reporter:: Sergei Petrunia

Votes:: 0 Vote for this issue

Watchers:: 7 Start watching this issue

Dates

Created:: 2019-02-05 10:59

Updated:: 2024-04-30 08:04

Resolved:: 2024-04-24 08:23

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.