[MDEV-18144] ANALYZE for statement support for PK filters - Jira

XML

Word

Printable

Details

Type: Task
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Fix Version/s: 10.4.3
Component/s: Optimizer
Labels:
None

Description

This task is about adding ANALYZE [FORMAT=JSON] support for PK filtering
(~~MDEV-16188~~).

User interface

Examples of PK filtering for EXPLAIN and EXPLAIN FORMAT=JSON:

id     1

select_type    SIMPLE

table  lineitem

type   range|filter

possible_keys  PRIMARY,i_l_shipdate,i_l_receiptdate,i_l_orderkey,i_l_orderkey_quantity

key    i_l_receiptdate|i_l_shipdate

key_len        4|4

ref    NULL

rows   6 (0%)

Extra  Using index condition; Using where; Using filter

The filter shows rows="6 (0%)"

in ANALYZE, there will be "r_rows" column.
It should match the rows column, that is, display "nnn (mmm%)" where mmm% - is
the observed selectivity of the filter. If the filter was discared, we should
indicate that in some way (exactly what to print is to be determined).

With FORMAT=JSON we get more date:

+      "rowid_filter": {

+        "range": {

+          "key": "i_l_shipdate",

+          "used_key_parts": ["l_shipDATE"]

+        },

+        "rows": 8,

+        "selectivity_pct": 0.1332

+      },

ANALYZE FORMAT=JSON output should include:

r_rows for the quick select that was used to populate the filter.
if the filter was discarded (because it was too big), this should be indicated.
The total time to populate (i.e. build) the filter should be included as well.
selectivity_pct should get "r_selectivity_pct" which will show the observed filter selectivity.
(I am not sure if we should include the time it took to make checks in the filter).

Implementation

class Explain_rowid_filter should get a pointer to a "Rowid filter tracker
object" (which will exist for all queries, both ANALYZE or not). (just like
e.g. Explain_update has a pointer to Filesort_tracker object).

Filesort_tracker is a good example of a complex tracker object.
There should be a Rowid_filter_tracker object with information about the query
execution. It should have:

a member to track the time to populate the filter.
a member to track the time/#rows for table access while populating the
filter.

A member which indicates whether filter construction finished or was aborted.
(Optionally) A member which indicates how big the buffer size was.

Two counters: one counts how many times the filter was invoked, the other
counts how many times the filter has filter out the record. Dividing one by
another produces r_selectivity_pct.

Tracking the time to build the filter

See Explain_select::time_tracker and its use in JOIN::exec.

Tracking table access and r_rows for building the filter.

At the moment, storage engine objects (like ha_innobase) have "tracker" objects
inside them (see handler::tracker).

The question is: does ~~MDEV-16188~~ use the same handler object to build the
filter (phase #1) and to perform the "primary" table access by the main query
(phase #2) ?

If yes: we will need to switch handler::tracker objects after building the
filter has been finished (as we want time/#rows for phase #1 and phase #2 to be
accounted separately)

If no: then, we'll need to set the secondary handler object's h->tracker to
point to our tracker.

Printing

void Explain_rowid_filter::print_explain_json(Explain_query *query,

                                              Json_writer *writer,

                                              bool is_analyze)

has "is_analyze" argument. This controls whether analyze data should be
printed.

Attachments

Issue Links

is part of

MDEV-16188 Use in-memory PK filters built from range index scans

Closed

Activity

People

Assignee:: Galina Shalygina

Reporter:: Sergei Petrunia

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 2019-01-05 18:07

Updated:: 2019-02-24 20:44

Resolved:: 2019-02-24 20:44

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.