Details
-
Task
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
None
Description
This task is about adding ANALYZE [FORMAT=JSON] support for PK filtering
(MDEV-16188).
User interface
Examples of PK filtering for EXPLAIN and EXPLAIN FORMAT=JSON:
id 1
|
select_type SIMPLE
|
table lineitem
|
type range|filter
|
possible_keys PRIMARY,i_l_shipdate,i_l_receiptdate,i_l_orderkey,i_l_orderkey_quantity
|
key i_l_receiptdate|i_l_shipdate
|
key_len 4|4
|
ref NULL
|
rows 6 (0%)
|
Extra Using index condition; Using where; Using filter
|
The filter shows rows="6 (0%)"
in ANALYZE, there will be "r_rows" column.
It should match the rows column, that is, display "nnn (mmm%)" where mmm% - is
the observed selectivity of the filter. If the filter was discared, we should
indicate that in some way (exactly what to print is to be determined).
With FORMAT=JSON we get more date:
+ "rowid_filter": {
|
+ "range": {
|
+ "key": "i_l_shipdate",
|
+ "used_key_parts": ["l_shipDATE"]
|
+ },
|
+ "rows": 8,
|
+ "selectivity_pct": 0.1332
|
+ },
|
ANALYZE FORMAT=JSON output should include:
- r_rows for the quick select that was used to populate the filter.
- if the filter was discarded (because it was too big), this should be indicated.
- The total time to populate (i.e. build) the filter should be included as well.
- selectivity_pct should get "r_selectivity_pct" which will show the observed filter selectivity.
- (I am not sure if we should include the time it took to make checks in the filter).
Implementation
class Explain_rowid_filter should get a pointer to a "Rowid filter tracker
object" (which will exist for all queries, both ANALYZE or not). (just like
e.g. Explain_update has a pointer to Filesort_tracker object).
Filesort_tracker is a good example of a complex tracker object.
There should be a Rowid_filter_tracker object with information about the query
execution. It should have:
- a member to track the time to populate the filter.
- a member to track the time/#rows for table access while populating the
filter.
- A member which indicates whether filter construction finished or was aborted.
- (Optionally) A member which indicates how big the buffer size was.
- Two counters: one counts how many times the filter was invoked, the other
counts how many times the filter has filter out the record. Dividing one by
another produces r_selectivity_pct.
Tracking the time to build the filter
See Explain_select::time_tracker and its use in JOIN::exec.
Tracking table access and r_rows for building the filter.
At the moment, storage engine objects (like ha_innobase) have "tracker" objects
inside them (see handler::tracker).
The question is: does MDEV-16188 use the same handler object to build the
filter (phase #1) and to perform the "primary" table access by the main query
(phase #2) ?
If yes: we will need to switch handler::tracker objects after building the
filter has been finished (as we want time/#rows for phase #1 and phase #2 to be
accounted separately)
If no: then, we'll need to set the secondary handler object's h->tracker to
point to our tracker.
Printing
void Explain_rowid_filter::print_explain_json(Explain_query *query, |
Json_writer *writer,
|
bool is_analyze) |
has "is_analyze" argument. This controls whether analyze data should be
printed.
Attachments
Issue Links
- is part of
-
MDEV-16188 Use in-memory PK filters built from range index scans
- Closed