[MDEV-446] Reuse quick_condition_rows in fanout computations - Jira

Details

Type: Task
Status: Closed (View Workflow)
Priority: Major
Resolution: Won't Fix
Fix Version/s: 10.0.2
Component/s: None
Labels:
None

Description

Currently:

range optimizer produces quick_condition_rows
they are used by best_access_path(), but only if it considers
full table scan acccess (see matching_candidates_in_table() call)

We need to make it so that quick_condition_rows is used for any access
method.

=== Details ===
Currently, quick_condition_rows is only used for full table scans. This makes the code simple.
When we start using quick_condition_rows with arbitrary acess method, we will be faced with this problem:

1. quick_condition_rows estimate was made from a certain part of WHERE clause (the part of WHERE that was used to construct a range)
2. access method that we're using also was made from a certain part of WHERE clause.

question: are these two correlated? When does the estimate of the fanout of an access method take into account quick_condition_rows?

We will adopt this answer:

Don't take quick_condition_rows into account if we're using a quick select for table access,
or if we're using ref(const) (in this case we would construct a quick select and take its estimates into account).
Do adopt it otherwise.

THere are more complex cases like
t.keypart1=const AND t.keypart2 BETWEEN 'foo' AND 'bar' AND t.keypart2=othertbl.col

Here:
quick_condition_rows is made from "t.keypart1=const AND t.keypart2 BETWEEN ..."
ref access will be on "t.keypart1=const AND t.keypart2=othertbl.col"

when part of WHERE will be appled,

t.keypart1=const – will not filter anything out (because it's guaranteed to be true
by ref access)
t.keypart2 BETWEEN ... – will filter something out.

Available estimates:
E1. rec_per_key(t.keypart1=...)
E2. rec_per_key("t.keypart1=... AND t.keypart2=...")
E3. records_in_range("t.keypart1=const AND t.keypart2 BETWEEN ...")

Note, that the estimates will not necessary be in sync. For instance, we saw real-world cases where E3 > E1, even if it is clear that E3 < E1 for any dataset.

How to compute the fanout of the WHERE predicates not covered by an access method:

if an access method covers all WHERE predicates, then the WHERE clause doesn't have any selectivity on its own.
if WHERE's selectivity hasn't been "absorbed" by the employed access method, then it has selecivity on its own.

We assume that:

ref(non-const) access's selecitivity is orthogonal to quick_condition_rows selecitivity.
full table scan has no selectivity => WHERE condition will apply quick_condition_rows selecitivity.
That's it, for now. More complex analysis is possible but will be done outside of this task.
For any type of access other than the above two, assume quick_condition_rows doesn't have any additional selectivity - selectivity(WHERE)=1.

== Implementation ==

All code should be in best_access_path().
QUESTIONABLE:: The result should be recorded in POSITION::records_read.
(Yes, there will be cases where POSITION::records_read < 1)

APPARENT: POSITION::prefix_record_count is not what is needed.

The result should be visible in EXPLAIN EXTENDED, in "filtered" column.

== Implementation step #1 ==

Take the problematic query from mdev-402
Remove the scalar subquery predicate.
Convert both IN-subqueries into a JOIN ( I don't care if it's not equivalent)
Use STRAIGHT_JOIN hints, if necessary to produce the join order of
nation,supplier,partsupp,part.
Make it so that for table part, in EXPLAIN:
= access is eq_ref
= rows==1 (it's eq_ref)
= "filtered shows" 0.0118 * 100 = 1.18 (percent).
This may require introducing extra members in POSITION (investigate).

Attachments

Issue Links

is part of

MDEV-402 Take into account the cost of subqueries during optimization

Closed

Sub-Tasks

Change JOIN_TAB::records_read from ha_rows to double

Closed

Timour Katchaounov (Inactive)

Activity

People

Assignee:: Timour Katchaounov (Inactive)

Reporter:: Sergei Petrunia

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 2012-08-10 12:03

Updated:: 2013-03-26 13:01

Resolved:: 2013-03-26 13:01

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server