[MDEV-21383] Possible range plan is not used under certain conditions Created: 2019-12-22 Updated: 2020-08-25 Resolved: 2020-01-24 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Optimizer |
| Affects Version/s: | 10.3.10, 5.5.65, 10.3.20 |
| Fix Version/s: | 10.3.22, 10.4.12, 10.5.1 |
| Type: | Bug | Priority: | Major |
| Reporter: | Sergei Petrunia | Assignee: | Sergei Petrunia |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||||||
| Description |
|
This is from the same customer issue as It is possible to construct a join query which will use full table scan for one of the tables, even if there is a possible range access for that table. I'm not sure if table definitions are public so I will not post the details in public until we've figured a general case to demonstrate the issue. The issue is reproducible on 10.3.{10,20} and 5.5.65. Not reproducible on 5.2.14 Symptoms: The problem shows up here in make_join_select
The if-branch is taken, inside it, SQL_SELECT::test_quick_select is called which doesn't construct the range access anymore (can't use multiple equalities?) and this is how we end up with a full table scan. |
| Comments |
| Comment by Sergei Petrunia [ 2020-01-18 ] | |||||||||||||||||||||||||||||||||||||
|
MTR testcase: https://gist.github.com/spetrunia/d217e9d7df562e575afec8f8221be023 | |||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2020-01-18 ] | |||||||||||||||||||||||||||||||||||||
igor Did you check the commit comment also? Anyway, let me go through it again. I'll be using the simplified testcase from https://gist.github.com/spetrunia/d217e9d7df562e575afec8f8221be023 Consider a table
and a query
Let's follow what the optimizer does for it. Step 1. t2.stationId = t1.stationId is used to construct a multiple equality.
due to the multiple equality it is able to infer
and construct ranges
This has about 203 rows, so a quick select is built for it. Step 3. Join optimizer is ran, it picks the join order: t1,t2,t3. For t2, it chooses to use the quick select. Step 4. Execution enters make_join_select() function and reaches this point:
where tab points to table t2's JOIN_TAB. Step 5: Execution proceeds to make this call a few lines below:
Note that the parameter to this range optimizer call is the condition that is attached to table t2:
The condition "t1.stationid=1 or t1.stationid=2 or t1.stationid=3" is not present here as it is attached to table t1 (and so is not attached to table t2). Step 5.1 Because of this , the range optimizer is not able to construct a range access over the (stationId, startTime), and makes t2 use a full table scan. Where did things go wrong?
So, the patch is "conservative". Its action is:
| |||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2020-01-18 ] | |||||||||||||||||||||||||||||||||||||
Let me go through that again also. Let's recall Step #4: Step 4. Execution enters make_join_select() function and reaches this point:
Look at line #4:
Here, in 5.2, we have
which means tab->keys.is_subset(tab->const_keys)==true, which causes the OTOH in 10.3 we have:
which means tab->keys.is_subset(tab->const_keys)==false, which means the If one looks at the query text, the only predicate with t2.key2 is | |||||||||||||||||||||||||||||||||||||
| Comment by Igor Babaev [ 2020-01-19 ] | |||||||||||||||||||||||||||||||||||||
|
Sergei, | |||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2020-01-19 ] | |||||||||||||||||||||||||||||||||||||
|
igor, attempt to implement this suggestion exposes another issue. Please find the description below. So, the code from Step #6 now gets invoked with this condition:
The structure of the condition is such that range optimizer performs operations in the following order:
and the result is that
that is, the SEL_ARG for "t2.startTime <= 100" is discarded. The reason for that is:
then, we call tree_and() for those, and the execution reaches and_all_keys() which discards $tree3->keys[0] because it has a bigger key part number. I've made an obvious change to fix this (discard the SEL_ARG with type=MAYBE_KEY even if its part # is smaller). The patch is here: | |||||||||||||||||||||||||||||||||||||
| Comment by Igor Babaev [ 2020-01-20 ] | |||||||||||||||||||||||||||||||||||||
|
Sergei, | |||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2020-01-24 ] | |||||||||||||||||||||||||||||||||||||
|
igor wrote:
I don't see any need to do this. Proof t1.col=t2.col=... First, let's assume that the multiple equality doesn't include a constant. Second, let's assume that multiple equality does include a constant. In this t1.col=const and they will get attached to their respective tables (and so will be usable by Is there something missing in the above logic? | |||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2020-01-24 ] | |||||||||||||||||||||||||||||||||||||
|
Got ok to push after discussion |