[MDEV-15777] Use inferred IS NOT NULL predicates in the range optimizer Created: 2018-04-04 Updated: 2020-08-25 Resolved: 2019-09-03 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Optimizer |
| Fix Version/s: | 10.5.0 |
| Type: | Task | Priority: | Critical |
| Reporter: | Sergei Petrunia | Assignee: | Igor Babaev |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Attachments: |
|
||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||
| Description |
|
Consider an example (inspired by queries generated by Hibernate):
The subquery is converted into an inner join (because t1_subsets.id is a primary key).
t1 has only a few rows which have a non-NULL value for t1.subset_id. A piece of ANALYZE FORMAT=JSON output to confirm this:
However, range optimizer is not able to make use of "Early NULLs filtering". We could have used t1_subset_id index to construct range access, but we dont |
| Comments |
| Comment by Sergei Petrunia [ 2018-04-04 ] | |||||||||||||||||||||||||||
|
Basic ideas how to solve this: In range optimizer:
Other considerations: | |||||||||||||||||||||||||||
| Comment by Julien Fritsch [ 2018-04-06 ] | |||||||||||||||||||||||||||
|
Is it a bug or new feature request ? | |||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2018-04-10 ] | |||||||||||||||||||||||||||
|
julien.fritsch It is a feature request. | |||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2018-04-18 ] | |||||||||||||||||||||||||||
|
The EARLY FILTERING in range access is a bit different from ref/eq_ref access For ref access *For range access * So for ref access we check the right hand side of the equality while for the range access | |||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2018-04-20 ] | |||||||||||||||||||||||||||
|
First attempt to the patch here | |||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2018-05-15 ] | |||||||||||||||||||||||||||
|
The latest patch is in the 10.3-mdev15777 branch | |||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2018-05-15 ] | |||||||||||||||||||||||||||
|
Here is the new patch | |||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-06 ] | |||||||||||||||||||||||||||
|
Review feedback sent: https://lists.launchpad.net/maria-developers/msg11825.html | |||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-05-08 ] | |||||||||||||||||||||||||||
|
A problem that psergey caught while doing the review is adding duplicate NOT NULL conditions for a particular column The query is:
the table structure is:
so we have 2 keys that have columns a as a keypart, so both try to add a NOT NULL cond and that is why we end up with this duplication.
| |||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-09 ] | |||||||||||||||||||||||||||
|
The current patch adds NOT NULL predicates unconditionally. This might have un-intended consequences. Check out MDEV-19424: InnoDB is known for returning 50% estimates for ranges that span >50% of the table. So,
might have a negative effect. One way to work around this (was discussed on some optimizer call and discarded) : only inject NOT NULL conditions for which EITS data says that they have selectivity less than 50%. (This is much easier to do than to fix MDEV-19424). | |||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-09 ] | |||||||||||||||||||||||||||
|
Takeaway from yesterday discussion with Varun:
| |||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-05-14 ] | |||||||||||||||||||||||||||
|
Explaining the problem with estimates which I saw while running the main suite Basic information
Lets break into 2 case Case 1: Without NULL rejecting conditions added for range analysis
Range createdquick range select, key groups_dt, length: 35 Case 2:
Range createdquick range select, key groups_dt, length: 70 | |||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-05-14 ] | |||||||||||||||||||||||||||
|
The main concern here is that we see the estimates for rows for table 'g' is 13 in case (1) The reason for this is that with NULL rejecting conditions the range is extended for key(groups_dt) and now There is a code which tries to use Range estimates for Ref access, so the idea is if we have a range that is a prefix of the ref access and the estimates for range access is lower than that of ref access, then use So for cases like above we have the where clause as
Before the null-rejecting part the range that was created was only for 1 keypart(domain) and With the null rejecting conditions added the range no more remains the prefix of the ref access and | |||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-05-15 ] | |||||||||||||||||||||||||||
|
On the optimizer call it was decided to solve the problem above (that is reuse range estimates when range is a prefix of ref access and the estimates for range is less than the estimates for ref access) by creating NULL rejecting conditions only for the first key part of the index. Also it was decided to introduce an optimizer_swith_flag for this optimization. | |||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-22 ] | |||||||||||||||||||||||||||
|
Provided review input for the latest patch: https://lists.launchpad.net/maria-developers/msg11841.html | |||||||||||||||||||||||||||
| Comment by Igor Babaev [ 2019-05-23 ] | |||||||||||||||||||||||||||
|
The task can be formulated in a very simple way: (1) can be resolved by finding all null rejecting fields in the WHERE condition (this is similar to how we find null rejecting tables) and choosing only those fields that participate in indexes. 'Early' null filtering is applied to the keys used for equi-joins, while in this task we need keys used for range access. | |||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-07-03 ] | |||||||||||||||||||||||||||
|
The patch | |||||||||||||||||||||||||||
| Comment by Igor Babaev [ 2019-09-03 ] | |||||||||||||||||||||||||||
|
This task was pushed into 10.5. | |||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-09-12 ] | |||||||||||||||||||||||||||
|
Here is the commit message for the patch pushed to 10.5
th patch adds a new optimizer_swtich not_null_range_scan, this is currently turned off. |