[MDEV-19474] Histogram statistics are used even with optimizer_use_condition_selectivity=3 Created: 2019-05-14 Updated: 2021-06-10 Resolved: 2021-02-16 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Optimizer |
| Affects Version/s: | 10.1, 10.2, 10.3, 10.4 |
| Fix Version/s: | 10.2.38, 10.3.29, 10.4.19, 10.5.10 |
| Type: | Bug | Priority: | Major |
| Reporter: | Varun Gupta (Inactive) | Assignee: | Varun Gupta (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Description |
|
Dataset
So with optimizer_use_condition_selectivity=4 , we get filtered as 39.84
So with optimizer_use_condition_selectivity=3 , we get * filtered as 39.84*
So the problem here is we don't stop using histogram statistics with optimizer_use_condition_selectivity=3. |
| Comments |
| Comment by Varun Gupta (Inactive) [ 2020-05-31 ] | ||||||||||||||||||||||||||||||||||||||||||||||||
|
So here the case is when optimizer_use_condition_selectivity=4 , we read the histogram statistics and then we try to use these statistics even when the optimizer_use_condition_selectivity=3, I think we need to add a check of not using the available histogram statistics for optimizer_use_condition_selectivity=3. | ||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2020-05-31 ] | ||||||||||||||||||||||||||||||||||||||||||||||||
|
Patch | ||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2020-06-01 ] | ||||||||||||||||||||||||||||||||||||||||||||||||
|
for the first patch the argument was to fix it in a way such that we cache the result if the histogram can be used or not. Made a small patch for this: | ||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Michael Widenius [ 2021-01-29 ] | ||||||||||||||||||||||||||||||||||||||||||||||||
|
Patch ok | ||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Michael Widenius [ 2021-01-29 ] | ||||||||||||||||||||||||||||||||||||||||||||||||
|
ok to push after review comments are handled | ||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2021-02-02 ] | ||||||||||||||||||||||||||||||||||||||||||||||||
|
Latest patch addressing the review | ||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2021-02-02 ] | ||||||||||||||||||||||||||||||||||||||||||||||||
|
Analysis
this is run and so we have EITS for table t1 in the statistical tables. So lets try to explain the cases:
So for the execution #1 the statistics in the TABLE structure are not read from
which return false
and we set statistics_for_tables_is_needed= true; Now comes to the part where we start to read the statistics from table share We have this snippet:
So as optimizer_use_condition_selectivity > 3 here we would also read Now coming up with the second case:
So for the execution #2 we read the TABLE structure from the internal memory
which returns true as the stats have been read
and then we execute the branch inside this if condition
so tl->table->stats_is_read is set to TRUE Now when we call *get_column_range_cardinality * which is called when we calculate selectivity for the predicate a= 2 then we check
which returns true and we use the histograms to evaluate the selectivity for a=2 | ||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Igor Babaev [ 2021-02-03 ] | ||||||||||||||||||||||||||||||||||||||||||||||||
|
I think that the following straightforward patch is the most appropriate for resolution of this bug:
|