[MDEV-21554] Crash in JOIN_CACHE_BKAH::skip_index_tuple when mrr=on and join_cache_level=6+ Created: 2020-01-22 Updated: 2020-08-25 Resolved: 2020-03-20 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Optimizer |
| Affects Version/s: | 10.3.22, 10.4.12, 10.5.1 |
| Fix Version/s: | 10.3.23, 10.4.13, 10.5.2 |
| Type: | Bug | Priority: | Major |
| Reporter: | Valerii Kravchuk | Assignee: | Igor Babaev |
| Resolution: | Fixed | Votes: | 3 |
| Labels: | crash, mrr, regression |
| Attachments: |
|
| Description |
|
Crash with the following backtrace:
happens for INSERT ... SELECT accessing several tables when these options are set:
This was not the case for the same query and data in older 10.2.x and 10.3.x versions. |
| Comments |
| Comment by Igor Babaev [ 2020-01-24 ] |
|
Analysis:
1. In the function ha_partition::multi_range_key_create_key() the code
assigns the pointer m_mrr_range_current of the type PARTITION_KEY_MULTI_RANGE*
2. In the function partition_multi_range_key_next() the code
assigns to range.ptr the value of partition_key_multi_range->key_multi_range.ptr that is not
3. Finally, in the function Mrr_simple_index_reader::init() the call in the code
brings us to execution of the code
that returns some garbage from mrr_cur_range.ptr; |
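The three steps above read as a layering problem: a wrapping layer stores a pointer to its own bookkeeping structure in the range's `ptr` field, and the layer above later interprets that field as the pointer it originally supplied. The following is a minimal sketch of that pattern, not the server code. All names (`KeyRange`, `UpperInfo`, `WrappedRange`, `wrap_range`, `range_info_buggy`, `range_info_unwrapped`) are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the structures named in the analysis;
// these are NOT the MariaDB definitions.
struct KeyRange {
    int key;    // lookup key for this range
    void* ptr;  // opaque "range info" owned by whoever built the range
};

// What the upper (SQL) layer attaches to each range in this sketch.
struct UpperInfo {
    std::string label;
};

// What the wrapping (partitioning) layer keeps per range in this sketch.
struct WrappedRange {
    KeyRange inner;   // range handed down; inner.ptr points back to this wrapper
    void* upper_ptr;  // the original pointer supplied by the upper layer
};

// Wrap a range: remember the caller's pointer and store a pointer
// to the wrapper itself in the range that is passed further down.
inline void wrap_range(const KeyRange& from_upper, WrappedRange& out) {
    out.upper_ptr = from_upper.ptr;
    out.inner.key = from_upper.key;
    out.inner.ptr = &out;
}

// Problematic variant: hands the wrapper's own pointer to the upper layer,
// which would then cast it to UpperInfo* and read unrelated memory.
inline void* range_info_buggy(const WrappedRange& w) {
    return w.inner.ptr;
}

// Safer variant: follows the wrapper pointer back to the original value.
inline void* range_info_unwrapped(const WrappedRange& w) {
    assert(w.inner.ptr != nullptr);
    return static_cast<WrappedRange*>(w.inner.ptr)->upper_ptr;
}
```

In this sketch the upper layer only gets a usable `UpperInfo*` from `range_info_unwrapped`; the value returned by `range_info_buggy` points at a `WrappedRange` instead.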
| Comment by Sergei Petrunia [ 2020-01-26 ] |
|
igor, I've looked again at the patch I was trying yesterday and figured out that it had a basic error. After I fixed it, the test case no longer crashes and returns the expected number of rows. I'm attaching the patch: mdev21544-fix-crash.diff
The patch is missing:
|
| Comment by Igor Babaev [ 2020-02-01 ] |
|
This is a setting that is simpler than the one used in the bug report and yet with this setting the bug still manifests itself though in a different way:
(The last line is not necessary with default settings in 10.3). Here's a simple test case that demonstrates the problem:
The tables t0 and t1 are of the same structure and populated with the same set of rows. The only difference is that t1 is partitioned.
and that the execution employs index condition pushdown
At the same time for the first query the result is correct:
and to get this result the executor uses index condition pushdown as well
We get the correct result set for the second query with the setting
|
| Comment by Sergei Petrunia [ 2020-02-04 ] |
|
igor, on the question of whether range_info in the sequence of ranges passed from ha_partition to individual partitions could be the same as range_info that the SQL layer has passed to ha_partition:
Here are the places where ha_partition makes use of range_info passed to individual partitions. (I'll explain the reasoning below.)
In ha_partition::handle_ordered_index_scan:
in ha_partition::handle_ordered_next:
The idea here is to get the number of the range we've got the record from. This is used in "Ordered MRR scans".
An explanation of how "Ordered MRR scans" are done:
An Ordered MRR scan is an MRR scan done with the HA_MRR_SORTED flag. It has these properties:
ha_partition supports Ordered MRR scans, if the underlying partitions do. It does it as follows:
Here, the "earliest" range is the one with key=2, and the partitions that have it are p1 and p2.
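One way to picture the selection step described here is the sketch below. It assumes that each partition exposes the sequence number of the range its current row came from, and it only shows how the partitions holding the "earliest" range could be identified. The type `PartitionCursor` and the function `partitions_in_earliest_range` are hypothetical and are not taken from ha_partition.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical view of one partition's current position in an ordered MRR scan.
struct PartitionCursor {
    bool has_row;          // false once the partition is exhausted
    std::size_t range_no;  // sequence number of the range the current row came from
    int key;               // key value of the current row
};

// Return the indexes of the partitions whose current row belongs to the
// earliest (lowest-numbered) range among all non-exhausted partitions.
inline std::vector<std::size_t>
partitions_in_earliest_range(const std::vector<PartitionCursor>& parts) {
    std::size_t earliest = std::numeric_limits<std::size_t>::max();
    for (const auto& p : parts) {
        if (p.has_row && p.range_no < earliest) {
            earliest = p.range_no;
        }
    }
    std::vector<std::size_t> result;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (parts[i].has_row && parts[i].range_no == earliest) {
            result.push_back(i);
        }
    }
    return result;
}
```

With three cursors where p0 is positioned in a later range and p1 and p2 are both positioned in the range with key=2, the function returns the indexes of p1 and p2. Only those partitions then need to be considered when returning rows for that range.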
|
| Comment by Sergei Petrunia [ 2020-02-04 ] |
|
Now, one may ask: why can't we just merge the ordered streams? The only explanation I have is that the above code will do fewer key comparisons. |
| Comment by Sergei Petrunia [ 2020-02-04 ] |
|
I'll commit a patch shortly with comments for ha_partition class members. |
| Comment by Igor Babaev [ 2020-02-25 ] |
|
The bug was introduced in the commit 8eeb689e9fc57afe19a8dbff354b5f9f167867a9: The bug is in the code of the function partition_multi_range_key_skip_index_tuple(). Here's how we can reproduce the bug in partition_multi_range_key_skip_record().
and populate it with the same data as in the previous test case.
Use the following table t2:
After executing the statement
and changing the setting to use BKA+MRR
run the queries
For the first query you have an empty result set and this is expected:
For the second query you have the result set
and this is incorrect. |
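The comment above does not spell out the defect in the two callbacks. Assuming it is of the same kind as in the earlier analysis (a wrapper-level pointer reaching a callback that expects the upper layer's own range info), a forwarding callback could be sketched as below. `WrappedInfo`, `SkipFn`, and `forward_skip` are hypothetical names for illustration only and do not reproduce the actual fix.

```cpp
#include <functional>

// Hypothetical per-range record kept by the wrapping layer.
struct WrappedInfo {
    void* upper_ptr;  // range info originally supplied by the upper layer
};

// Hypothetical shape of the upper layer's "skip this tuple?" callback.
using SkipFn = std::function<bool(void* range_info)>;

// Forward a skip check coming from a partition to the upper layer's callback.
// The partition reports the wrapper's pointer, so it is translated back to
// the upper layer's pointer before the call.
inline bool forward_skip(const SkipFn& upper_skip, void* range_info_from_partition) {
    auto* wrapped = static_cast<WrappedInfo*>(range_info_from_partition);
    return upper_skip(wrapped->upper_ptr);
}
```

Passing `range_info_from_partition` straight through would hand the upper callback a `WrappedInfo*` that it would misinterpret, which under this assumption could explain both a crash and a wrong result set.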
| Comment by Igor Babaev [ 2020-03-20 ] |
|
A fix for this bug was pushed into 10.3. |