[MDEV-21136] InnoDB's records_in_range estimates can be way off Created: 2019-11-23 Updated: 2023-06-28 Resolved: 2022-07-25 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.2, 10.3, 10.4, 10.5, 10.6 |
| Fix Version/s: | 10.6.9, 10.7.5, 10.8.4, 10.9.2, 10.10.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Sergei Petrunia | Assignee: | Vladislav Lesin |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | records_in_range | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||
| Description |
|
It is expected that the range estimates that InnoDB returns from records_in_range call are not exact. Returning about 2x more or 2x less than actual value is considered to be still ok. However, we are observing a 4.8x difference on a fairly large table. Unfortunately, the table is at the customer site and cannot be provided here. |
| Comments |
| Comment by Marko Mäkelä [ 2020-03-09 ] | ||
|
I filed MDEV-21895 for a task of refactoring ha_innobase::records_in_range() to address the problems that I am aware of. It might be possible to do something in earlier versions, especially if a repeatable test case can be provided. Note that there are also some strange things around handler::stats.records in handler::info(), as noted in | ||
| Comment by Marko Mäkelä [ 2022-05-06 ] | ||
|
Related to this, BTR_ESTIMATE and btr_cur_t::path_arr should be removed. They are only being used in the function btr_estimate_n_rows_in_range_low(). That function had better implement a specialized version of btr_cur_search_to_nth_level() to gather the path information. | ||
| Comment by Matthias Leich [ 2022-07-04 ] | ||
|
| ||
| Comment by Vladislav Lesin [ 2023-06-26 ] | ||
|
See the slides to understand pages latching order during estimation. | ||
| Comment by Sergei Petrunia [ 2023-06-27 ] | ||
|
vlad.lesin, what are p1 and p2 there? Do I get it right that the slides describe locking for records_in_range(p1, p2) ? Looking at slides #18 and #19... so, for records_in_range('aaa', 'zzz') the code will obtain latches for all pages that hold values between 'aaa' and 'zzz' ? Even if that is thousands of pages? | ||
| Comment by Vladislav Lesin [ 2023-06-28 ] | ||
|
> what are p1 and p2 there? > Looking at slides #18 and #19... so, for records_in_range('aaa', 'zzz') the code will obtain latches for all pages that hold values between 'aaa' and 'zzz' ? Even if that is thousands of pages? No, there is some limit on pages count(9 pages) for per-level walking, if the right page is not reached and the the limit was exceeded, the number of rows is estimated approximately by some formula, otherwise the estimation is exact. See btr_estimate_n_rows_in_range_on_level(), it has quite detailed description. And, as you can see on the slides, the latches are released when we leave some page. The maximum amount of latched pages(red rectangles in the slides) at the same time is 4. |