[MDEV-19424] InnoDB's records_in_range estimates are capped at about 50% Created: 2019-05-09 Updated: 2023-04-27 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Optimizer, Storage Engine - InnoDB |
| Affects Version/s: | 10.1, 10.2, 10.3, 10.4 |
| Fix Version/s: | 10.4 |
| Type: | Bug | Priority: | Major |
| Reporter: | Sergei Petrunia | Assignee: | Sergei Petrunia |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | records_in_range | ||
| Issue Links: |
|
||||||||||||||||||||
| Description |
|
InnoDB's records_in_range estimates seem to be capped at ~50% of the table. (We used to observe this on various occasions before but I haven't been able to find an MDEV for this). If I pass a range that contains a bigger fraction of the table, the estimated number of rows is still around 50% of the total rows in the table. Test dataset:
Now, both a IS NOT NULL or b IS NOT NULL match 900K rows (90% of the table).
When the optimizer was simple, this property was not a problem. A simple optimizer would only use range estimates to construct range access. Range access is cheaper than full table if it covers about 30% of the table. Returning 50% of the table instead of 90% was not an issue. A more advanced optimizer also attempts to use range estimates for condition selectivity, etc. Here, returning 50% selectivity instead of 90% is a problem. (One must take into account that selectivity is computed for multiple indexes. For example, for 5 indexes 0.5^2= 1/32 . 32x under-estimation of selectivity) |
| Comments |
| Comment by Sergei Petrunia [ 2019-05-09 ] |
|
According to marko, this part of InnoDB code has not been touched by anyone for a very long time. |
| Comment by Marko Mäkelä [ 2019-05-28 ] |
|
In addition to the systematic error of 50%, ha_innobase::records_in_range() can also return completely made-up numbers due to race condition scenarios caused by too minimal locking. Refer to inexact: in btr_estimate_n_rows_in_range_on_level() and diverged in btr_estimate_n_rows_in_range_low(). |
| Comment by Andrei Elkin [ 2019-06-28 ] |
|
|
| Comment by Valerii Kravchuk [ 2020-06-28 ] |
|
See also upstream https://bugs.mysql.com/bug.php?id=73386. |