[MDEV-5950] EITS: bad estimate for very skewed distributions Created: 2014-03-25 Updated: 2014-03-26 Resolved: 2014-03-26 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | None |
| Affects Version/s: | 10.0.9 |
| Fix Version/s: | 10.0.10 |
| Type: | Bug | Priority: | Major |
| Reporter: | Sergei Petrunia | Assignee: | Sergei Petrunia |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | eits | ||
| Issue Links: |
|
| Description |
|
After fix for
filtered% used to be 50%, now it's 99.22%. Both estimates are very wrong:
But the new one is even worse than before. |
| Comments |
| Comment by Sergei Petrunia [ 2014-03-25 ] | ||||||||||||||
|
So, the histogram shows that all values are "1" (or very close to 1). We should be able to determine that value of 0 (which is equal to minimum) occupies zero buckets, not 99.22% of buckets. | ||||||||||||||
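To make the observation above concrete, here is a minimal sketch of an equi-height histogram built over a heavily skewed column. It is not the server's code: the normalized-endpoint layout, the helper names `build_endpoints`, `buckets_ending_at` and `skewed_sample`, and the sample data (a single 0 among many 1s) are all assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: right endpoints of an equi-height histogram,
// normalized to [0, 1] between the column minimum and maximum.
std::vector<double> build_endpoints(std::vector<double> values, std::size_t buckets) {
    assert(!values.empty() && buckets > 0);
    std::sort(values.begin(), values.end());
    const double lo = values.front();
    const double range = values.back() - lo;
    std::vector<double> endpoints;
    endpoints.reserve(buckets);
    for (std::size_t i = 1; i <= buckets; ++i) {
        // Index of the last row that falls into bucket i.
        std::size_t idx = i * values.size() / buckets;
        if (idx > 0) --idx;
        endpoints.push_back(range > 0 ? (values[idx] - lo) / range : 0.0);
    }
    return endpoints;
}

// Number of buckets whose right endpoint sits exactly at position pos.
std::size_t buckets_ending_at(const std::vector<double>& endpoints, double pos) {
    return static_cast<std::size_t>(
        std::count(endpoints.begin(), endpoints.end(), pos));
}

// Hypothetical skewed column: `ones` copies of 1 plus a single 0.
std::vector<double> skewed_sample(std::size_t ones) {
    std::vector<double> v(ones, 1.0);
    v.push_back(0.0);
    return v;
}
```

With `build_endpoints(skewed_sample(1024), 4)`, every endpoint comes out as 1.0, so `buckets_ending_at(e, 0.0)` is 0: in this model, the minimum value does not fill any bucket.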
| Comment by Sergei Petrunia [ 2014-03-25 ] | ||||||||||||||
|
This looks like an edge case, but it is one that I think we could (and should) handle. | ||||||||||||||
| Comment by Sergei Petrunia [ 2014-03-25 ] | ||||||||||||||
|
Debugging:
Histogram::point_selectivity (this=0x7fffca8829f8, pos=0, avg_sel=0.5)
sel= avg_sel * used_buckets / avg_buckets_per_value;
Histogram::point_selectivity() is invoked with pos=0. This is correct.
Proposed solution: when we're looking at the cell, compare its left endpoint with the right endpoint. If our constant is closer to the left endpoint, assume that the cell is unoccupied. | ||||||||||||||
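The proposed check can be sketched as follows. This is a simplified model, not the actual `Histogram::point_selectivity()` implementation: the endpoint vector, the function names `count_used_buckets` and `point_selectivity_sketch`, and the rule for counting neighbouring cells with the same right endpoint are assumptions. Only the final formula is taken from the debugging output above. What the server does once a cell is considered unoccupied is not stated in the comment, so the sketch just reports a bucket count of zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// endpoints[i] is the right endpoint of cell i, normalized to [0, 1];
// the left endpoint of cell 0 is taken to be 0 (assumed layout).
std::size_t count_used_buckets(const std::vector<double>& endpoints, double pos) {
    assert(!endpoints.empty());
    // Find the first cell whose right endpoint is not below pos.
    std::size_t i = 0;
    while (i + 1 < endpoints.size() && endpoints[i] < pos) ++i;
    const double left = (i == 0) ? 0.0 : endpoints[i - 1];
    const double right = endpoints[i];
    // Proposed check: a constant nearer the left endpoint means the
    // cell is assumed to be unoccupied by that constant.
    if (pos - left < right - pos) return 0;
    // Otherwise count this cell plus following cells that share its
    // right endpoint (illustrative assumption).
    std::size_t used = 1;
    while (i + used < endpoints.size() && endpoints[i + used] == right) ++used;
    return used;
}

// The formula quoted in the debugging output.
double point_selectivity_sketch(double avg_sel, std::size_t used_buckets,
                                double avg_buckets_per_value) {
    assert(avg_buckets_per_value > 0);
    return avg_sel * used_buckets / avg_buckets_per_value;
}
```

For a histogram whose endpoints are all 1.0, `count_used_buckets(endpoints, 0.0)` returns 0 while `count_used_buckets(endpoints, 1.0)` returns the full bucket count, which matches the expectation that the value 0 occupies zero buckets.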
| Comment by Sergei Petrunia [ 2014-03-26 ] | ||||||||||||||
|
The issue is fixed by the new patch for
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE t1 ALL NULL NULL NULL NULL 1025 49.61 Using where
i.e. the patch improves the estimation. |