[MDEV-17425] Improve range optimizer to stop evaluating cost using index dive when reaching the cost of any competing EQU_REF range Created: 2018-10-11 Updated: 2018-11-13 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Optimizer |
| Affects Version/s: | 10.1.24 |
| Fix Version/s: | 10.1 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Vincent Pelletier | Assignee: | Sergei Petrunia |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Environment: |
GNU/Linux |
||
| Issue Links: |
|
||||||||
| Description |
|
The following query pattern: I would like to thank Stephane Varoqui for his huge help in getting me to understand this, and for providing me with a template on how this should be reported, as it is still quite a bit above my head. Hopefully it will all make sense to mariadb developers. Cost R1 is cap by the cost of 1000 rows. Cost of R2 have minimum constant cost of the size of the range 200(index dive cost) + SUM(index dives result). For each index dive the reduction should be cancelled when it reach R1 cost if not the range can consume possibly in the worth case 200 IO on disk and end up picking the R1 anyway. Back porting https://dev.mysql.com/worklog/task/?id=7170 could also help if index does not stay in memory the cost to add 200 index dive on disk is worth the cost of 1000 pk lookup. How to reproduce (I stripped several columns which are not involved in the query, i_eq_ref is "uid" and i_ref is "security_uid"):
All the timeings below are with a hot cache (to be fair to each measure), but in reality this query will virtually always run on cold cache: it is scanning a pre-determined set of uid values, so each query will pull a fresh set of rows. With around 30 such queries running concurrently, "explain" and execution time increase to around 20 seconds (io starvation ?). Explain without "force index":
Select without "force index": 1.22 sec
Select with "force index (PRIMARY)": 0.04 sec [EDIT]: damn, formating is hard here |
| Comments |
| Comment by VAROQUI Stephane [ 2018-10-11 ] | |||||||||||||||||||||
|
Is it possible provide a plan and time without the joins only using catalog table with the conditions on it | |||||||||||||||||||||
| Comment by Vincent Pelletier [ 2018-10-12 ] | |||||||||||||||||||||
|
My bad, I should have removed these. For reference, the joins were 2 LEFT-JOINs based on catalog, and 1 inner join from one of the right-hand table to a 3rd. Here are the simplified results, on a more evenly idle database as there are much fewer users at the time I took these measures. Explain without "force index":
Select without "force index":
Explain with "force index (PRIMARY)", which is the column the optimiser ended up picking above:
Select with "force index (PRIMARY)":
(also, I accidentally found the post preview button, yay) | |||||||||||||||||||||
| Comment by VAROQUI Stephane [ 2018-10-12 ] | |||||||||||||||||||||
|
Can you make an experimentation with running EITS histogrammes. [mariadb] Just run And re run experimentation. I guess with such cold cache EITS is a must and you get it since 10.1 | |||||||||||||||||||||
| Comment by Sergei Petrunia [ 2018-10-12 ] | |||||||||||||||||||||
|
At first glance, this issue can be addressed by the fix for Can you please
| |||||||||||||||||||||
| Comment by Sergei Petrunia [ 2018-10-12 ] | |||||||||||||||||||||
|
(note to self: should we still eq_range_index_dive_limit=0 (no limit) in 10.4?) | |||||||||||||||||||||
| Comment by Julien Muchembled [ 2018-10-12 ] | |||||||||||||||||||||
|
Still slow with 10.2.18, regardless the value of eq_range_index_dive_limit:
With 10.3.10, it's fast, regardless the value of eq_range_index_dive_limit:
| |||||||||||||||||||||
| Comment by Sergei Petrunia [ 2018-10-16 ] | |||||||||||||||||||||
The WHERE condition has
and the indexes are:
which means there are 200K ranges produced? (or there's some limit?) The EXPLAIN from MariaDB-10.3 shows | |||||||||||||||||||||
| Comment by VAROQUI Stephane [ 2018-10-16 ] | |||||||||||||||||||||
|
Could it be that this patch for eq_range_index_dive_limit is only for range on unique key that would make it useless as for unique key the range is the number of values inside IN close (2) - This is a unique key, and we have conditions for all its |