[MDEV-8987] Equality propagation + extended keys make range analysis very slow Created: 2015-10-22  Updated: 2017-11-05

Status: Open
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.1.9
Fix Version/s: 10.1

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: None


 Description   

Based on a user case provided by stephanevaroqui. I am not sure if the table definition could be publicly shared so I'm making up my own.

Consider a query:

explain 
SELECT 
  t1.pk, t2.order_col, t2.str_cp, t2.pk 
FROM 
  t1 LEFT JOIN t2 ON (t1.key1 = t2.pk)  
WHERE 
  t2.key2 = t2.pk AND 
  t2.non_index_col = 'const2' AND 
  t1.validated >= t1.created AND 
  t1.key1 IN (<list of 2600 constants>)   
ORDER BY 
  t2.order_col ASC;

It works fast when extended_keys=off, but becomes much slower (including EXPLAIN) when extended_keys=ON. MySQL is also affected, starting from version 5.6. (they've got Extended Keys in 5.6).

+------+-------------+-------+--------+---------------+-----------------+---------+---------+------+---------------------------------------------------------------------+
| id   | select_type | table | type   | possible_keys | key             | key_len | ref     | rows | Extra                                                               |
+------+-------------+-------+--------+---------------+-----------------+---------+---------+------+---------------------------------------------------------------------+
|    1 | SIMPLE      | t1    | range  | key1          | key1            | 4       | NULL    | 2662 | Using index condition; Using where; Using temporary; Using filesort |
|    1 | SIMPLE      | t2    | eq_ref | PRIMARY,key2  | PRIMARY         | 3       | t1.key1 |    1 | Using where                                                         |
+------+-------------+-------+--------+---------------+-----------------+---------+---------+------+---------------------------------------------------------------------+

Some obvious facts:

  • ORDER BY clause seems to be irrelevant
  • LEFT JOIN was converted into inner join because t2.non_index_col = 'const2' filters out NULL-complemented rows.

The following are my guesses about what is going on:

The optimizer sees these conditions
t1.key1 IN (<list of 2600 constants>)
t1.key1 = t2.pk
t2.key2 = t2.pk

And when "Extended Keys" feature is on, it also sees that INDEX(t2.key2) is
actually INDEX(t2.key2, t2.pk).

For that index, we can infer:
t2.key2 IN (<list of 2600 constants>)
t2.pk IN (<list of 2600 constants>)

if we name constants c1,c2, ... c2600, then the optimizer will consider all kinds of possible ranges in form t2.key2= c[i] AND t2.pk=c[j]. There are in total 2600*2600=6,760,000 such ranges. I guess that it is the attempt to probe them that makes things slow.

The optimizer fails to take into account the fact that it is known that t2.key2=t2.pk, which means we only need ranges that have i=j, and there are only 2600 such ranges.



 Comments   
Comment by Sergei Petrunia [ 2015-10-22 ]

A possible way to fix this is to amend the range optimizer (opt_range.cc: key_and() specifically) to be aware about present multiple equalities between key parts.

Comment by Sergei Petrunia [ 2015-10-22 ]

Also waiting on stephane@skysql.com (or is the user name stephanevaroqui) to confirm we will be able to try out the patch.

Generated at Thu Feb 08 07:31:17 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.