Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-8987

Equality propagation + extended keys make range analysis very slow

    XMLWordPrintable

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.1.9
    • 10.1
    • Optimizer
    • None

    Description

      Based on a user case provided by stephanevaroqui. I am not sure if the table definition could be publicly shared so I'm making up my own.

      Consider a query:

      explain 
      SELECT 
        t1.pk, t2.order_col, t2.str_cp, t2.pk 
      FROM 
        t1 LEFT JOIN t2 ON (t1.key1 = t2.pk)  
      WHERE 
        t2.key2 = t2.pk AND 
        t2.non_index_col = 'const2' AND 
        t1.validated >= t1.created AND 
        t1.key1 IN (<list of 2600 constants>)   
      ORDER BY 
        t2.order_col ASC;

      It works fast when extended_keys=off, but becomes much slower (including EXPLAIN) when extended_keys=ON. MySQL is also affected, starting from version 5.6. (they've got Extended Keys in 5.6).

      +------+-------------+-------+--------+---------------+-----------------+---------+---------+------+---------------------------------------------------------------------+
      | id   | select_type | table | type   | possible_keys | key             | key_len | ref     | rows | Extra                                                               |
      +------+-------------+-------+--------+---------------+-----------------+---------+---------+------+---------------------------------------------------------------------+
      |    1 | SIMPLE      | t1    | range  | key1          | key1            | 4       | NULL    | 2662 | Using index condition; Using where; Using temporary; Using filesort |
      |    1 | SIMPLE      | t2    | eq_ref | PRIMARY,key2  | PRIMARY         | 3       | t1.key1 |    1 | Using where                                                         |
      +------+-------------+-------+--------+---------------+-----------------+---------+---------+------+---------------------------------------------------------------------+

      Some obvious facts:

      • ORDER BY clause seems to be irrelevant
      • LEFT JOIN was converted into inner join because t2.non_index_col = 'const2' filters out NULL-complemented rows.

      The following are my guesses about what is going on:

      The optimizer sees these conditions
      t1.key1 IN (<list of 2600 constants>)
      t1.key1 = t2.pk
      t2.key2 = t2.pk

      And when "Extended Keys" feature is on, it also sees that INDEX(t2.key2) is
      actually INDEX(t2.key2, t2.pk).

      For that index, we can infer:
      t2.key2 IN (<list of 2600 constants>)
      t2.pk IN (<list of 2600 constants>)

      if we name constants c1,c2, ... c2600, then the optimizer will consider all kinds of possible ranges in form t2.key2= c[i] AND t2.pk=c[j]. There are in total 2600*2600=6,760,000 such ranges. I guess that it is the attempt to probe them that makes things slow.

      The optimizer fails to take into account the fact that it is known that t2.key2=t2.pk, which means we only need ranges that have i=j, and there are only 2600 such ranges.

      Attachments

        Activity

          People

            psergei Sergei Petrunia
            psergei Sergei Petrunia
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.