Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-14569

Query runs slower using a UNIQUE KEY vs. a KEY

Details

    Description

      I have an old Request Tracker 3.6 setup using MariaDB 10.2.10 (recently rebuilt, was on MySQL 5.5 before and did not have this issue). I noticed that a query that includes searching by the requester Name is much slower than searching by the EmailAddress. When I look at the SQL query generated, the only difference is searching by the Users.Name column vs. the Users.EmailAddress column. The Name column is a UNIQUE KEY while the EmailAddress column is just a KEY (each of the keys has just the one column). That doesn't seem like it should cause a big difference.

      I'm attaching the Users table schema as well as "explain extended" of the two queries.

      Attachments

        1. explain-email.txt
          3 kB
        2. explain-name.txt
          3 kB
        3. tables.sql
          2 kB
        4. Users.sql
          1 kB

        Issue Links

          Activity

            alice Alice Sherepa added a comment -

            please provide output of

            SHOW CREATE TABLE Tickets;
            SHOW CREATE TABLE Groups;
            SHOW CREATE TABLE CachedGroupMembers;
            

            alice Alice Sherepa added a comment - please provide output of SHOW CREATE TABLE Tickets; SHOW CREATE TABLE Groups; SHOW CREATE TABLE CachedGroupMembers;
            alice Alice Sherepa added a comment -

            It looks like difference in this execution plans is not between key vs unique key,
            but in using apid_status key vs apid_lastupdated, so whether optimizer choose index on column, used in ORDER BY clause or not.
            Please see Sergei Petrunia's explanation of this problem here: MDEV-14621

            alice Alice Sherepa added a comment - It looks like difference in this execution plans is not between key vs unique key, but in using apid_status key vs apid_lastupdated, so whether optimizer choose index on column, used in ORDER BY clause or not. Please see Sergei Petrunia's explanation of this problem here: MDEV-14621
            cmadams Chris Adams added a comment -

            Okay. Is there a reasonably-clean way to influence the optimizer to get the desired order? Unfortunately, I probably can't change the query directly (RT's query builder doesn't expose the whole query).

            Thanks.

            cmadams Chris Adams added a comment - Okay. Is there a reasonably-clean way to influence the optimizer to get the desired order? Unfortunately, I probably can't change the query directly (RT's query builder doesn't expose the whole query). Thanks.
            alice Alice Sherepa added a comment - - edited

            Hi Chris,
            I do not have the exact solution for this case, but idea is to add some coverage index, like ALTER TABLE Tickets ADD INDEX newkey (Status, Lastupdated), so the optimizer will use it instead of the others.
            Btw you have 2 identical indexes on column (`Owner`) in table Tickets (is it by design or just accidentally)

            alice Alice Sherepa added a comment - - edited Hi Chris, I do not have the exact solution for this case, but idea is to add some coverage index, like ALTER TABLE Tickets ADD INDEX newkey (Status, Lastupdated), so the optimizer will use it instead of the others. Btw you have 2 identical indexes on column (`Owner`) in table Tickets (is it by design or just accidentally)
            cmadams Chris Adams added a comment -

            Sorry, it took me a bit to get back to this... yes, that one index is purely an accident. Thanks for noting it! I did add the suggested index, and that did measurably help my query. Thanks!

            cmadams Chris Adams added a comment - Sorry, it took me a bit to get back to this... yes, that one index is purely an accident. Thanks for noting it! I did add the suggested index, and that did measurably help my query. Thanks!
            alice Alice Sherepa added a comment -

            Really happy to see your message!)

            alice Alice Sherepa added a comment - Really happy to see your message!)
            rjasdfiii Rick James added a comment -

            Name and EmailAddress differ in both NULLness and Uniqueness. Plus a CROSS JOIN is involved.

            Is this the order of actions being taken by the Optimizer to decide which index to use?

            Step 1: Does the optimizer see UNIQUE together with an "=" test and realize that there can be no more than one row from Users by Name? But when analyzing the Email case, it assumes there could be more than 1 row?

            Step 2: Does the Optimizer now decide whether to use the ORDER BY indexing versus some filtering for WHERE.

            Step 3: It probes the table and realized there is only one Email row. I see that the table Users first (in EXPLAIN); this tells me that it did discover only one row.

            rjasdfiii Rick James added a comment - Name and EmailAddress differ in both NULLness and Uniqueness. Plus a CROSS JOIN is involved. Is this the order of actions being taken by the Optimizer to decide which index to use? Step 1: Does the optimizer see UNIQUE together with an "=" test and realize that there can be no more than one row from Users by Name? But when analyzing the Email case, it assumes there could be more than 1 row? Step 2: Does the Optimizer now decide whether to use the ORDER BY indexing versus some filtering for WHERE. Step 3: It probes the table and realized there is only one Email row. I see that the table Users first (in EXPLAIN); this tells me that it did discover only one row.
            rjasdfiii Rick James added a comment -

            @Chris – I tell users "If an index can entirely handle the WHERE + GROUP BY + ORDER BY, then the LIMIT may be usefully be cut short." If not, then there is the risk that some WHERE clauses can lead to a full table scan before discovering that there are not enough rows to satisfy the LIMIT.

            Because of that, I don't like to even suggest an index on the ORDER BY column(s) unless the INDEX is good enough. (I am not talking about "covering", but, for example, all '=' and no ranges in the WHERE.)

            rjasdfiii Rick James added a comment - @Chris – I tell users "If an index can entirely handle the WHERE + GROUP BY + ORDER BY, then the LIMIT may be usefully be cut short." If not, then there is the risk that some WHERE clauses can lead to a full table scan before discovering that there are not enough rows to satisfy the LIMIT. Because of that, I don't like to even suggest an index on the ORDER BY column(s) unless the INDEX is good enough. (I am not talking about "covering", but, for example, all '=' and no ranges in the WHERE.)

            People

              alice Alice Sherepa
              cmadams Chris Adams
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.