[MDEV-14569] Query runs slower using a UNIQUE KEY vs. a KEY Created: 2017-12-04  Updated: 2022-08-16  Resolved: 2018-03-05

Status: Closed
Project: MariaDB Server
Component/s: Optimizer, Storage Engine - InnoDB
Affects Version/s: 10.2.10
Fix Version/s: N/A

Type: Bug Priority: Minor
Reporter: Chris Adams Assignee: Alice Sherepa
Resolution: Duplicate Votes: 0
Labels: None
Environment:

CentOS 7.4 x86_64


Attachments: File Users.sql     Text File explain-email.txt     Text File explain-name.txt     File tables.sql    
Issue Links:
Duplicate
duplicates MDEV-8306 Complete cost-based optimization for ... Stalled

 Description   

I have an old Request Tracker 3.6 setup using MariaDB 10.2.10 (recently rebuilt, was on MySQL 5.5 before and did not have this issue). I noticed that a query that includes searching by the requester Name is much slower than searching by the EmailAddress. When I look at the SQL query generated, the only difference is searching by the Users.Name column vs. the Users.EmailAddress column. The Name column is a UNIQUE KEY while the EmailAddress column is just a KEY (each of the keys has just the one column). That doesn't seem like it should cause a big difference.

I'm attaching the Users table schema as well as "explain extended" of the two queries.



 Comments   
Comment by Alice Sherepa [ 2018-01-18 ]

please provide output of

SHOW CREATE TABLE Tickets;
SHOW CREATE TABLE Groups;
SHOW CREATE TABLE CachedGroupMembers;

Comment by Alice Sherepa [ 2018-01-19 ]

It looks like difference in this execution plans is not between key vs unique key,
but in using apid_status key vs apid_lastupdated, so whether optimizer choose index on column, used in ORDER BY clause or not.
Please see Sergei Petrunia's explanation of this problem here: MDEV-14621

Comment by Chris Adams [ 2018-01-19 ]

Okay. Is there a reasonably-clean way to influence the optimizer to get the desired order? Unfortunately, I probably can't change the query directly (RT's query builder doesn't expose the whole query).

Thanks.

Comment by Alice Sherepa [ 2018-01-24 ]

Hi Chris,
I do not have the exact solution for this case, but idea is to add some coverage index, like ALTER TABLE Tickets ADD INDEX newkey (Status, Lastupdated), so the optimizer will use it instead of the others.
Btw you have 2 identical indexes on column (`Owner`) in table Tickets (is it by design or just accidentally)

Comment by Chris Adams [ 2018-02-27 ]

Sorry, it took me a bit to get back to this... yes, that one index is purely an accident. Thanks for noting it! I did add the suggested index, and that did measurably help my query. Thanks!

Comment by Alice Sherepa [ 2018-03-05 ]

Really happy to see your message!)

Comment by Rick James [ 2022-08-16 ]

Name and EmailAddress differ in both NULLness and Uniqueness. Plus a CROSS JOIN is involved.

Is this the order of actions being taken by the Optimizer to decide which index to use?

Step 1: Does the optimizer see UNIQUE together with an "=" test and realize that there can be no more than one row from Users by Name? But when analyzing the Email case, it assumes there could be more than 1 row?

Step 2: Does the Optimizer now decide whether to use the ORDER BY indexing versus some filtering for WHERE.

Step 3: It probes the table and realized there is only one Email row. I see that the table Users first (in EXPLAIN); this tells me that it did discover only one row.

Comment by Rick James [ 2022-08-16 ]

@Chris – I tell users "If an index can entirely handle the WHERE + GROUP BY + ORDER BY, then the LIMIT may be usefully be cut short." If not, then there is the risk that some WHERE clauses can lead to a full table scan before discovering that there are not enough rows to satisfy the LIMIT.

Because of that, I don't like to even suggest an index on the ORDER BY column(s) unless the INDEX is good enough. (I am not talking about "covering", but, for example, all '=' and no ranges in the WHERE.)

Generated at Thu Feb 08 08:14:36 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.