[MDEV-30654] Presence of a low cost index that does match the order by causes the optimizer to choose a sub-optimal order by index Created: 2023-02-14  Updated: 2024-02-07

Status: Confirmed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.4.20, 10.4.27, 10.6.11
Fix Version/s: 10.4, 10.5, 10.6

Type: Bug Priority: Critical
Reporter: Rob Schwyzer Assignee: Michael Widenius
Resolution: Unresolved Votes: 1
Labels: triage
Environment:

CentOS 7 Baremetal


Issue Links:
Relates
relates to MDEV-30532 Wrong index chosen by the optimizer f... Confirmed

 Description   

Opening this based on SAMU-98

Please refer to that link for repro instructions.

General concept is-

SELECT id
FROM   table
WHERE  varcharcol = 'avalue'
       AND varbool = 1
       AND vargroup = 'auuid'
ORDER  BY varnumber
LIMIT  0, 20;

Where-

CREATE TABLE `table` (
  `id` char(32) NOT NULL DEFAULT '',
  `varcharcol` varchar(80) DEFAULT NULL,
  `anothervarcharcol` varchar(255) DEFAULT NULL,
  `adatetime` datetime DEFAULT NULL,
  `varbool` tinyint(1) DEFAULT NULL,
  `vargroup` varchar(32) DEFAULT NULL,
  `varnumber` varchar(130) DEFAULT NULL,  
  PRIMARY KEY (`id`),
  KEY `key1` (`vargroup`,`varbool`),
  KEY `key2` (`varcharcol`,`adatetime`),
  KEY `key3` (`varcharcol`,`varnumber`,`anothervarcharcol`),
  KEY `key4` (`varbool`,`varcharcol`,`varnumber`)  
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Note that the rows present in the table matter for the repro to work. Please use the data attached to the linked issue.

WHEN optimizer_use_condition_selectivity=1 (since 10.4.2- in 10.4.1, behavior is the same when optimizer_use_condition_selectivity=4), performing the above repro results in MariaDB's optimizer picking key3. This key is rated poorly for the WHERE condition compared to key1 or key4. However, it is "usable" for the ORDER BY clause. Still, key4 would fit the WHERE and ORDER BY better, and should be chosen.

On the hardware we tested this on, when key4 or even key1 are chosen, execution occurs in less than 1 second. When key3 is chosen, execution takes multiple seconds. Most often we see a three order of magnitude gap between "good" and "bad" behavior.

There is another weirdness to this where if you add IGNORE INDEX(key2) which is not a candidate key and should not be involved, MariaDB's optimizer magically corrects its behavior and picks key4 instead of key3.

The problem seems to boil down to test_if_cheaper_ordering and this code-

      if (best_key < 0 ||
          (select_limit <= MY_MIN(quick_records,best_records) ?
           keyinfo->user_defined_key_parts < best_key_parts :
           quick_records < best_records) ||
          (!is_best_covering && is_covering))

Filing this as a regression as 10.2.27 does not have this problem and that should also be following logic like optimizer_use_condition_selectivity=1, so a regression occurred since then.



 Comments   
Comment by Sasha Pachev [ 2023-02-23 ]

Proposed fix:

diff --git a/sql/sql_select.cc b/sql/sql_select.cc
index ccfe9bb..5629a26 100644
--- a/sql/sql_select.cc
+++ b/sql/sql_select.cc
@@ -28507,7 +28507,7 @@ static bool get_range_limit_read_cost(const JOIN_TAB *tab,
               (select_limit <= MY_MIN(quick_records,best_records) ?
                keyinfo->user_defined_key_parts < best_key_parts :
                quick_records < best_records) ||
-              (!is_best_covering && is_covering))
+              (!is_best_covering && is_covering) || (keyinfo->user_defined_key_parts >= best_key_parts && select_limit <= M
           {
             possible_key.add("chosen", true);
             best_key= nr;

Generated at Thu Feb 08 10:17:53 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.