[MDEV-23811] With large number of indexes optimizer chooses an inefficient plan Created: 2020-09-24  Updated: 2023-02-01  Resolved: 2020-10-08

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: None
Fix Version/s: 10.2.35, 10.3.26, 10.4.16, 10.5.7

Type: Bug Priority: Major
Reporter: Kyle Joiner (Inactive) Assignee: Igor Babaev
Resolution: Fixed Votes: 0
Labels: None
Environment:

10.2


Issue Links:
Problem/Incident
causes MDEV-24117 Memory management problem in statisti... Closed

 Description   

With a large number of indexes the optimizer picks the wrong path.



 Comments   
Comment by Igor Babaev [ 2020-09-25 ]

First I tried torun the reported query for 10.4 with optimizer trace enabled after the following changes for default settings:

set optimizer_switch='index_merge=off';
set optimizer_switch='rowid_filter=off';
set optimizer_use_condition_selectivity=1;

.
The optimizer trace for the query showed that no range condition was extracted for the index incident_indexi0. I changed the WHERE condition to a much simpler OR formula and still could not get range condition for the index. It's interesting that swapping operands in this formula led to the choice the wanted range scan for this index.

After this I came up with a simple test case demonstrating the problem:

create table t1 (
pk int primary key auto_increment, a int, b int,
index idx1(a), index idx2(b)
) engine=myisam;
insert into t1(a,b) values
(5,50), (1,10), (3,30), (7,70), (8,80), (4,40), (2,20), (6,60);
insert into t1(a,b) select a+10, b+100 from t1;
insert into t1(a,b) select a+20, b+200 from t1;
insert into t1(a,b) select a+30, b+300 from t1;
insert into t1(a,b) select a,b from t1;
explain select * from t1 where (a between 2 and 5) or ((a between 3 and 4) and b < 100);

The optimizer chose a full table scan:

MariaDB [test]> explain select * from t1 where (a between 2 and 5) or ((a between 3 and 4) and b < 100);
+------+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+------+-------------+-------+------+---------------+------+---------+------+------+-------------+
|    1 | SIMPLE      | t1    | ALL  | idx1,idx2     | NULL | NULL    | NULL | 64   | Using where |
+------+-------------+-------+------+---------------+------+---------+------+------+-------------+

though for the following equivalent query a range scan was chosen:

MariaDB [test]> explain select * from t1 where ((a between 3 and 4) and b < 100) or (a between 2 and 5);
+------+-------------+-------+-------+---------------+------+---------+------+------+------------------------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                              |
+------+-------------+-------+-------+---------------+------+---------+------+------+------------------------------------+
|    1 | SIMPLE      | t1    | range | idx1,idx2     | idx1 | 5       | NULL | 4    | Using index condition; Using where |
+------+-------------+-------+-------+---------------+------+---------+------+------+------------------------------------+

Comment by Sergei Petrunia [ 2020-10-02 ]

This patch was submitted for review: http://lists.askmonty.org/pipermail/commits/2020-September/014332.html

Comment by Sergei Petrunia [ 2020-10-02 ]

The main part of the patch is ok, but

The fix revealed another two bugs: ...the other in the function tree_or().

I am not sure about the fix for the one in tree_or (don't quite follow the logic).

It is:

@@ -8869,9 +8872,15 @@ tree_or(RANGE_OPT_PARAM *param,SEL_TREE *tree1,SEL_TREE *tree2)
   }
   bool no_imerge_from_ranges= FALSE;
+  SEL_TREE *rt1= tree1;
+  SEL_TREE *rt2= tree2;
   /* Build the range part of the tree for the formula (1) */ 
   if (sel_trees_can_be_ored(param, tree1, tree2, &ored_keys))
   {
+    if (no_merges1)
+      rt1= new SEL_TREE(tree1, TRUE, param);
+    if (no_merges2)
+      rt2= new SEL_TREE(tree2, TRUE, param);
     bool must_be_ored= sel_trees_must_be_ored(param, tree1, tree2, ored_keys);
     no_imerge_from_ranges= must_be_ored;
@@ -8929,12 +8938,6 @@ tree_or(RANGE_OPT_PARAM *param,SEL_TREE *tree1,SEL_TREE *tree2)
   else if (!no_ranges1 && !no_ranges2 && !no_imerge_from_ranges)
   {
     /* Build the imerge part of the tree for the formula (1) */
-    SEL_TREE *rt1= tree1;
-    SEL_TREE *rt2= tree2;
-    if (no_merges1)
-      rt1= new SEL_TREE(tree1, TRUE, param);
-    if (no_merges2)
-      rt2= new SEL_TREE(tree2, TRUE, param);
     if (!rt1 || !rt2 ||
         result->merges.push_back(imerge_from_ranges) ||
         imerge_from_ranges->or_sel_tree(param, rt1) ||

Comment by Sergei Petrunia [ 2020-10-02 ]

Without this fix, there are two differences in .result files:
range.result has:

@@ -1274,7 +1274,7 @@
 5 <= a AND b = 3 OR
 3 <= a;
 id     select_type     table   type    possible_keys   key     key_len ref     rows    Extra
-1      SIMPLE  t1      range   a       a       5       NULL    3       Using where; Using index
+1      SIMPLE  t1      range   a       a       5       NULL    4       Using where; Using index
 SELECT * FROM t1 WHERE

This one seems ok. The optimizer passes a different set of ranges but it is valid.

The other one is not ok:

CURRENT_TEST: main.range_vs_index_merge_innodb
--- /home/psergey/dev-git/10.2/mysql-test/r/range_vs_index_merge_innodb.result  2020-10-01 22:46:56.624684375 +0300
+++ /home/psergey/dev-git/10.2/mysql-test/r/range_vs_index_merge_innodb.reject  2020-10-01 22:54:18.154504261 +0300
@@ -1796,16 +1796,11 @@
 WHERE ( state = 'Alabama' OR state >= 'Colorado' ) AND id != 9 
 OR ( capital >= 'Topeka' OR state = 'Kansas' ) AND state != 'Texas';
 id     select_type     table   type    possible_keys   key     key_len ref     rows    Extra
-1      SIMPLE  t1      range   state,capital   state   71      NULL    8       Using index condition; Using where
+1      SIMPLE  t1      index_merge     state,capital   capital,state   67,71   NULL    5       Using sort_union(capital,state); Using where
 SELECT * FROM t1 FORCE KEY (state,capital) 
 WHERE ( state = 'Alabama' OR state >= 'Colorado' ) AND id != 9 
 OR ( capital >= 'Topeka' OR state = 'Kansas' ) AND state != 'Texas';
 id     state   capital
-4      Florida Tallahassee
-3      Georgia Atlanta
-2      Hawaii  Honolulu
-6      Michigan        Lansing
-7      Pennsylvania    Harrisburg
 8      Virginia        Richmond
 DROP TABLE t1;
 #

Comment by Sergei Petrunia [ 2020-10-02 ]

Let's denote the patch fragment posted in this comment https://jira.mariadb.org/browse/MDEV-23811?focusedCommentId=167583&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-167583
as "The tree_or() fix".

I'm wondering why can't one have a "more local" version of The tree_or() fix, like so:

@@ -8899,11 +8902,12 @@ tree_or(RANGE_OPT_PARAM *param,SEL_TREE *tree1,SEL_TREE *tree2)
     {
       SEL_ARG *key1= tree1->keys[key_no];
       SEL_ARG *key2= tree2->keys[key_no];
-      if (!must_be_ored)
-      {
+      if (!must_be_ored || no_merges1)
         key1->incr_refs();
+
+      if (!must_be_ored || no_merges2)
         key2->incr_refs();
-      }
+
       if ((result->keys[key_no]= key_or(param, key1, key2)))
         result->keys_map.set_bit(key_no);
     }

this one will not copy the whole SEL_TREE objects but rather mark the SEL_ARG objects as shared before passing them to key_or().

But I still don't get why one should do that.

Comment by Sergei Petrunia [ 2020-10-02 ]

A bit more detail: the query

EXPLAIN
SELECT * FROM t1 FORCE KEY (state,capital) 
WHERE (state = 'Alabama' OR state >= 'Colorado') AND id != 9 
OR ( capital >= 'Topeka' OR state = 'Kansas' ) AND state != 'Texas';

calls tree_or() five (5) times.
The action of "The tree_or() fix" is essential in the first tree_or() call.
The first tree_or() call computes OR of SEL_TREEs representing {{state = 'Alabama' OR state >= 'Colorado' }}.
Here, the case is very simple - both SEL_TREEs use ranges on the just one, the same index.
It is not at all clear why one should call "incr_refs()" in this case.

Could it be that there is some code that is invoked after the tree_or() call. And that code works correctly when it is passed a SEL_TREE which is "shared" and works incorrectly when it is passed a SEL_TREE which is not shared?

Comment by Sergei Petrunia [ 2020-10-02 ]

Exploring this hypothesis further, I find a suspicious place in key_or.
It has this check

  if (key2->use_count) 

and this seems wrong to me: key2 here has use_count=0, but it not the root node of the SEL_ARG graph. SEL_ARG::use_count is valid only for the root nodes.
Elsewhere in this function, one can see that the code checks key2_shared which is the value of SEL_ARG::shared of a root node.

So I make a patch which checks key2_shared instead:

@@ -9597,10 +9600,12 @@ key_or(RANGE_OPT_PARAM *param, SEL_ARG *key1,SEL_ARG *key2)
 
       if (!tmp->next_key_part)
       {
-        if (key2->use_count)
+       SEL_ARG *key2_next= key2->next;
+        //if (key2->use_count)
+        if (key2_shared)
        {
          SEL_ARG *key2_cpy= new SEL_ARG(*key2);
-          if (key2_cpy)
+          if (!key2_cpy)
             return 0;
           key2= key2_cpy;
        }

The whole patch for this MDEV thus becomes : https://gist.github.com/spetrunia/3eb23e92f8ab545d3180b496ce6ad24d

and this fixes the query.

igor any thoughts?

Comment by Igor Babaev [ 2020-10-06 ]

Sergei,

I've accepted all your changes for my original fix.
This is a new patch for 10.2.

Comment by Sergei Petrunia [ 2020-10-06 ]

One more question: https://lists.launchpad.net/maria-developers/msg12398.html

Comment by Sergei Petrunia [ 2020-10-06 ]

Ok to push

Comment by Igor Babaev [ 2020-10-08 ]

A fix for this bug was pushed to 10.2

Generated at Thu Feb 08 09:25:14 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.