[MDEV-25707] SELECT queries not using index after upgrade to 10.5.9 (from 10.3.21) Created: 2021-05-17  Updated: 2022-01-06

Status: Open
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.5.9
Fix Version/s: 10.5

Type: Bug Priority: Major
Reporter: Maria M Pflaum Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: eq_range_index_dive_limit, optimizer-hard
Environment:

RHEL8


Attachments: Zip Archive create_populate_mytable.sql.zip     HTML File explain_mytable     HTML File select_mytable    

 Description   

For queries with a large number of OR'ed equality conditions, EXPLAIN shows "Using where" (a full table scan) instead of using the index. That is, on 10.5.9 the query does a table scan and takes many minutes; on 10.3.x it uses the index and takes seconds.

query shape: SELECT col2 FROM mytable WHERE col1=a OR col1=b OR ... (with many OR'ed equality conditions)

If we force the query to use the index with FORCE INDEX(mytable_col1_idx), then the 10.5.9 query finishes in a time similar to 10.3.x.

See attached files to reproduce.
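A minimal sketch of the query shape and the FORCE INDEX workaround (the constants here are illustrative; the real attachment contains thousands of OR'ed values):

```sql
-- Slow on 10.5.9: EXPLAIN shows type=ALL, Extra="Using where"
SELECT col2 FROM mytable
WHERE col1=28775 OR col1=28776 OR col1=28777 /* ... thousands more ... */;

-- With the index forced, 10.5.9 finishes in a time similar to 10.3.x
SELECT col2 FROM mytable FORCE INDEX(mytable_col1_idx)
WHERE col1=28775 OR col1=28776 OR col1=28777 /* ... */;
```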



 Comments   
Comment by Sergei Petrunia [ 2021-05-31 ]

I am trying on current 10.5 (10.5.11) and I see this plan:

+------+-------------+---------+-------+------------------+------------------+---------+------+--------+-----------------------+
| id   | select_type | table   | type  | possible_keys    | key              | key_len | ref  | rows   | Extra                 |
+------+-------------+---------+-------+------------------+------------------+---------+------+--------+-----------------------+
|    1 | SIMPLE      | mytable | range | mytable_col1_idx | mytable_col1_idx | 3       | NULL | 101358 | Using index condition |
+------+-------------+---------+-------+------------------+------------------+---------+------+--------+-----------------------+

I get the same on 10.5.9.

Comment by Sergei Petrunia [ 2021-05-31 ]

Table def

 CREATE TABLE `mytable` (
  `col1` mediumint(9) NOT NULL DEFAULT 0,
  `col2` mediumint(9) NOT NULL DEFAULT 0,
  KEY `mytable_col1_idx` (`col1`),
  KEY `mytable_col2_idx` (`col2`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

The table in the test dataset has 1.7M rows; the query is an OR of 5661 conditions of the form col1=NNN OR ....

Comment by Sergei Petrunia [ 2021-05-31 ]

The number of ranges exceeds eq_range_index_dive_limit, so the estimate of the number of values in one range is taken from the index statistics. Could ANALYZE TABLE help here?
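A sketch of how one would check the variable and refresh the statistics (standard MariaDB statements; whether ANALYZE actually helps here is the open question above):

```sql
-- See the current threshold for switching from index dives to statistics
SHOW VARIABLES LIKE 'eq_range_index_dive_limit';

-- Rebuild index statistics (Cardinality) for the table
ANALYZE TABLE mytable;
```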

Comment by Sergei Petrunia [ 2021-05-31 ]

Ok, I was able to reproduce.

+------+-------------+---------+------+------------------+------+---------+------+---------+-------------+
| id   | select_type | table   | type | possible_keys    | key  | key_len | ref  | rows    | Extra       |
+------+-------------+---------+------+------------------+------+---------+------+---------+-------------+
|    1 | SIMPLE      | mytable | ALL  | mytable_col1_idx | NULL | NULL    | NULL | 1761063 | Using where |
+------+-------------+---------+------+------------------+------+---------+------+---------+-------------+

Relevant portion of Optimizer trace:

    {
      "rows_estimation": [
        {
          "table": "mytable",
          "range_analysis": {
            "table_scan": {
              "rows": 1761063,
              "cost": 355226.2292
            },
 
            "analyzing_range_alternatives": {
              "range_scan_alternatives": [
                {
                  "index": "mytable_col1_idx",
                  "ranges": [
+--5631 lines: "(28775) <= (col1) <= (28775)",---------
                  ],
                  "rowid_ordered": false,
                  "using_mrr": false,
                  "index_only": false,
                  "rows": 416694,
                  "cost": 506571.1092,
                  "chosen": false,
                  "cause": "cost"
                }

That is, range access is not picked because its estimated cost (506571) is thought to be higher than that of a full table scan (355226).

The number of rows it expects is 416694, which gives about 74 rows per lookup key.
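The arithmetic, assuming the 5631 equality ranges shown in the folded trace output above:

```sql
-- expected rows / number of equality ranges = rows per lookup key
SELECT 416694 / 5631;  -- = 74
```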

Comment by Sergei Petrunia [ 2021-05-31 ]

Index statistics:

MariaDB [test1]> show keys from mytable\G
*************************** 1. row ***************************
        Table: mytable
   Non_unique: 1
     Key_name: mytable_col1_idx
 Seq_in_index: 1
  Column_name: col1
    Collation: A
  Cardinality: 23798
     Sub_part: NULL
       Packed: NULL
         Null: 
   Index_type: BTREE
      Comment: 
Index_comment: 

Comment by Sergei Petrunia [ 2021-05-31 ]

The estimate of 74 rows per lookup key agrees with the real dataset.
The Cardinality in the index statistics (23798) agrees with the actual cardinality value of 23765.
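The per-key estimate follows directly from the index statistics shown above:

```sql
-- total rows / distinct col1 values (Cardinality) = average rows per key
SELECT 1761063 / 23798;  -- ≈ 74
```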

However, the query itself returns only 107 rows.

Comment by Sergei Petrunia [ 2021-05-31 ]

Suggested workaround: set eq_range_index_dive_limit to a value higher than the number of elements in the IN-list (5641), or set it to 0, which was the default before MariaDB 10.4.2.
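For example, at the session level (the variable can also be set globally or in the server configuration; 10000 is just an arbitrary value above the number of equality ranges):

```sql
-- Make the limit higher than the number of OR'ed equalities
SET SESSION eq_range_index_dive_limit = 10000;

-- Or restore the pre-10.4.2 behaviour of always doing index dives
SET SESSION eq_range_index_dive_limit = 0;
```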

Comment by Sergei Petrunia [ 2021-06-04 ]

Note: MySQL 8 suffers from the same issue with InnoDB, except when one runs the query right after loading the data; once ANALYZE TABLE has been run, the bad query plan is chosen. PostgreSQL picks a good query plan, but it switches to a poor one if the query has more "col1=const" disjuncts.

Comment by Sergei Petrunia [ 2021-06-09 ]

Notes from the optimizer call:
One option is to estimate a few ranges with records_in_range calls and then extrapolate the result to all ranges.
This won't be 100% bulletproof, though (one can imagine cases where the optimizer would still pick the wrong plan).
Also, it would add quite a bit of complexity. Do we really need to add it to handle this case? Could it be that this is a niche case that's not worth it?

Comment by Sergei Petrunia [ 2021-06-24 ]

The cause

This bug is a consequence of the eq_range_index_dive_limit optimization (let's denote it EQ-OPT).

The goal of EQ-OPT is to speed up query optimization (note: optimization, not execution) for queries that scan many equality ranges, by using index-statistics estimates (a table-wide average: cheap to use but imprecise) instead of the so-called "records-in-range" estimates (more precise but expensive to obtain).

Using imprecise estimates can cause the optimizer to pick a bad query plan. This happens when reality is far from the estimate.
Your query is an example of this: on average, a single col1=value matches about 70 rows, but the query scans "rare" values, where 5641 values produce only 107 matching rows in total.
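One way to see such skew in the data (a hypothetical diagnostic, not part of the original report):

```sql
-- Compare the most common col1 values against the ~70-row table-wide average;
-- values far from the average are where EQ-OPT's estimate goes wrong
SELECT col1, COUNT(*) AS cnt
FROM mytable
GROUP BY col1
ORDER BY cnt DESC
LIMIT 10;
```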

The impact

The optimizer will use bad query plans for workloads that scan the "outliers" - values that are much more or much less common than others.

The workaround

Setting eq_range_index_dive_limit to a higher value (or to 0) will cause the optimizer to spend more time on range analysis but construct better query plans.

Generated at Thu Feb 08 09:39:44 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.