[MDEV-32646] Query planner doesn't use the full index for a query filtering on indexed columns Created: 2023-11-01  Updated: 2023-11-01

Status: Open
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.3.39, 11.1.2
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Ivan Krylov Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Debian Buster



 Description   

Originally discussed at MariaDB discuss.

For the following table:

CREATE TABLE `mol_trans` (
  `species_id` int(11) DEFAULT NULL,
  `wl_vac` double DEFAULT NULL,
  `upper_id` int(11) DEFAULT NULL,
  `lower_id` int(11) DEFAULT NULL,
  `prob` double DEFAULT NULL,
  `flag` tinyint(4) DEFAULT NULL, -- almost always 1
  KEY `spid_flag_wl` (`species_id`,`flag`,`wl_vac`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci
 PARTITION BY LIST (`species_id`)
(PARTITION `CaO` VALUES IN (6115) ENGINE = InnoDB,
 PARTITION `CN3` VALUES IN (6121) ENGINE = InnoDB,
 PARTITION `CN2` VALUES IN (6119) ENGINE = InnoDB,
 PARTITION `AlO` VALUES IN (6109) ENGINE = InnoDB)

and the following query:

select
  mtr.prob,
  mtr.lower_id,
  mtr.upper_id
from
  mol_trans mtr
where (
  mtr.species_id=6115
  and mtr.wl_vac > 766.0
  and mtr.wl_vac < 883.0
  and mtr.flag = 1
)
order by mtr.wl_vac;

I get a query plan that doesn't fully use the spid_flag_wl index. On 11.1.2, it's not using an index at all:

           id: 1
  select_type: SIMPLE
        table: mtr
         type: ALL
possible_keys: spid_flag_wl
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 26559953
        Extra: Using where; Using filesort

and on 10.3.39, it's using a prefix of the index:

           id: 1
  select_type: SIMPLE
        table: mtr
         type: ref
possible_keys: spid_flag_wl
          key: spid_flag_wl
      key_len: 7
          ref: const,const
         rows: 14158123
        Extra: Using where

It's possible to make the query optimiser use the index with FORCE INDEX(spid_flag_wl), after which the query speeds up from 90 seconds to 11 seconds to select 3 million rows out of 40 million. No amount of ANALYZE TABLE lets the optimiser use the full index automatically. I can share the 2-gigabyte dump (the table is constructed from publicly-available ExoMol files) or perform additional experiments myself if needed.


Generated at Thu Feb 08 10:32:55 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.