[MDEV-28444] Wrong query plan using partition by range of date - Jira

XML

Word

Printable

Details

Type: Bug
Status: Open (View Workflow)
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.7.1
Fix Version/s: None
Component/s: None
Labels:
None

Description

For a table partitioned by day with PK( date , point) to respect primary key ordering

CREATE TABLE `monde_gfs` (

  `date_heure` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',

  `num_point` int(11) NOT NULL DEFAULT 0,

  PRIMARY KEY (`date_heure`,`num_point`),

  KEY `num_point` (`num_point`)

) ENGINE=InnoDB DEFAULT CHARSET=latin1

 PARTITION BY RANGE (to_days(`date_heure`))

(PARTITION `p2022_04_27` VALUES LESS THAN (738638) ENGINE = InnoDB,

 PARTITION `p2022_04_28` VALUES LESS THAN (738639) ENGINE = InnoDB,

 PARTITION `p2022_04_29` VALUES LESS THAN (738640) ENGINE = InnoDB

...

The optimizer wrongly estimate the cost of the range ref access 2 vs 24M where as the 2 partions contains around 50M rows

 analyze format=json SELECT * FROM  monde_gfs m0_ WHERE m0_.num_point = 237609 AND m0_.date_heure >=

 '2022-04-29 00:00:00' AND m0_.date_heure <= '2022-04-30 00:00:00';

 "query_block": {

  "select_id": 1,

  "r_loops": 1,

  "r_total_time_ms": 46326.7725,

  "table": {

   "table_name": "m0_",

   "partitions": ["p2022_04_29", "p2022_04_30"],

   "access_type": "range",

   "possible_keys": ["PRIMARY", "num_point"],

   "key": "PRIMARY",

   "key_length": "9",

   "used_key_parts": ["date_heure", "num_point"],

   "r_loops": 1,

   "rows": 2,

   "r_rows": 24917761,

   "filtered": 100,

   "r_filtered": 1.0033e-4,

   "attached_condition": "m0_.num_point = 237609 and m0_.date_heure >= '2022-04-29 00:00:00' and m0_.date_heure <= '2022-04-30 00:00:00'"

At the same time the partition pruning should already filtering the date_heure range and a score of 2 is under estimate and should be more estimated to the number records of each pruned partition because it's a range full index scan

As the result of this miss estimate the correct plan is ignored

analyze format=json SELECT * FROM  monde_gfs m0_ force index (num_point) WHERE m0_.num_point = 237609 AND m0_.date_heure >=

 '2022-04-29 00:00:00' AND m0_.date_heure <= '2022-04-30 00:00:00';

  "query_block": {

    "select_id": 1,

    "r_loops": 1,

    "r_total_time_ms": 48.77394453,

    "table": {

      "table_name": "m0_",

      "partitions": ["p2022_04_29", "p2022_04_30"],

      "access_type": "ref",

      "possible_keys": ["num_point"],

      "key": "num_point",

      "key_length": "4",

      "used_key_parts": ["num_point"],

      "ref": ["const"],

      "r_loops": 1,

      "rows": 2,

      "r_rows": 48,

      "r_table_time_ms": 48.6800073,

      "r_other_time_ms": 0.056539477,

      "filtered": 100,

      "r_filtered": 52.08333333,

      "attached_condition": "m0_.date_heure >= '2022-04-29 00:00:00' and m0_.date_heure <= '2022-04-30 00:00:00'"

} |

We have observed this regression on pre production while production in 10.1 not affected and also the issue looks like beeing trigger at some point hitting 100% of the CPU with such queries

Attachments

Activity

People

Assignee:: Unassigned

Reporter:: VAROQUI Stephane

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Dates

Created:: 2022-04-29 09:20

Updated:: 2022-04-29 09:47

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.