MariaDB Server / MDEV-20519

Query plan regression with optimizer_use_condition_selectivity > 1

Details

    Description

      The optimizer can choose very bad execution plans for certain queries with the default optimizer_use_condition_selectivity=4 in MariaDB 10.4. Setting optimizer_use_condition_selectivity=1 allows you to work around the problem.

    Activity

            After adding the ASSERT to table_cond_selectivity, I see some tests failing in the main suite.

            TEST CASE #1

            CREATE TABLE t1 (a varchar(16), b int, PRIMARY KEY(a), KEY(b));
            INSERT INTO t1 VALUES
              ('USAChinese',10), ('USAEnglish',20), ('USAFrench',30);
             
            CREATE TABLE t2 (i int);
            INSERT INTO t2 VALUES
              (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(1),(2),(3),(4);
            ANALYZE TABLE t1, t2;
             
            set use_stat_tables='preferably';
            set optimizer_use_condition_selectivity=3;
             
            EXPLAIN EXTENDED
            SELECT * FROM t1, t2
              WHERE  a <> 'USARussian' AND b IS NULL;
            

            Analysis

            Debugging inside the function calculate_cond_selectivity_for_table

            For Primary Key on column a (the condition is a <> 'USARussian')

            (gdb) p keynr
            $6 = 0
            (gdb) p table->quick_rows[keynr]
            $3 = 4
            (gdb) p table_records
            $4 = 3
            (gdb) p quick_cond_selectivity
            $5 = 1.3333333333333333
            

            Inside this function we have:

            table->cond_selectivity*= quick_cond_selectivity;
            

            (gdb) p table->cond_selectivity
            $22 = 1.3333333333333333
            

            The range-access estimate for index(a) is greater than the total number of
            records in the table. Estimates should always be capped at the table's
            record count, so the correct fix is to make the range optimizer never
            return an estimate larger than the number of records in the table.
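
            A minimal standalone sketch of that cap (the helper name is hypothetical,
            not the actual patch; the numbers are the ones from the gdb session above):

            #include <algorithm>
            #include <cstdio>

            // Hypothetical sketch: clamp the range optimizer's row estimate so that
            // the derived selectivity quick_rows / table_records can never exceed 1.0.
            static double capped_quick_selectivity(double quick_rows, double table_records)
            {
              // The EITS statistics and the range estimate may disagree (here 4 vs. 3),
              // so cap the estimate at the table's record count before dividing.
              double rows= std::min(quick_rows, table_records);
              return table_records > 0 ? rows / table_records : 1.0;
            }

            int main()
            {
              std::printf("uncapped: %f\n", 4.0 / 3.0);                      // 1.333333
              std::printf("capped:   %f\n", capped_quick_selectivity(4, 3)); // 1.000000
              return 0;
            }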

            For Index on column b

            (gdb) p quick_cond_selectivity
            $24 = 0.33333333333333331
            (gdb) p table->quick_rows[keynr]
            $26 = 1
            (gdb) p keynr
            $27 = 1
            

            table->cond_selectivity*= quick_cond_selectivity;
            

            (gdb) p table->cond_selectivity
            $28 = 0.44444444444444442
            

            So this is the total selectivity for the table t1.

            Now set a breakpoint in table_cond_selectivity:

            (gdb) where
            #0  table_cond_selectivity (join=0x7fffe0008658, idx=0, s=0x7fffe00091f8, rem_tables=2) at /home/varunraiko/MariaDB/10.1/sql/sql_select.cc:7647
            #1  0x0000555555a812dd in best_extension_by_limited_search (join=0x7fffe0008658, remaining_tables=3, idx=0, record_count=1, read_time=0, 
                                                                        search_depth=62, prune_level=1, use_cond_selectivity=3) at /home/varunraiko/MariaDB/10.1/sql/sql_select.cc:8008
            

            Inside table_cond_selectivity, the ref access on table t1 uses index b:

            (gdb) p pos->key
            $30 = 1
            (gdb) p table->quick_rows[key]
            $34 = 1
            (gdb) p table->stat_records()
            $35 = 3
            

            Then we discount the selectivity of b IS NULL from the total selectivity of
            table t1:

            sel /= (double)table->quick_rows[key] / (double) table->stat_records();
            

            (gdb) p sel
            $36 = 1.3333333333333333
            

            So after discounting the selectivity of (b IS NULL), we end up with
            sel = 1.333, which is invalid, and this is why we hit the assert.
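
            Condensing the whole chain into a standalone arithmetic sketch (plain C++
            using the numbers above, not server code):

            #include <cstdio>

            int main()
            {
              // calculate_cond_selectivity_for_table, per the gdb session:
              double sel_a= 4.0 / 3.0; // index on a: quick_rows= 4, table_records= 3 (already > 1)
              double sel_b= 1.0 / 3.0; // index on b: quick_rows= 1, table_records= 3
              double cond_selectivity= sel_a * sel_b; // 0.444444, t1's total selectivity

              // table_cond_selectivity then discounts b IS NULL again, because ref
              // access on index b was chosen:
              double sel= cond_selectivity / sel_b;   // 1.333333 > 1, the assert fires
              std::printf("total= %f, after discount= %f\n", cond_selectivity, sel);
              return 0;
            }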

            varun Varun Gupta (Inactive) added a comment

            TEST CASE #2

            CREATE TABLE t1 (
              a INT,
              b INT NOT NULL,
              c char(100),
              KEY (b, c),
              KEY (b, a, c)
            )DEFAULT CHARSET = utf8;
             
            INSERT INTO t1 VALUES
            (1,  1, 1),
            (2,  2, 2),
            (3,  3, 3),
            (4,  4, 4),
            (5,  5, 5),
            (6,  6, 6),
            (7,  7, 7),
            (8,  8, 8),
            (9,  9, 9);
             
            INSERT INTO t1 SELECT a + 10,  b, c FROM t1;
            INSERT INTO t1 SELECT a + 20,  b, c FROM t1;
            INSERT INTO t1 SELECT a + 40,  b, c FROM t1;
            INSERT INTO t1 SELECT a + 80,  b, c FROM t1;
            INSERT INTO t1 SELECT a + 160, b, c FROM t1;
            INSERT INTO t1 SELECT a + 320, b, c FROM t1;
            INSERT INTO t1 SELECT a + 640, b, c FROM t1;
            INSERT INTO t1 SELECT a + 1280, b, c FROM t1 LIMIT 80;
             
            set optimizer_use_condition_selectivity=2;
            EXPLAIN SELECT a FROM t1 WHERE b = 1 ORDER BY c DESC LIMIT 9;
            

            This also crashes when I add the ASSERT that sel is in (0, 1].

            Analysis

            Here I will use the optimizer trace to demonstrate the problem. I run the
            query without the ASSERT so that the estimates are visible in the optimizer
            trace:

               {
                "range_scan_alternatives": [
                  {
                    "index": "b",
                    "ranges": ["(1) <= (b) <= (1)"],
                    "rowid_ordered": false,
                    "using_mrr": false,
                    "index_only": false,
                    "rows": 226,
                    "cost": 409.38,
                    "chosen": false,
                    "cause": "cost"
                  },
                  {
                    "index": "b_2",
                    "ranges": ["(1) <= (b) <= (1)"],
                    "rowid_ordered": false,
                    "using_mrr": false,
                    "index_only": true,
                    "rows": 222,
                    "cost": 135.96,
                    "chosen": true
                  }
                ],
              "chosen_range_access_summary": {
                "range_access_plan": {
                  "type": "range_scan",
                  "index": "b_2",
                  "rows": 222,
                  "ranges": ["(1) <= (b) <= (1)"]
                },
                "rows_for_plan": 222,
                "cost_for_plan": 135.96,
                "chosen": true
              }
            }
            

            So here we see that range access is possible on indexes b and b_2.
            The estimated number of records to read via index b is 226.
            The estimated number of records to read via index b_2 is 222.

            So the selectivity computed for table t1 (from b = 1) is:

            "selectivity_for_indexes": [
              {
                "index_name": "b",
                "selectivity_from_index": 0.1834
              }
            ],
            

            (gdb) p table_records
            $1 = 1232
            

            So index b was used to calculate the selectivity for table t1:
            226 / 1232 ≈ 0.1834.

            Now we go to best_access_path to pick the best access method for table t1;
            here ref access with index b_2 is picked:

            "considered_access_paths": [
            {
              "access_type": "ref",
              "index": "b",
              "used_range_estimates": true,
              "rows": 226,
              "cost": 124,
              "chosen": true
            },
            {
              "access_type": "ref",
              "index": "b_2",
              "used_range_estimates": true,
              "rows": 222,
              "cost": 124.73,
              "chosen": true
            },
            

            Then in table_cond_selectivity we discount the ref-access selectivity of
            index b_2 from the total selectivity.

            The selectivity from index b_2 is 222 / 1232 = 0.18019480:

            (gdb) p key
            $7 = 1
            (gdb) p table->quick_rows[key]
            $8 = 222
            (gdb) p table->stat_records()
            $9 = 1232
            (gdb) p (double)table->quick_rows[key] / (double) table->stat_records()
            $6 = 0.18019480519480519
            

            So here we discount the selectivity for index b_2:

            sel /= (double)table->quick_rows[key] / (double) table->stat_records();
            

            Total selectivity before the discount:

            (gdb) p sel
            $10 = 0.18344155844155843
            

            Total selectivity after the discount:

            (gdb) p sel
            $11 = 1.0180180180180181
            

            This is why we hit the assert.
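
            In other words, the total selectivity was computed from index b's estimate
            (226 rows), but the discount divides by index b_2's estimate (222 rows), so
            the result overshoots 1. A standalone arithmetic sketch (not server code):

            #include <cstdio>

            int main()
            {
              double table_records= 1232.0;
              double sel= 226 / table_records; // selectivity_from_index, taken from index b

              // The discount in table_cond_selectivity uses index b_2's estimate,
              // because ref access was chosen on b_2:
              sel /= 222 / table_records;      // == 226.0 / 222.0 = 1.018018 > 1
              std::printf("sel after discount= %f\n", sel);
              return 0;
            }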

            varun Varun Gupta (Inactive) added a comment
            martin.stepar Martin Štěpař added a comment - Probably duplicates https://jira.mariadb.org/browse/MDEV-20424

            Pushed to 10.1 and removed the assert that was added in MDEV-20576. It will
            be added again with MDEV-20595.

            varun Varun Gupta (Inactive) added a comment

            It appears that "Fix Version/s" is incorrect because of the re-release
            caused by MDEV-20987. The "Fix Version/s" field needs to be incremented by
            one for every release series, so it should be 10.1.44, 10.2.30, 10.3.21,
            10.4.11.

            GeoffMontee Geoff Montee (Inactive) added a comment

            People

              varun Varun Gupta (Inactive)
              GeoffMontee Geoff Montee (Inactive)