MariaDB Server / MDEV-20519

Query plan regression with optimizer_use_condition_selectivity > 1

Details

    Description

      The optimizer can choose very bad execution plans for certain queries with the default optimizer_use_condition_selectivity=4 in MariaDB 10.4. Setting optimizer_use_condition_selectivity=1 allows you to work around the problem.

    Activity

            After adding the ASSERT to table_cond_selectivity, I see some tests failing in the main suite.

            TEST CASE #1

            CREATE TABLE t1 (a varchar(16), b int, PRIMARY KEY(a), KEY(b));
            INSERT INTO t1 VALUES
              ('USAChinese',10), ('USAEnglish',20), ('USAFrench',30);
             
            CREATE TABLE t2 (i int);
            INSERT INTO t2 VALUES
              (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(1),(2),(3),(4);
            ANALYZE TABLE t1, t2;
             
            set use_stat_tables='preferably';
            set optimizer_use_condition_selectivity=3;
             
            EXPLAIN EXTENDED
            SELECT * FROM t1, t2
              WHERE  a <> 'USARussian' AND b IS NULL;
            

            Analysis

            Debugging inside the function calculate_cond_selectivity_for_table

            For Primary Key on column a (the condition is a <> 'USARussian')

            (gdb) p keynr
            $6 = 0
            (gdb) p table->quick_rows[keynr]
            $3 = 4
            (gdb) p table_records
            $4 = 3
            (gdb) p quick_cond_selectivity
            $5 = 1.3333333333333333
            

            Inside this function we have:

            table->cond_selectivity*= quick_cond_selectivity;
            

            (gdb) p table->cond_selectivity
            $22 = 1.3333333333333333
            

            The range-access estimate for index(a) is greater than the total number of
            records in the table. Estimates should always be capped at the table's
            record count, so the correct fix is to make the range optimizer never
            return an estimate larger than the number of records in the table.
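
            A minimal standalone sketch of that cap (the helper name is hypothetical,
            not the actual patch; the numbers are the ones from the gdb session above):

            #include <algorithm>
            #include <cstdio>

            // Hypothetical sketch: clamp the range optimizer's row estimate so that
            // the derived selectivity quick_rows / table_records can never exceed 1.0.
            static double capped_quick_selectivity(double quick_rows, double table_records)
            {
              // The EITS statistics and the range estimate may disagree (here 4 vs. 3),
              // so cap the estimate at the table's record count before dividing.
              double rows= std::min(quick_rows, table_records);
              return table_records > 0 ? rows / table_records : 1.0;
            }

            int main()
            {
              std::printf("uncapped: %f\n", 4.0 / 3.0);                      // 1.333333
              std::printf("capped:   %f\n", capped_quick_selectivity(4, 3)); // 1.000000
              return 0;
            }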

            For Index on column b

            (gdb) p quick_cond_selectivity
            $24 = 0.33333333333333331
            (gdb) p table->quick_rows[keynr]
            $26 = 1
            (gdb) p keynr
            $27 = 1
            

            table->cond_selectivity*= quick_cond_selectivity;
            

            (gdb) p table->cond_selectivity
            $28 = 0.44444444444444442
            

            So this is the total selectivity for the table t1.

            Now set a breakpoint in table_cond_selectivity:

            (gdb) where
            #0  table_cond_selectivity (join=0x7fffe0008658, idx=0, s=0x7fffe00091f8, rem_tables=2) at /home/varunraiko/MariaDB/10.1/sql/sql_select.cc:7647
            #1  0x0000555555a812dd in best_extension_by_limited_search (join=0x7fffe0008658, remaining_tables=3, idx=0, record_count=1, read_time=0, 
                                                                        search_depth=62, prune_level=1, use_cond_selectivity=3) at /home/varunraiko/MariaDB/10.1/sql/sql_select.cc:8008
            

            Inside table_cond_selectivity, the ref access on table t1 uses index b:

            (gdb) p pos->key
            $30 = 1
            (gdb) p table->quick_rows[key]
            $34 = 1
            (gdb) p table->stat_records()
            $35 = 3
            

            Then we discount the selectivity of b IS NULL from the total selectivity of
            table t1:

            sel /= (double)table->quick_rows[key] / (double) table->stat_records();
            

            (gdb) p sel
            $36 = 1.3333333333333333
            

            So after discounting the selectivity of (b IS NULL), we end up with
            sel = 1.333, which is invalid, and this is why we hit the assert.
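
            Condensing the whole chain into a standalone arithmetic sketch (plain C++
            using the numbers above, not server code):

            #include <cstdio>

            int main()
            {
              // calculate_cond_selectivity_for_table, per the gdb session:
              double sel_a= 4.0 / 3.0; // index on a: quick_rows= 4, table_records= 3 (already > 1)
              double sel_b= 1.0 / 3.0; // index on b: quick_rows= 1, table_records= 3
              double cond_selectivity= sel_a * sel_b; // 0.444444, t1's total selectivity

              // table_cond_selectivity then discounts b IS NULL again, because ref
              // access on index b was chosen:
              double sel= cond_selectivity / sel_b;   // 1.333333 > 1, the assert fires
              std::printf("total= %f, after discount= %f\n", cond_selectivity, sel);
              return 0;
            }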

            varun Varun Gupta (Inactive) added a comment

            TEST CASE #2

            CREATE TABLE t1 (
              a INT,
              b INT NOT NULL,
              c char(100),
              KEY (b, c),
              KEY (b, a, c)
            )DEFAULT CHARSET = utf8;
             
            INSERT INTO t1 VALUES
            (1,  1, 1),
            (2,  2, 2),
            (3,  3, 3),
            (4,  4, 4),
            (5,  5, 5),
            (6,  6, 6),
            (7,  7, 7),
            (8,  8, 8),
            (9,  9, 9);
             
            INSERT INTO t1 SELECT a + 10,  b, c FROM t1;
            INSERT INTO t1 SELECT a + 20,  b, c FROM t1;
            INSERT INTO t1 SELECT a + 40,  b, c FROM t1;
            INSERT INTO t1 SELECT a + 80,  b, c FROM t1;
            INSERT INTO t1 SELECT a + 160, b, c FROM t1;
            INSERT INTO t1 SELECT a + 320, b, c FROM t1;
            INSERT INTO t1 SELECT a + 640, b, c FROM t1;
            INSERT INTO t1 SELECT a + 1280, b, c FROM t1 LIMIT 80;
             
            set optimizer_use_condition_selectivity=2;
            EXPLAIN SELECT a FROM t1 WHERE b = 1 ORDER BY c DESC LIMIT 9;
            

            This also crashes when I add the ASSERT that sel is in (0, 1].

            Analysis

            Here I will use the optimizer trace to demonstrate the problem. I run the
            query without the ASSERT so that the estimates are visible in the optimizer
            trace:

               {
                "range_scan_alternatives": [
                  {
                    "index": "b",
                    "ranges": ["(1) <= (b) <= (1)"],
                    "rowid_ordered": false,
                    "using_mrr": false,
                    "index_only": false,
                    "rows": 226,
                    "cost": 409.38,
                    "chosen": false,
                    "cause": "cost"
                  },
                  {
                    "index": "b_2",
                    "ranges": ["(1) <= (b) <= (1)"],
                    "rowid_ordered": false,
                    "using_mrr": false,
                    "index_only": true,
                    "rows": 222,
                    "cost": 135.96,
                    "chosen": true
                  }
                ],
              "chosen_range_access_summary": {
                "range_access_plan": {
                  "type": "range_scan",
                  "index": "b_2",
                  "rows": 222,
                  "ranges": ["(1) <= (b) <= (1)"]
                },
                "rows_for_plan": 222,
                "cost_for_plan": 135.96,
                "chosen": true
              }
            }
            

            So here we see that range access is possible on indexes b and b_2.
            The estimated number of records to read via index b is 226.
            The estimated number of records to read via index b_2 is 222.

            So the selectivity computed for table t1 (from b = 1) is:

            "selectivity_for_indexes": [
              {
                "index_name": "b",
                "selectivity_from_index": 0.1834
              }
            ],
            

            (gdb) p table_records
            $1 = 1232
            

            So index b was used to calculate the selectivity for table t1:
            226 / 1232 ≈ 0.1834.

            Now we go to best_access_path to pick the best access method for table t1;
            here ref access with index b_2 is picked:

            "considered_access_paths": [
            {
              "access_type": "ref",
              "index": "b",
              "used_range_estimates": true,
              "rows": 226,
              "cost": 124,
              "chosen": true
            },
            {
              "access_type": "ref",
              "index": "b_2",
              "used_range_estimates": true,
              "rows": 222,
              "cost": 124.73,
              "chosen": true
            },
            

            Then in table_cond_selectivity we discount the ref-access selectivity of
            index b_2 from the total selectivity.

            The selectivity from index b_2 is 222 / 1232 = 0.18019480:

            (gdb) p key
            $7 = 1
            (gdb) p table->quick_rows[key]
            $8 = 222
            (gdb) p table->stat_records()
            $9 = 1232
            (gdb) p (double)table->quick_rows[key] / (double) table->stat_records()
            $6 = 0.18019480519480519
            

            So here we discount the selectivity for index b_2:

            sel /= (double)table->quick_rows[key] / (double) table->stat_records();
            

            Total selectivity before the discount:

            (gdb) p sel
            $10 = 0.18344155844155843
            

            Total selectivity after the discount:

            (gdb) p sel
            $11 = 1.0180180180180181
            

            This is why we hit the assert.
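
            In other words, the total selectivity was computed from index b's estimate
            (226 rows), but the discount divides by index b_2's estimate (222 rows), so
            the result overshoots 1. A standalone arithmetic sketch (not server code):

            #include <cstdio>

            int main()
            {
              double table_records= 1232.0;
              double sel= 226 / table_records; // selectivity_from_index, taken from index b

              // The discount in table_cond_selectivity uses index b_2's estimate,
              // because ref access was chosen on b_2:
              sel /= 222 / table_records;      // == 226.0 / 222.0 = 1.018018 > 1
              std::printf("sel after discount= %f\n", sel);
              return 0;
            }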

            varun Varun Gupta (Inactive) added a comment
            martin.stepar Martin Štěpař added a comment - Probably duplicates https://jira.mariadb.org/browse/MDEV-20424

            Pushed to 10.1 and removed the assert that was added in MDEV-20576. It will
            be added again with MDEV-20595.

            varun Varun Gupta (Inactive) added a comment

            It appears that "Fix Version/s" is incorrect because of the re-release
            caused by MDEV-20987. The "Fix Version/s" field needs to be incremented by
            one for every release series, so it should be 10.1.44, 10.2.30, 10.3.21,
            10.4.11.

            GeoffMontee Geoff Montee (Inactive) added a comment

            People

              varun Varun Gupta (Inactive)
              GeoffMontee Geoff Montee (Inactive)