Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-18355

Switching the results for plans when there was query plan change with the new optimizer defaults

Details

    • Task
    • Status: Stalled (View Workflow)
    • Major
    • Resolution: Unresolved
    • None
    • Optimizer
    • None

    Description

      The tests that need to be considered are

      subselect_sj_mat.test
      subselect_sj2_mat.test
      subselect_no_semijoin.test
      subselect_no_scache.test
      subselect_no_opts.test
      subselect_no_exists_to_in.test
      subselect.test
      stat_tables_innodb.test
      stat_tables.test
      range_mrr_icp.test
      range.test
      join_outer_innodb.test
      join_cache.test
      invisible_field.test
      group_by.test
      derived_cond_pushdown.test
      

      Attachments

        Issue Links

          Activity

            varun Varun Gupta (Inactive) added a comment - - edited

            CREATE TABLE t1 (a INT, KEY(a)) ENGINE=MyISAM;
            INSERT INTO t1 VALUES (8),(0);
            CREATE TABLE t2 (b INT, c VARCHAR(1)) ENGINE=MyISAM;
            INSERT INTO t2 VALUES (4,'j'),(6,'v');
            CREATE TABLE t3 (d VARCHAR(1)) ENGINE=MyISAM;
            INSERT INTO t3 VALUES ('b'),('c');
            

            MariaDB [test]> analyze
                -> SELECT * FROM t1
                -> WHERE a = (SELECT MAX(b) FROM t2 WHERE c IN (SELECT MAX(d) FROM t3)) OR a = 10;
            +------+--------------+-------------+--------+---------------+--------------+---------+-----------+------+--------+----------+------------+--------------------------+
            | id   | select_type  | table       | type   | possible_keys | key          | key_len | ref       | rows | r_rows | filtered | r_filtered | Extra                    |
            +------+--------------+-------------+--------+---------------+--------------+---------+-----------+------+--------+----------+------------+--------------------------+
            |    1 | PRIMARY      | t1          | index  | a             | a            | 5       | NULL      |    2 |   2.00 |   100.00 |       0.00 | Using where; Using index |
            |    2 | SUBQUERY     | t2          | ALL    | NULL          | NULL         | NULL    | NULL      |    2 |   2.00 |   100.00 |     100.00 | Using where              |
            |    2 | SUBQUERY     | <subquery3> | eq_ref | distinct_key  | distinct_key | 4       | test.t2.c |    1 |   0.00 |   100.00 |     100.00 |                          |
            |    3 | MATERIALIZED | t3          | ALL    | NULL          | NULL         | NULL    | NULL      |    2 |   2.00 |   100.00 |     100.00 |                          |
            +------+--------------+-------------+--------+---------------+--------------+---------+-----------+------+--------+----------+------------+--------------------------+
            

            MariaDB [test]> set optimizer_use_condition_selectivity=1;
            Query OK, 0 rows affected (0.000 sec)

            MariaDB [test]> analyze SELECT * FROM t1 WHERE a = (SELECT MAX(b) FROM t2 WHERE c IN (SELECT MAX(d) FROM t3)) OR a = 10;
            +------+--------------+-------------+-------+---------------+------+---------+------+------+--------+----------+------------+-------------------------------------------------+
            | id   | select_type  | table       | type  | possible_keys | key  | key_len | ref  | rows | r_rows | filtered | r_filtered | Extra                                           |
            +------+--------------+-------------+-------+---------------+------+---------+------+------+--------+----------+------------+-------------------------------------------------+
            |    1 | PRIMARY      | t1          | index | a             | a    | 5       | NULL |    2 |   2.00 |   100.00 |       0.00 | Using where; Using index                        |
            |    2 | SUBQUERY     | <subquery3> | ALL   | distinct_key  | NULL | NULL    | NULL |    1 |   1.00 |   100.00 |     100.00 |                                                 |
            |    2 | SUBQUERY     | t2          | ALL   | NULL          | NULL | NULL    | NULL |    2 |   2.00 |   100.00 |       0.00 | Using where; Using join buffer (flat, BNL join) |
            |    3 | MATERIALIZED | t3          | ALL   | NULL          | NULL | NULL    | NULL |    2 |   2.00 |   100.00 |     100.00 |                                                 |
            +------+--------------+-------------+-------+---------------+------+---------+------+------+--------+----------+------------+-------------------------------------------------+
            

            varun Varun Gupta (Inactive) added a comment - - edited CREATE TABLE t1 (a INT , KEY (a)) ENGINE=MyISAM; INSERT INTO t1 VALUES (8),(0); CREATE TABLE t2 (b INT , c VARCHAR (1)) ENGINE=MyISAM; INSERT INTO t2 VALUES (4, 'j' ),(6, 'v' ); CREATE TABLE t3 (d VARCHAR (1)) ENGINE=MyISAM; INSERT INTO t3 VALUES ( 'b' ),( 'c' ); MariaDB [test]> analyze -> SELECT * FROM t1 -> WHERE a = (SELECT MAX(b) FROM t2 WHERE c IN (SELECT MAX(d) FROM t3)) OR a = 10; +------+--------------+-------------+--------+---------------+--------------+---------+-----------+------+--------+----------+------------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra | +------+--------------+-------------+--------+---------------+--------------+---------+-----------+------+--------+----------+------------+--------------------------+ | 1 | PRIMARY | t1 | index | a | a | 5 | NULL | 2 | 2.00 | 100.00 | 0.00 | Using where; Using index | | 2 | SUBQUERY | t2 | ALL | NULL | NULL | NULL | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where | | 2 | SUBQUERY | <subquery3> | eq_ref | distinct_key | distinct_key | 4 | test.t2.c | 1 | 0.00 | 100.00 | 100.00 | | | 3 | MATERIALIZED | t3 | ALL | NULL | NULL | NULL | NULL | 2 | 2.00 | 100.00 | 100.00 | | +------+--------------+-------------+--------+---------------+--------------+---------+-----------+------+--------+----------+------------+--------------------------+ MariaDB [test] > set optimizer_use_condition_selectivity=1; Query OK, 0 rows affected (0.000 sec) MariaDB [test]> analyze SELECT * FROM t1 WHERE a = (SELECT MAX(b) FROM t2 WHERE c IN (SELECT MAX(d) FROM t3)) OR a = 10; +------+--------------+-------------+-------+---------------+------+---------+------+------+--------+----------+------------+-------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra | +------+--------------+-------------+-------+---------------+------+---------+------+------+--------+----------+------------+-------------------------------------------------+ | 1 | PRIMARY | t1 | index | a | a | 5 | NULL | 2 | 2.00 | 100.00 | 0.00 | Using where; Using index | | 2 | SUBQUERY | <subquery3> | ALL | distinct_key | NULL | NULL | NULL | 1 | 1.00 | 100.00 | 100.00 | | | 2 | SUBQUERY | t2 | ALL | NULL | NULL | NULL | NULL | 2 | 2.00 | 100.00 | 0.00 | Using where; Using join buffer (flat, BNL join) | | 3 | MATERIALIZED | t3 | ALL | NULL | NULL | NULL | NULL | 2 | 2.00 | 100.00 | 100.00 | | +------+--------------+-------------+-------+---------------+------+---------+------+------+--------+----------+------------+-------------------------------------------------+

            The difference for the two cases is:

            Case 1: optimizer_use_condition_selectivity=4;

                                  "plan_prefix": [],
                                  "table": "<subquery3>",
                                  "best_access_path": {
                                    "considered_access_paths": [
                                      {
                                        "access_type": "scan",
                                        "resulting_rows": 2,
                                        "cost": 4.2105,
                                        "chosen": true
                                      }
                                    ]
                                  },
            

            resulting rows=2 in the case when we check the cost for the plan <subquery3>,t2

            Case 2: optimizer_use_condition_selectivity=1;

                                {
                                  "plan_prefix": [],
                                  "table": "<subquery3>",
                                  "best_access_path": {
                                    "considered_access_paths": [
                                      {
                                        "access_type": "scan",
                                        "resulting_rows": 1,
                                        "cost": 4.4105,
                                        "chosen": true
                                      }
                                    ]
                                  },
            

            resulting rows= 1 in this case for the plan <subquery3>,t2

            varun Varun Gupta (Inactive) added a comment - The difference for the two cases is: Case 1: optimizer_use_condition_selectivity=4; "plan_prefix": [], "table": "<subquery3>", "best_access_path": { "considered_access_paths": [ { "access_type": "scan", "resulting_rows": 2, "cost": 4.2105, "chosen": true } ] }, resulting rows=2 in the case when we check the cost for the plan <subquery3>,t2 Case 2: optimizer_use_condition_selectivity=1; { "plan_prefix": [], "table": "<subquery3>", "best_access_path": { "considered_access_paths": [ { "access_type": "scan", "resulting_rows": 1, "cost": 4.4105, "chosen": true } ] }, resulting rows= 1 in this case for the plan <subquery3>,t2

            The above observation is interesting (and should probably be investigated as a separate MDEV- why do we get 2 as an estimate for #rows in a query with implicit grouping?)

            But let's get back to the main question:

            Did we lose some test coverage because the tests are now executed with different defaults

            The testcase above is a testcase for MDEV-435. Looking at the investigation portion of that MDEV, I see that

            • the query plan in the subquery is irrelevant.
            • most likely, the testcase became irrelevant when EXPLAIN FORMAT=JSON/ANALYZE code changed the way EXPLAIN output is produced.
            psergei Sergei Petrunia added a comment - The above observation is interesting (and should probably be investigated as a separate MDEV- why do we get 2 as an estimate for #rows in a query with implicit grouping?) But let's get back to the main question: Did we lose some test coverage because the tests are now executed with different defaults The testcase above is a testcase for MDEV-435 . Looking at the investigation portion of that MDEV, I see that the query plan in the subquery is irrelevant. most likely, the testcase became irrelevant when EXPLAIN FORMAT=JSON/ANALYZE code changed the way EXPLAIN output is produced.
            psergei Sergei Petrunia added a comment - - edited

            varun, thanks for the attached file.
            Note that it can be further reduced: I spotted differences like

            @@ -1672,7 +1672,7 @@ WHERE City.Country=Country.Code AND
             Country.Name LIKE 'L%' AND City.Population > 100000;
             id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
             1	SIMPLE	Country	range	PRIMARY,Name	Name	52	NULL	10	Using index condition; Rowid-ordered scan
            -1	SIMPLE	City	ref	Population,Country	Country	3	world.Country.Code	18	Using where; Using join buffer (flat, BKAH join); Key-ordered Rowid-ordered scan
            +1	SIMPLE	City	ref	Population,Country	Country	3	world.Country.Code	17	Using where; Using join buffer (flat, BKAH join); Key-ordered Rowid-ordered scan
            

            where the only difference is #rows.

            psergei Sergei Petrunia added a comment - - edited varun , thanks for the attached file. Note that it can be further reduced: I spotted differences like @@ -1672,7 +1672,7 @@ WHERE City.Country=Country.Code AND Country.Name LIKE 'L%' AND City.Population > 100000; id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Rowid-ordered scan -1 SIMPLE City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer (flat, BKAH join); Key-ordered Rowid-ordered scan +1 SIMPLE City ref Population,Country Country 3 world.Country.Code 17 Using where; Using join buffer (flat, BKAH join); Key-ordered Rowid-ordered scan where the only difference is #rows.
            varun Varun Gupta (Inactive) added a comment - Patch http://lists.askmonty.org/pipermail/commits/2019-February/013385.html

            People

              psergei Sergei Petrunia
              varun Varun Gupta (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.