[MDEV-18355] Switching the results for plans when there was query plan change with the new optimizer defaults Created: 2019-01-23  Updated: 2021-06-28

Status: Stalled
Project: MariaDB Server
Component/s: Optimizer
Fix Version/s: None

Type: Task Priority: Major
Reporter: Varun Gupta (Inactive) Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: None

Attachments: File results.diff    
Issue Links:
Relates
relates to MDEV-15253 Default optimizer setting changes for... Closed

 Description   

The tests that need to be considered are

subselect_sj_mat.test
subselect_sj2_mat.test
subselect_no_semijoin.test
subselect_no_scache.test
subselect_no_opts.test
subselect_no_exists_to_in.test
subselect.test
stat_tables_innodb.test
stat_tables.test
range_mrr_icp.test
range.test
join_outer_innodb.test
join_cache.test
invisible_field.test
group_by.test
derived_cond_pushdown.test



 Comments   
Comment by Varun Gupta (Inactive) [ 2019-01-23 ]

CREATE TABLE t1 (a INT, KEY(a)) ENGINE=MyISAM;
INSERT INTO t1 VALUES (8),(0);
CREATE TABLE t2 (b INT, c VARCHAR(1)) ENGINE=MyISAM;
INSERT INTO t2 VALUES (4,'j'),(6,'v');
CREATE TABLE t3 (d VARCHAR(1)) ENGINE=MyISAM;
INSERT INTO t3 VALUES ('b'),('c');

MariaDB [test]> analyze
    -> SELECT * FROM t1
    -> WHERE a = (SELECT MAX(b) FROM t2 WHERE c IN (SELECT MAX(d) FROM t3)) OR a = 10;
+------+--------------+-------------+--------+---------------+--------------+---------+-----------+------+--------+----------+------------+--------------------------+
| id   | select_type  | table       | type   | possible_keys | key          | key_len | ref       | rows | r_rows | filtered | r_filtered | Extra                    |
+------+--------------+-------------+--------+---------------+--------------+---------+-----------+------+--------+----------+------------+--------------------------+
|    1 | PRIMARY      | t1          | index  | a             | a            | 5       | NULL      |    2 |   2.00 |   100.00 |       0.00 | Using where; Using index |
|    2 | SUBQUERY     | t2          | ALL    | NULL          | NULL         | NULL    | NULL      |    2 |   2.00 |   100.00 |     100.00 | Using where              |
|    2 | SUBQUERY     | <subquery3> | eq_ref | distinct_key  | distinct_key | 4       | test.t2.c |    1 |   0.00 |   100.00 |     100.00 |                          |
|    3 | MATERIALIZED | t3          | ALL    | NULL          | NULL         | NULL    | NULL      |    2 |   2.00 |   100.00 |     100.00 |                          |
+------+--------------+-------------+--------+---------------+--------------+---------+-----------+------+--------+----------+------------+--------------------------+

MariaDB [test]> set optimizer_use_condition_selectivity=1;
Query OK, 0 rows affected (0.000 sec)

MariaDB [test]> analyze SELECT * FROM t1 WHERE a = (SELECT MAX(b) FROM t2 WHERE c IN (SELECT MAX(d) FROM t3)) OR a = 10;
+------+--------------+-------------+-------+---------------+------+---------+------+------+--------+----------+------------+-------------------------------------------------+
| id   | select_type  | table       | type  | possible_keys | key  | key_len | ref  | rows | r_rows | filtered | r_filtered | Extra                                           |
+------+--------------+-------------+-------+---------------+------+---------+------+------+--------+----------+------------+-------------------------------------------------+
|    1 | PRIMARY      | t1          | index | a             | a    | 5       | NULL |    2 |   2.00 |   100.00 |       0.00 | Using where; Using index                        |
|    2 | SUBQUERY     | <subquery3> | ALL   | distinct_key  | NULL | NULL    | NULL |    1 |   1.00 |   100.00 |     100.00 |                                                 |
|    2 | SUBQUERY     | t2          | ALL   | NULL          | NULL | NULL    | NULL |    2 |   2.00 |   100.00 |       0.00 | Using where; Using join buffer (flat, BNL join) |
|    3 | MATERIALIZED | t3          | ALL   | NULL          | NULL | NULL    | NULL |    2 |   2.00 |   100.00 |     100.00 |                                                 |
+------+--------------+-------------+-------+---------------+------+---------+------+------+--------+----------+------------+-------------------------------------------------+

Comment by Varun Gupta (Inactive) [ 2019-01-23 ]

The difference for the two cases is:

Case 1: optimizer_use_condition_selectivity=4;

                      "plan_prefix": [],
                      "table": "<subquery3>",
                      "best_access_path": {
                        "considered_access_paths": [
                          {
                            "access_type": "scan",
                            "resulting_rows": 2,
                            "cost": 4.2105,
                            "chosen": true
                          }
                        ]
                      },

resulting rows=2 in the case when we check the cost for the plan <subquery3>,t2

Case 2: optimizer_use_condition_selectivity=1;

                    {
                      "plan_prefix": [],
                      "table": "<subquery3>",
                      "best_access_path": {
                        "considered_access_paths": [
                          {
                            "access_type": "scan",
                            "resulting_rows": 1,
                            "cost": 4.4105,
                            "chosen": true
                          }
                        ]
                      },

resulting rows= 1 in this case for the plan <subquery3>,t2

Comment by Sergei Petrunia [ 2019-01-24 ]

The above observation is interesting (and should probably be investigated as a separate MDEV- why do we get 2 as an estimate for #rows in a query with implicit grouping?)

But let's get back to the main question:

Did we lose some test coverage because the tests are now executed with different defaults

The testcase above is a testcase for MDEV-435. Looking at the investigation portion of that MDEV, I see that

  • the query plan in the subquery is irrelevant.
  • most likely, the testcase became irrelevant when EXPLAIN FORMAT=JSON/ANALYZE code changed the way EXPLAIN output is produced.
Comment by Sergei Petrunia [ 2019-01-25 ]

varun, thanks for the attached file.
Note that it can be further reduced: I spotted differences like

@@ -1672,7 +1672,7 @@ WHERE City.Country=Country.Code AND
 Country.Name LIKE 'L%' AND City.Population > 100000;
 id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
 1	SIMPLE	Country	range	PRIMARY,Name	Name	52	NULL	10	Using index condition; Rowid-ordered scan
-1	SIMPLE	City	ref	Population,Country	Country	3	world.Country.Code	18	Using where; Using join buffer (flat, BKAH join); Key-ordered Rowid-ordered scan
+1	SIMPLE	City	ref	Population,Country	Country	3	world.Country.Code	17	Using where; Using join buffer (flat, BKAH join); Key-ordered Rowid-ordered scan

where the only difference is #rows.

Comment by Varun Gupta (Inactive) [ 2019-02-08 ]

Patch
http://lists.askmonty.org/pipermail/commits/2019-February/013385.html

Generated at Thu Feb 08 08:43:28 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.