[MDEV-17929] Optimizer trace support for different parts of the server Created: 2018-12-07  Updated: 2019-02-13  Resolved: 2019-02-13

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Fix Version/s: 10.4.3

Type: Task Priority: Major
Reporter: Sergei Petrunia Assignee: Varun Gupta (Inactive)
Resolution: Fixed Votes: 1
Labels: None

Issue Links:
PartOf
is part of MDEV-6111 optimizer trace Closed

 Description   

This task is for checking how optimizer trace uses different parts of the server.

Done:

  • Condition transformation (DONE)
  • KEYUSE objects (DONE)
  • Range access (DONE)

Not yet handled:

  • EITS: condition selectivities computed for each table (TODO, see below)
  • "Late" ORDER BY optimization (less-defined TODO, see some notes below)
  • Join optimization tracing (TODO: how to do it?)
  • Derived table merging
  • Conversion of subqueries into semi-joins
  • Table elimination


 Comments   
Comment by Sergei Petrunia [ 2018-12-07 ]

Condition transformation (DONE)

            "condition_processing": {
              "condition": "WHERE",
              "original_condition": "B.a < A.a",
              "steps": [
                {
                  "transformation": "equality_propagation",
                  "resulting_condition": "B.a < A.a"
                },
                {
                  "transformation": "constant_propagation",
                  "resulting_condition": "B.a < A.a"
                },
                {
                  "transformation": "trivial_condition_removal",
                  "resulting_condition": "B.a < A.a"
                }
              ]
            }

KEYUSE objects (DONE)

            "ref_optimizer_key_uses": [
              {
                "table": "t20",
                "field": "b",
                "equals": "t21.b",
                "null_rejecting": true
              },
              {
                "table": "t21",
                "field": "a",
                "equals": "t20.c + 1",
                "null_rejecting": false
              },
              {
                "table": "t21",
                "field": "b",
                "equals": "t20.b",
                "null_rejecting": true
              }
            ]

Range access (DONE)

The biggest thing is that now one can see the ranges:

                      "range_scan_alternatives": [
                        {
                          "index": "c",
                          "ranges": [
                            "2 <= c <= 2 AND NULL < d < 10",
                            "3 <= c <= 3 AND NULL < d < 10",
                            "4 <= c <= 4 AND NULL < d < 10",
                            "5 <= c <= 5 AND NULL < d < 10"
                          ],
                          "rowid_ordered": false,
                          "using_mrr": false,
                          "index_only": false,
                          "rows": 4,
                          "cost": 8.81,
                          "chosen": true
                        }
                      ],
                      "analyzing_roworder_intersect": {
                        "cause": "too_few_roworder_scans"
                      },
                      "analyzing_index_merge_union": []
                    },
                    "chosen_range_access_summary": {
                      "range_access_plan": {
                        "type": "range_scan",
                        "index": "c",
                        "rows": 4,
                        "ranges": [
                          "2 <= c <= 2 AND NULL < d < 10",
                          "3 <= c <= 3 AND NULL < d < 10",
                          "4 <= c <= 4 AND NULL < d < 10",
                          "5 <= c <= 5 AND NULL < d < 10"
                        ]
                      },
                      "rows_for_plan": 4,
                      "cost_for_plan": 8.81,
                      "chosen": true
                    }

I don't like that the ranges for the chosen query plan are printed twice.

Comment by Sergei Petrunia [ 2018-12-07 ]

EITS: condition selectivities computed for each table

MariaDB doesn't print anything about this.

MySQL has this inside best_access_path:

                  "considered_access_paths": [
                    {
                      "rows_to_scan": 1000,
                      "filtering_effect": [
                        {
                          "condition": "(`t20`.`c` < 10)",
                          "histogram_selectivity": 0.01
                        },
                        {
                          "condition": "(`t20`.`d` < 10)",
                          "histogram_selectivity": 0.01
                        }
                      ],

but I think this is a wrong place for it.

Condition selectivity is computed once (there is no need to print it for every
best_access_path invocation), just before the range optimizer. It
should be traced there as well.

Comment by Sergei Petrunia [ 2018-12-07 ]

"Late" ORDER BY optimization

This is about test_if_cheaper_ordering function and co.

Consider an example:

CREATE TABLE `t22` (
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  `c` int(11) DEFAULT NULL,
  `d` int(11) DEFAULT NULL,
  KEY `a` (`a`),
  KEY `b` (`b`,`c`),
  KEY `b_2` (`b`,`d`),
  KEY `b_3` (`b`,`d`,`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

Let's have range access over (a<100) and filesort compete with index access on
indexes starting with B. index 'b' is the shortest, but b_3 is covering.

mysql> explain select a,b,d from t22 where a<100 order by b limit 10;
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t22   | NULL       | index | a             | b    | 10      | NULL |  100 |    10.00 | Using where |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+

In MySQL's optimizer trace, I can only see this:

            "reconsidering_access_paths_for_index_ordering": {
              "clause": "ORDER BY",
              "index_order_summary": {
                "table": "`t22`",
                "index_provides_order": true,
                "order_direction": "asc",
                "index": "b",
                "plan_changed": true,
                "access_type": "index"
              }
            }

which shows index 'b' was used, but I am not sure if other indexes were
considered or not.

If I change the LIMIT 10 to be LIMIT 100, I get this:

            "reconsidering_access_paths_for_index_ordering": {
              "clause": "ORDER BY",
              "index_order_summary": {
                "table": "`t22`",
                "index_provides_order": false,
                "order_direction": "undefined",
                "index": "a",
                "plan_changed": false
              }
            }
          },

which doesn't tell me if any indexes starting with b were considered or not.

Comment by Sergei Petrunia [ 2018-12-07 ]

Derived table merging

In MySQL's optimizer trace, merging of derived tables is visible like so:

explain select * from (select a,b,c,d from t22 where a<30) T where d<3;

        "select#": 1,
        "steps": [
          {
            "join_preparation": {
              ...
          },
          {
            "derived": {
              "table": "``.`` `T`",
              "select#": 2,
              "merged": true
            }
          },

Looks a bit odd but provides the information that the table was merged.

A non-mergeable case:

explain select * from (select a,b,c,d from t22 where a<30 order by c limit 10) T where d<3;

          {
            "derived": {
              "table": " `T`",
              "select#": 2,
              "materialized": true
            }
          },

Comment by Sergei Petrunia [ 2018-12-07 ]

Conversion of subqueries into semi-joins

MariaDB doesn't yet print anything.
MySQL prints something like this:

      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "join_preparation": {
              "select#": 2,
              "steps": [
                {
                  "expanded_query": "/* select#2 */ select `t21`.`b` from `t21`"
                },
                {
                  "transformation": {
                    "select#": 2,
                    "from": "IN (SELECT)",
                    "to": "semijoin",
                    "chosen": true
                  }
                }
              ]
            }

Comment by Sergei Petrunia [ 2018-12-07 ]

Table elimination

MySQL doesn't have Table Elimination. In MariaDB, we should print the names of the tables that were eliminated.

Comment by Sergei Petrunia [ 2019-01-20 ]

A note on Optimizer trace structure:

steps: { 
   join_preparation
   join_optimization
   join_explain | join_execution
}

Other possible nodes:

  • creating_tmp_table
  • ??

join_optimization : steps
  condition_processing
  substitute_generated_columns
  table_dependencies
  ref_optimizer_key_uses
  rows_estimation
  considered_execution_plans
  attaching_conditions_to_tables
  refine_plan

Comment by Varun Gupta (Inactive) [ 2019-02-13 ]

Completed as part of MDEV-6111

Generated at Thu Feb 08 08:40:12 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.