Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-17929

Optimizer trace support for different parts of the server

Details

    Description

      This task is for checking how optimizer trace uses different parts of the server.

      Done:

      • Condition transformation (DONE)
      • KEYUSE objects (DONE)
      • Range access (DONE)

      Not yet handled:

      • EITS: condition selectivities computed for each table (TODO, see below)
      • "Late" ORDER BY optimization (less-defined TODO, see some notes below)
      • Join optimization tracing (TODO: how to do it?)
      • Derived table merging
      • Conversion of subqueries into semi-joins
      • Table elimination

      Attachments

        Issue Links

          Activity

            Condition transformation (DONE)

                        "condition_processing": {
                          "condition": "WHERE",
                          "original_condition": "B.a < A.a",
                          "steps": [
                            {
                              "transformation": "equality_propagation",
                              "resulting_condition": "B.a < A.a"
                            },
                            {
                              "transformation": "constant_propagation",
                              "resulting_condition": "B.a < A.a"
                            },
                            {
                              "transformation": "trivial_condition_removal",
                              "resulting_condition": "B.a < A.a"
                            }
                          ]
                        }
            

            KEYUSE objects (DONE)

                        "ref_optimizer_key_uses": [
                          {
                            "table": "t20",
                            "field": "b",
                            "equals": "t21.b",
                            "null_rejecting": true
                          },
                          {
                            "table": "t21",
                            "field": "a",
                            "equals": "t20.c + 1",
                            "null_rejecting": false
                          },
                          {
                            "table": "t21",
                            "field": "b",
                            "equals": "t20.b",
                            "null_rejecting": true
                          }
                        ]
            

            Range access (DONE)

            The biggest thing is that now one can see the ranges:

                                  "range_scan_alternatives": [
                                    {
                                      "index": "c",
                                      "ranges": [
                                        "2 <= c <= 2 AND NULL < d < 10",
                                        "3 <= c <= 3 AND NULL < d < 10",
                                        "4 <= c <= 4 AND NULL < d < 10",
                                        "5 <= c <= 5 AND NULL < d < 10"
                                      ],
                                      "rowid_ordered": false,
                                      "using_mrr": false,
                                      "index_only": false,
                                      "rows": 4,
                                      "cost": 8.81,
                                      "chosen": true
                                    }
                                  ],
                                  "analyzing_roworder_intersect": {
                                    "cause": "too_few_roworder_scans"
                                  },
                                  "analyzing_index_merge_union": []
                                },
                                "chosen_range_access_summary": {
                                  "range_access_plan": {
                                    "type": "range_scan",
                                    "index": "c",
                                    "rows": 4,
                                    "ranges": [
                                      "2 <= c <= 2 AND NULL < d < 10",
                                      "3 <= c <= 3 AND NULL < d < 10",
                                      "4 <= c <= 4 AND NULL < d < 10",
                                      "5 <= c <= 5 AND NULL < d < 10"
                                    ]
                                  },
                                  "rows_for_plan": 4,
                                  "cost_for_plan": 8.81,
                                  "chosen": true
                                }
            

            I don't like that the ranges for the chosen query plan are printed twice.

            psergei Sergei Petrunia added a comment - Condition transformation (DONE) "condition_processing": { "condition": "WHERE", "original_condition": "B.a < A.a", "steps": [ { "transformation": "equality_propagation", "resulting_condition": "B.a < A.a" }, { "transformation": "constant_propagation", "resulting_condition": "B.a < A.a" }, { "transformation": "trivial_condition_removal", "resulting_condition": "B.a < A.a" } ] } KEYUSE objects (DONE) "ref_optimizer_key_uses": [ { "table": "t20", "field": "b", "equals": "t21.b", "null_rejecting": true }, { "table": "t21", "field": "a", "equals": "t20.c + 1", "null_rejecting": false }, { "table": "t21", "field": "b", "equals": "t20.b", "null_rejecting": true } ] Range access (DONE) The biggest thing is that now one can see the ranges: "range_scan_alternatives": [ { "index": "c", "ranges": [ "2 <= c <= 2 AND NULL < d < 10", "3 <= c <= 3 AND NULL < d < 10", "4 <= c <= 4 AND NULL < d < 10", "5 <= c <= 5 AND NULL < d < 10" ], "rowid_ordered": false, "using_mrr": false, "index_only": false, "rows": 4, "cost": 8.81, "chosen": true } ], "analyzing_roworder_intersect": { "cause": "too_few_roworder_scans" }, "analyzing_index_merge_union": [] }, "chosen_range_access_summary": { "range_access_plan": { "type": "range_scan", "index": "c", "rows": 4, "ranges": [ "2 <= c <= 2 AND NULL < d < 10", "3 <= c <= 3 AND NULL < d < 10", "4 <= c <= 4 AND NULL < d < 10", "5 <= c <= 5 AND NULL < d < 10" ] }, "rows_for_plan": 4, "cost_for_plan": 8.81, "chosen": true } I don't like that the ranges for the chosen query plan are printed twice.

            EITS: condition selectivities computed for each table

            MariaDB doesn't print anything about this.

            MySQL has this inside best_access_path:

                              "considered_access_paths": [
                                {
                                  "rows_to_scan": 1000,
                                  "filtering_effect": [
                                    {
                                      "condition": "(`t20`.`c` < 10)",
                                      "histogram_selectivity": 0.01
                                    },
                                    {
                                      "condition": "(`t20`.`d` < 10)",
                                      "histogram_selectivity": 0.01
                                    }
                                  ],
            

            but I think this is a wrong place for it.

            Condition selectivity is computed once (there is no need to print it for every
            best_access_path invocation), just before the range optimizer. It
            should be traced there as well.

            psergei Sergei Petrunia added a comment - EITS: condition selectivities computed for each table MariaDB doesn't print anything about this. MySQL has this inside best_access_path: "considered_access_paths": [ { "rows_to_scan": 1000, "filtering_effect": [ { "condition": "(`t20`.`c` < 10)", "histogram_selectivity": 0.01 }, { "condition": "(`t20`.`d` < 10)", "histogram_selectivity": 0.01 } ], but I think this is a wrong place for it. Condition selectivity is computed once (there is no need to print it for every best_access_path invocation), just before the range optimizer. It should be traced there as well.

            "Late" ORDER BY optimization

            This is about test_if_cheaper_ordering function and co.

            Consider an example:

            CREATE TABLE `t22` (
              `a` int(11) DEFAULT NULL,
              `b` int(11) DEFAULT NULL,
              `c` int(11) DEFAULT NULL,
              `d` int(11) DEFAULT NULL,
              KEY `a` (`a`),
              KEY `b` (`b`,`c`),
              KEY `b_2` (`b`,`d`),
              KEY `b_3` (`b`,`d`,`a`)
            ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
            

            Let's have range access over (a<100) and filesort compete with index access on
            indexes starting with B. index 'b' is the shortest, but b_3 is covering.

            mysql> explain select a,b,d from t22 where a<100 order by b limit 10;
            +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
            | id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
            +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
            |  1 | SIMPLE      | t22   | NULL       | index | a             | b    | 10      | NULL |  100 |    10.00 | Using where |
            +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
            

            In MySQL's optimizer trace, I can only see this:

                        "reconsidering_access_paths_for_index_ordering": {
                          "clause": "ORDER BY",
                          "index_order_summary": {
                            "table": "`t22`",
                            "index_provides_order": true,
                            "order_direction": "asc",
                            "index": "b",
                            "plan_changed": true,
                            "access_type": "index"
                          }
                        }
            

            which shows index 'b' was used, but I am not sure if other indexes were
            considered or not.

            If I change the LIMIT 10 to be LIMIT 100, I get this:

                        "reconsidering_access_paths_for_index_ordering": {
                          "clause": "ORDER BY",
                          "index_order_summary": {
                            "table": "`t22`",
                            "index_provides_order": false,
                            "order_direction": "undefined",
                            "index": "a",
                            "plan_changed": false
                          }
                        }
                      },
            

            which doesn't tell me if any indexes starting with b were considered or not.

            psergei Sergei Petrunia added a comment - "Late" ORDER BY optimization This is about test_if_cheaper_ordering function and co. Consider an example: CREATE TABLE `t22` ( `a` int (11) DEFAULT NULL , `b` int (11) DEFAULT NULL , `c` int (11) DEFAULT NULL , `d` int (11) DEFAULT NULL , KEY `a` (`a`), KEY `b` (`b`,`c`), KEY `b_2` (`b`,`d`), KEY `b_3` (`b`,`d`,`a`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE =utf8mb4_0900_ai_ci; Let's have range access over (a<100) and filesort compete with index access on indexes starting with B. index 'b' is the shortest, but b_3 is covering. mysql> explain select a,b,d from t22 where a<100 order by b limit 10; +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+ | 1 | SIMPLE | t22 | NULL | index | a | b | 10 | NULL | 100 | 10.00 | Using where | +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+ In MySQL's optimizer trace, I can only see this: "reconsidering_access_paths_for_index_ordering": { "clause": "ORDER BY", "index_order_summary": { "table": "`t22`", "index_provides_order": true, "order_direction": "asc", "index": "b", "plan_changed": true, "access_type": "index" } } which shows index 'b' was used, but I am not sure if other indexes were considered or not. If I change the LIMIT 10 to be LIMIT 100, I get this: "reconsidering_access_paths_for_index_ordering": { "clause": "ORDER BY", "index_order_summary": { "table": "`t22`", "index_provides_order": false, "order_direction": "undefined", "index": "a", "plan_changed": false } } }, which doesn't tell me if any indexes starting with b were considered or not.
            psergei Sergei Petrunia added a comment - - edited

            Derived table merging

            In MySQL's optimizer trace, merging of derived tables is visible like so:

            explain select * from (select a,b,c,d from t22 where a<30) T where d<3;
            

                    "select#": 1,
                    "steps": [
                      {
                        "join_preparation": {
                          ...
                      },
                      {
                        "derived": {
                          "table": "``.`` `T`",
                          "select#": 2,
                          "merged": true
                        }
                      },
            

            Looks a bit odd but provides the information that the table was merged.

            A non-mergeable case:

            explain select * from (select a,b,c,d from t22 where a<30 order by c limit 10) T where d<3;
            

                      {
                        "derived": {
                          "table": " `T`",
                          "select#": 2,
                          "materialized": true
                        }
                      },
            

            psergei Sergei Petrunia added a comment - - edited Derived table merging In MySQL's optimizer trace, merging of derived tables is visible like so: explain select * from ( select a,b,c,d from t22 where a<30) T where d<3; "select#": 1, "steps": [ { "join_preparation": { ... }, { "derived": { "table": "``.`` `T`", "select#": 2, "merged": true } }, Looks a bit odd but provides the information that the table was merged. A non-mergeable case: explain select * from ( select a,b,c,d from t22 where a<30 order by c limit 10) T where d<3; { "derived": { "table": " `T`", "select#": 2, "materialized": true } },

            Conversion of subqueries into semi-joins

            MariaDB doesn't yet print anything.
            MySQL prints something like this:

                  "join_preparation": {
                    "select#": 1,
                    "steps": [
                      {
                        "join_preparation": {
                          "select#": 2,
                          "steps": [
                            {
                              "expanded_query": "/* select#2 */ select `t21`.`b` from `t21`"
                            },
                            {
                              "transformation": {
                                "select#": 2,
                                "from": "IN (SELECT)",
                                "to": "semijoin",
                                "chosen": true
                              }
                            }
                          ]
                        }
            

            psergei Sergei Petrunia added a comment - Conversion of subqueries into semi-joins MariaDB doesn't yet print anything. MySQL prints something like this: "join_preparation": { "select#": 1, "steps": [ { "join_preparation": { "select#": 2, "steps": [ { "expanded_query": "/* select#2 */ select `t21`.`b` from `t21`" }, { "transformation": { "select#": 2, "from": "IN (SELECT)", "to": "semijoin", "chosen": true } } ] }

            Table elimination

            MySQL doesn't have Table Elimination. In MariaDB, we should print the names of the tables that were eliminated.

            psergei Sergei Petrunia added a comment - Table elimination MySQL doesn't have Table Elimination. In MariaDB, we should print the names of the tables that were eliminated.

            A note on Optimizer trace structure:

            steps: { 
               join_preparation
               join_optimization
               join_explain | join_execution
            }
            

            Other possible nodes:

            • creating_tmp_table
            • ??

            join_optimization : steps
              condition_processing
              substitute_generated_columns
              table_dependencies
              ref_optimizer_key_uses
              rows_estimation
              considered_execution_plans
              attaching_conditions_to_tables
              refine_plan
            

            psergei Sergei Petrunia added a comment - A note on Optimizer trace structure: steps: { join_preparation join_optimization join_explain | join_execution } Other possible nodes: creating_tmp_table ?? join_optimization : steps condition_processing substitute_generated_columns table_dependencies ref_optimizer_key_uses rows_estimation considered_execution_plans attaching_conditions_to_tables refine_plan

            Completed as part of MDEV-6111

            varun Varun Gupta (Inactive) added a comment - Completed as part of MDEV-6111

            People

              varun Varun Gupta (Inactive)
              psergei Sergei Petrunia
              Votes:
              1 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.