[MDEV-6111] optimizer trace Created: 2014-04-15  Updated: 2023-05-16  Resolved: 2019-02-18

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Fix Version/s: 10.4.3

Type: Task Priority: Critical
Reporter: Sergei Golubchik Assignee: Varun Gupta (Inactive)
Resolution: Fixed Votes: 10
Labels: gsoc18, optimizer

Attachments: Text File mdev6111-example-trace-from-mysql8.txt     PNG File screenshot-1.png    
Issue Links:
Blocks
blocks MDEV-7239 explain index choice Closed
PartOf
includes MDEV-17929 Optimizer trace support for different... Closed
includes MDEV-17930 Optimizer trace: what's on the execut... Closed
includes MDEV-18117 Crash with Explain extended when usin... Closed
includes MDEV-18172 Benchmarking 10.4 for optimizer trace Closed
includes MDEV-18306 Optimizer_trace has invalid JSON near... Closed
includes MDEV-18489 Limit the memory used by the optimize... Closed
Relates
relates to MDEV-12980 Index hints for loose scan Confirmed
relates to MDEV-18670 Document optimizer trace Closed
relates to MDEV-20440 Optimizer trace: print more details a... Closed
relates to MDEV-18527 Optimizer trace for DELETE query show... Closed
relates to MDEV-18528 Optimizer trace support for multi-tab... Closed
relates to MDEV-18741 Optimizer trace: multi-part key range... Closed
relates to MDEV-18822 investigate performance regressions i... Stalled
relates to MDEV-19160 JSON_DETAILED output unnecessarily ve... Closed
relates to MDEV-20444 More information regarding access of ... Closed

 Description   

In MySQL, Optimizer trace is a JSON object recording the execution path through the optimizer, decisions that were made and the reasons for them. See https://dev.mysql.com/doc/internals/en/optimizer-tracing.html

Users were asking for MariaDB to have a similar feature.



 Comments   
Comment by Kirill Smelkov [ 2016-09-21 ]

Hello up there MariaDB team,

I'm currently debugging performance anomalies in MariaDB under NEO,
when same queries are usually executed fast, but sometimes badly slow. I've
discovered that some (not all) of such cases happen because wrong query
execution plan is somehow selected by optimizer. Why optimizer makes such
decision is a question.

For simple cases in a couple of times we can hopefully paper the problem over
with things like FORCE INDEX, but since this issue is starting to appear
regularly in many different places it is better to understand the root cause
and fix problem there.

That's where optimizer trace would be very handy - to get info from the system
about why such-and-such query execution decision has been made. With this
message I'd like to provide to MariaDB team feedback that optimizer trace is
needed in real-life since on production instances we cannot use debug builds
and get info from DBUG.

I provide examples of some anomalies in appendix for the reference.

Thanks beforehand,
Kirill

P.S. My familiarity with MariaDB is only several days, so please forgive me in
advance some possibly silly mistakes.

/cc ratzpo (@ratzpo)


Appendix

MariaDB 10.1.17. TokuDB is used as storage engine.

Example of bad query caught in slow log:

# Time: 160921 10:47:15
# User@Host: root[root] @ localhost []
# Thread_id: 6  Schema: neo1  QC_hit: No
# Query_time: 4.118192  Lock_time: 0.000024  Rows_sent: 0  Rows_examined: 8476058
# Rows_affected: 0
#
# explain: id   select_type     table   type    possible_keys       key             key_len ref             rows    r_rows          filtered        r_filtered      Extra
# explain: 1    SIMPLE          obj     ref     PRIMARY,partition   partition       10      const,const     1       8476058.00      100.00          0.00            Using where; Using index
#
SET timestamp=1474447635;
SELECT tid FROM obj WHERE `partition`=5 AND oid=79613 AND tid>268544235197088772 ORDER BY tid LIMIT 1;

Similar query caught in process list with analyze:

MariaDB [neo1]> show processlist;
+----+------+-----------+------+---------+------+----------------------------+------------------------------------------------------------------------------------------------------+----------+
| Id | User | Host      | db   | Command | Time | State                      | Info                                                                                                 | Progress |
+----+------+-----------+------+---------+------+----------------------------+------------------------------------------------------------------------------------------------------+----------+
|  3 | root | localhost | neo0 | Sleep   |    8 |                            | NULL                                                                                                 |    0.000 |
|  4 | root | localhost | neo2 | Sleep   |   15 |                            | NULL                                                                                                 |    0.000 |
|  5 | root | localhost | neo3 | Sleep   |   11 |                            | NULL                                                                                                 |    0.000 |
|  6 | root | localhost | neo1 | Query   |    3 | Queried about 7710000 rows | SELECT tid FROM obj WHERE `partition`=5 AND oid=79613 AND tid>268544341634012678 ORDER BY tid LIMIT  |    0.000 |
| 10 | root | localhost | neo1 | Query   |    0 | init                       | show processlist                                                                                     |    0.000 |
+----+------+-----------+------+---------+------+----------------------------+------------------------------------------------------------------------------------------------------+----------+
5 rows in set (0.00 sec)
 
MariaDB [neo1]> show explain for 6;
+------+-------------+-------+------+-------------------+-----------+---------+-------------+------+------------------------------------------+
| id   | select_type | table | type | possible_keys     | key       | key_len | ref         | rows | Extra                                    |
+------+-------------+-------+------+-------------------+-----------+---------+-------------+------+------------------------------------------+
|    1 | SIMPLE      | obj   | ref  | PRIMARY,partition | partition | 10      | const,const |    1 | Using where; Using index; Using filesort |
+------+-------------+-------+------+-------------------+-----------+---------+-------------+------+------------------------------------------+
1 row in set, 1 warning (0.01 sec)

The same query in normal case:

MariaDB [neo1]> explain SELECT tid FROM obj WHERE `partition`=5 AND oid=79613 AND tid>268544341634012678 ORDER BY tid LIMIT 1;
+------+-------------+-------+-------+-------------------+-----------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys     | key       | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+-------------------+-----------+---------+------+------+--------------------------+
|    1 | SIMPLE      | obj   | range | PRIMARY,partition | partition | 18      | NULL | 1295 | Using where; Using index |
+------+-------------+-------+-------+-------------------+-----------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
 
MariaDB [neo1]> analyze SELECT tid FROM obj WHERE `partition`=5 AND oid=79613 AND tid>268544341634012678 ORDER BY tid LIMIT 1;
+------+-------------+-------+-------+-------------------+-----------+---------+------+------+--------+----------+------------+--------------------------+
| id   | select_type | table | type  | possible_keys     | key       | key_len | ref  | rows | r_rows | filtered | r_filtered | Extra                    |
+------+-------------+-------+-------+-------------------+-----------+---------+------+------+--------+----------+------------+--------------------------+
|    1 | SIMPLE      | obj   | range | PRIMARY,partition | partition | 18      | NULL |  639 |   1.00 |   100.00 |     100.00 | Using where; Using index |
+------+-------------+-------+-------+-------------------+-----------+---------+------+------+--------+----------+------------+--------------------------+
1 row in set (0.00 sec)
 
MariaDB [neo1]> SELECT tid FROM obj WHERE `partition`=5 AND oid=79613 AND tid>268544341634012678 ORDER BY tid LIMIT 1;
+--------------------+
| tid                |
+--------------------+
| 268544342862377552 |
+--------------------+
1 row in set (0.00 sec)

NOTE the difference:

  • time: 4s vs < 0.01s
  • type: 'ref' vs 'range'
  • key_len: 10 vs 18 (partition, oid) vs (partition, oid, tid)
  • ref: const,const vs NULL
  • "Using filesort"

For the reference:

MariaDB [neo1]> describe obj;
+-----------+----------------------+------+-----+---------+-------+
| Field     | Type                 | Null | Key | Default | Extra |
+-----------+----------------------+------+-----+---------+-------+
| partition | smallint(5) unsigned | NO   | PRI | NULL    |       |
| oid       | bigint(20) unsigned  | NO   | PRI | NULL    |       |
| tid       | bigint(20) unsigned  | NO   | PRI | NULL    |       |
| data_id   | bigint(20) unsigned  | YES  | MUL | NULL    |       |
| value_tid | bigint(20) unsigned  | YES  |     | NULL    |       |
+-----------+----------------------+------+-----+---------+-------+
5 rows in set (0.00 sec)
 
MariaDB [neo1]> show index from obj;
+-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| obj   |          0 | PRIMARY   |            1 | partition   | A         |    28253069 |     NULL | NULL   |      | BTREE      |         |               |
| obj   |          0 | PRIMARY   |            2 | tid         | A         |    28253069 |     NULL | NULL   |      | BTREE      |         |               |
| obj   |          0 | PRIMARY   |            3 | oid         | A         |    28253069 |     NULL | NULL   |      | BTREE      |         |               |
| obj   |          1 | partition |            1 | partition   | A         |    28253069 |     NULL | NULL   |      | BTREE      |         |               |
| obj   |          1 | partition |            2 | oid         | A         |    28253069 |     NULL | NULL   |      | BTREE      |         |               |
| obj   |          1 | partition |            3 | tid         | A         |    28253069 |     NULL | NULL   |      | BTREE      |         |               |
| obj   |          1 | data_id   |            1 | data_id     | A         |    28253069 |     NULL | NULL   | YES  | BTREE      |         |               |
+-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
7 rows in set (0.01 sec)
 
MariaDB [neo1]> show table status like 'obj'\G
*************************** 1. row ***************************
           Name: obj
         Engine: TokuDB
        Version: 10
     Row_format: Fixed
           Rows: 28253276
 Avg_row_length: 33
    Data_length: 957477808
Max_data_length: 9223372036854775807
   Index_length: 1779148337
      Data_free: 913293312
 Auto_increment: NULL
    Create_time: 2016-08-16 10:33:02
    Update_time: 2016-09-21 22:05:35
     Check_time: NULL
      Collation: utf8_unicode_ci
       Checksum: NULL
 Create_options: 
        Comment: 
1 row in set (0.00 sec)

Another example:

https://lab.nexedi.com/nexedi/neoppod/commit/13911ca3

Comment by Sergei Petrunia [ 2018-05-04 ]

Some ideas about features we should have: https://lists.launchpad.net/maria-developers/msg11266.html

Comment by Sergei Petrunia [ 2018-05-04 ]

Ideas about features/milestones: https://lists.launchpad.net/maria-developers/msg11267.html

Comment by Varun Gupta (Inactive) [ 2018-09-26 ]

Takes from MYSQL's implementation

INFOMATMATION_SCHEMA.OPTIMIZER_TRACE

The table has the following fields

  • QUERY
  • Trace
  • MISSING_BYTES_BEYOND_MAX_MEM_SIZE
  • INSUFFICIENT_PRIVILEGES

Queries to trace

They are: SELECT; INSERT or REPLACE (with VALUES or SELECT); UPDATE/DELETE and their multi-table variants; all the previous ones prefixed by EXPLAIN; SET (unless it manipulates the optimizer_trace system variable); DO; DECLARE/CASE/IF/RETURN (stored routines language elements); CALL. If one of those statements is prepared and executed in separate steps, preparation and execution are separately traced.

Trace purging

By default, each new trace overwrites the previous trace. Thus, if a statement contains substatements (example: invokes stored procedures, stored functions, triggers), the top statement and substatements each generate one trace, but at the execution's end only the last substatement's trace is visible. A user who wants to see the trace of another substatement, can enable/disable tracing around the desired substatement, but this requires editing the routine's code, which may not be possible. Another solution is to tune trace purging.

SET optimizer_trace_offset=<OFFSET>, optimizer_trace_limit=<LIMIT>

The optimizer_trace System Variable

  • enabled: allows to enable/disable tracing
    During disabling, we need to make sure that we remove DBUG_PRINT and use the optimizer trace to write to the trace file. Avoid duplication of code.
  • one_line: if on, the trace will have no whitespace; it's unreadable for humans but readable for JSON parsers (they ignore whitespace); the only advantage is a saving on space.

Selecting Optimizer Features to Trace

Those features can be excluded from tracing using the optimizer_trace_features system variable, which has these on/off switches

Security

  • INFORMATION_SCHEMA.OPTIMIZER_TRACE.INSUFFICIENT_PRIVILEGES
  • Who can view optimizer_trace?

INFORMATION_SCHEMA.OPTIMIZER_TRACE.INSUFFICIENT_PRIVILEGE
If a traced query uses views or stored routines that have SQL SECURITY with a value of DEFINER, it may be that a user other than the definer is denied from seeing the trace of the query. In that case, the trace is shown as empty and INSUFFICIENT_PRIVILEGES has a value of 1. Otherwise, the value is 0.

Comment by Varun Gupta (Inactive) [ 2018-10-30 ]

The branch is 10.4-mdev6111

Comment by Sergei Petrunia [ 2018-11-02 ]

https://jira.mariadb.org/browse/MDEV-13744 is an example where opt trace would be useful- to check whether loose scan was considered (and see that it was, but was discarded because it wasn't worth it)

Comment by Sergei Petrunia [ 2018-11-14 ]

Takeaways from yesterday discussion:

  • It would be nice to select parts of the trace somehow.
  • MySQL allows to control whether certain parts of the optimizer emit the trace: it has @@optimizer_trace_features where one can tune these flags: greedy_search,range_optimizer,dynamic_range,repeated_subselect.
  • Can one use JSON functions to select parts of the trace?
Comment by Sergei Petrunia [ 2018-11-14 ]

Example from the call: range optimization. A trace from mysql-8: [^mdev6111-example-trace-from-mysql8.json] .
There is a JSONPath language which allows to select all range_analysis nodes like so: $..range_analysis (you can check this online at http://jsonpath.com/). It is not clear which of the objects refers to which table, though.

Comment by Sergei Petrunia [ 2018-11-14 ]

JSON example I was using: [^mdev6111-example-trace-from-mysql8.json.txt]

Comment by Sergei Petrunia [ 2018-11-14 ]

Another attempt mdev6111-example-trace-from-mysql8.txt

Comment by Sergei Petrunia [ 2018-11-14 ]

... However, MariaDB's JSON function do not allow to compute JSON paths? I tried JSON_EXTRACT and others, none could achieve what I wanted - extract all nodes in the document with the name range_analysis.

Comment by Sergei Petrunia [ 2018-11-14 ]

As for pretty-printing, there is JSON_DETAILED function but its output is not as beautiful as the one that EXPLAIN FORMAT=JSON pretty-printer produces.

Example:

 
 
                                            "key_parts": 
                                            [
                                                "a"
                                            ]
                                        }
                                    ],
                                    "setup_range_conditions": 
                                    [
                                    ],

Comment by Sergei Petrunia [ 2018-11-14 ]

WIth holyfoot's and Serg's input:

mysql> select  JSON_DETAILED(JSON_EXTRACT(a, '$**.range_analysis')) from t20\G
*************************** 1. row ***************************
JSON_DETAILED(JSON_EXTRACT(a, '$**.range_analysis')): [
    
    {
        "table_scan": 
        {
            "rows": 1000,
            "cost": 103.1
        },
        "potential_range_indexes": 
        [
            
            {
                "index": "a",
                "usable": true,
                "key_parts": 
                [
                    "a"
                ]
            }
        ],
        "setup_range_conditions": 
        [
        ],

so, it is possible to extract parts of trace using JSON function. You'll need to re-format them, and formatter is not perfect, but still.

Comment by Sergei Petrunia [ 2019-08-13 ]

Pushed this:

https://github.com/MariaDB/server/commit/2dbe472ed011a951b28434ae8e67945e964d2030

Optimizer trace: print cost and #rows of the join prefix
 
The names rows_for_plan and cost_for_plan follow MySQL
Also added post-join-operation selectivity cost

Comment by Sergei Petrunia [ 2019-08-25 ]

Pushed this:

http://lists.askmonty.org/pipermail/commits/2019-August/013945.html

http://lists.askmonty.org/pipermail/commits/2019-August/013945.html
commit 4a490d1a993959222deb8bae822f40575586a148 (HEAD -> 10.4, origin/HEAD, origin/10.4, tmp10)
Author: Sergei Petrunia <psergey@askmonty.org>
Date:   Sun Aug 25 11:03:19 2019 +0300
 
    MDEV-6111: Optimizer Trace: add tracing for semi-join optimizations
    
    Added:
    - "semijoin_strategy_choice" element (actions in advance_sj_state(), name
      matches the name in MySQL)
    
    - semijoin_table_pullout element.

UPDATE: filed a MDEV to track the above patch: MDEV-20440.

Generated at Thu Feb 08 07:09:26 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.