[MDEV-3798] Add EXPLAIN for UPDATE/DELETE (mwl #51) - Jira

Details

Type: Task
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Fix Version/s: 10.0.5
Component/s: None
Labels:
None

Description

High-Level Specification

User interface

EXPLAIN UPDATE and EXPLAIN DELETE statements will work and produce a tabular output similar to EXPLAIN SELECT.

Implementation

The primary challenge will be to change UPDATE/DELETE code to first produce action plan and then act on it (and not make decisions on the go as it currently does).

What EXPLAIN will show

multi-table one will show the SELECT part as usual
single-table statements will show an equivalent of access method.

Besides that, we want to know

if sorting will be used
whether the UPDATE will occur on the fly or not.
print or not "Using filesort"

Low-Level Design

Possible ways of the implemetations

1. Backport of the the implementation from the mysql-5.6 code line.

Estimate: 2.5 weeks (100 hrs)

I got this estimate from comparison of the patch for ORDER BY with LIMIT that had been back-ported to 10.0-base with the mysql patch that added EXPLAIN for UPDATE/DELETE/INSERT. I took into account the fact that the latter was applied to the code that had been greatly changed from the mysql 5.5 code line. My personal attitude to the patch is extremely negative: the patch is overburdened with unneeded restructuring and unjustified new classes. The patch left the code for EXPLAIN interleaved with the code for SELECT and added interleaving of EXPLAIN for UPDATE/DELETE/INSERT for other code for this statements.

2. A new implementation that will:

Introduce a special class for abstract EXPLAIN objects that would allow an easy generation of of the EXPLAIN output in any format
Add methods that builds the EXPLAIN objects for each type of interesting statements by the chosen execution plans (plans can be changed at some fixed points during the execution)
Clean the current code for SELECT removing removing all code related to EXPLAIN.

Basically the above are the basic ideas under the low level design that I could suggest.

In the future this design will allow us to include EXPLAIN into the information schema and, as a direct consequence, will allow us to select whatever we are interested in from the EXPLAIN output.

My estimate for the time needed for this implementation is about 200 hrs (maybe even more). This is a lot, but it's only a fraction of the time that was in reality spent for SHOW EXPLAIN that DID NOT follow this design.

The steps to implement this are:

Design the class ( 1 day)
Implement the class (2 days)
Use it in DELETE to implement explain (1 day dev, 4 hour QA)
Use it in UPDATE
Change SELECT to also use the class.

In more details

1.

In my opinion a correct ideal schema of any implementation for the EXPLAIN command should be like this.

The optimizer phase for the statement is completed (it means that the execution plan is chosen)
An internal representation of the EXPLAIN info is built by the chosen plan for the executed statement
The EXPLAIN output is built in the requested format

The current implementation of the EXPLAIN command MariaDB 5.3/5.5 and in MySQL 5.5 skips 2. and builds the EXPLAIN output directly by the execution plan for the statement.

The new implementation the EXPLAIN command for MySQL 5.6 also skips 2.

Here are my consideration why step 2 is needed.

An execution plan is a volatile object (even after the optimization phase) : we can change it during the execution. So we need to save the EXPLAIN output between the commit points for this changes to support SHOW EXPLAIN. AFAIU the difficulty of the resolution of this problem for the current EXPLAIN output was the main source of troubles in the SHOW EXPLAIN project (see the log of the project in bzr).
It's much easier to form the same compact representation for different statements than to form the same EXPLAIN output
With the internal non-string representation you have more flexibility to build EXPLAIN output in a different format.

Here are some examples of the differences between the internal representations of the EXPLAIN output and the current representation of it.
Column 'possible keys':

now it's a list of key names
internal representation could contain just a key map for this key
Column 'ref:
now contains a list of external representation for the ref items
internal representation could contain just a list of references to these items

2.

I looked at the current implementation of the EXPLAIN command in MySQL 5.6.7.

Currently it contains the following new files:

opt_explain.h	(77 lines)
opt_explain.cc	(2089 lines)
opt_explain_traditional.h	(50 lines)
opt_explain_traditional.cc	(238 lines)
opt_explain_format.h	(592 lines)
opt_explain_json.h	(50 lines)
opt_explain_json.cc	(1837 lines)

Even if we omit the files for JSON it's still a lot of code.

Take also into account that:

The above files use a lot of new classes that are not present in the current MariaDB 5.5/10.0 code
Changes in other files will be required

The main functionality for the EXPLAIN command can found in opt_explain.cc.
Here we find the following classes:

/* A base for all Explain_* classes */

class Explain

/* Explain_no_table class outputs a trivial EXPLAIN row

   with "extra" column */

class Explain_no_table: public Explain

/* Explain_union_result class outputs EXPLAIN row for UNION */

class Explain_union_result : public Explain

/* Common base class for Explain_join and Explain_table */

class Explain_table_base : public Explain

/* Explain_join class produces EXPLAIN output for JOINs */

class Explain_join : public Explain_table_base

/* Explain_table class produce EXPLAIN output for queries

   without top-level JOIN */

class Explain_table: public Explain_table_base

The above hierarchy looks quite unnatural for me (if not to say more): after all the explain info go to the records of the same format.

The implementations of the methods of the classes essentially copies the old code of from select_describe() making a lot of renaming. I don't understand why this re-writing was undertaken at all.

3.

We agree that the schema I suggested should be implemented in a lazy manner: first we should get a proof of concept for the DELETE command. It will take not more than 1 week (hopefully less)

Attachments

Issue Links

blocks

MDEV-407 Print EXPLAIN [ANALYZE] in the slow query log

Closed

causes

MDEV-10674 main.show_explain failed in buildbot

Closed

relates to

MDEV-5082 EXPLAIN INSERT .. VALUES works as INSERT (adds rows to the table)

Closed

MDEV-5094 Assertion `field_types == 0 || field_types[field_pos] == MYSQL_TYPE_LONGLONG' fails on EXPLAIN PARTITIONS UPDATE

Closed

MDEV-5122 "Commands out of sync", "Malformed packet" or client hang up on unique key violation

Closed

MDEV-5154 Valgrind warnings (blocks are possibly lost) on EXPLAIN INSERT .. SELECT with Aria tables

Closed

MDEV-4864 Merge tests for EXCHANGE PARTITION feature

Closed

MDEV-5093 Assertion `field_types == 0 || field_types[field_pos] == MYSQL_TYPE_DECIMAL || ...' <cut> fails on EXPLAIN EXTENDED UPDATE

Closed

(3 relates to)

Activity

Ascending order - Click to sort in descending order

Sergei Petrunia added a comment - 2012-11-06 14:23

Decided I'll try with UPDATE (or DELETE) the following:

Save a "mini-plan" of the UPDATE in a custom structure.
Let both EXPLAIN and SHOW EXPLAIN walk it.

think of whether/how we could produce JSON output from it. (the question is: why does Oracle's way of producing traditional/json require so much code (in LOC)?)

Sergei Petrunia added a comment - 2012-11-06 14:23 Decided I'll try with UPDATE (or DELETE) the following: Save a "mini-plan" of the UPDATE in a custom structure. Let both EXPLAIN and SHOW EXPLAIN walk it. think of whether/how we could produce JSON output from it. (the question is: why does Oracle's way of producing traditional/json require so much code (in LOC)?)

Sergei Petrunia added a comment - 2012-11-12 15:08

Started to code the above. No visible output yet.

Sergei Petrunia added a comment - 2012-11-12 15:08 Started to code the above. No visible output yet.

Sergei Petrunia added a comment - 2012-12-11 13:53

Committed the patch with working EXPLAIN DELETE. SHOW EXPLAIN for DELETE commands works, too. Need to discuss it.

Sergei Petrunia added a comment - 2012-12-11 13:53 Committed the patch with working EXPLAIN DELETE. SHOW EXPLAIN for DELETE commands works, too. Need to discuss it.

Sergei Petrunia added a comment - 2013-05-30 22:33

Current status: [SHOW] EXPLAIN (UPDATE|DELETE) works, the tree is at lp:~maria-captains/maria/10.0-base-explain-update/.

Sergei Petrunia added a comment - 2013-05-30 22:33 Current status: [SHOW] EXPLAIN (UPDATE|DELETE) works, the tree is at lp:~maria-captains/maria/10.0-base-explain-update/.

Sergei Petrunia added a comment - 2013-05-30 22:34

Had another discussion about the issue. Conclusions are as follows:

== Goals ==
The goals of the data structure are

it should allow to generate both tabular and JSON-like output.
it should survive JOIN::cleanup()

The structure should be cheap to generate. Most likely, the structure will be
generated at the end of optimization phase. One needs to note that "end of
optimization phase" may be tricky for SELECTs (which optimize the subqueries
lazily). Because of that, one must be prepared that the data structure is
constructed incrementally in whatever order the optimizer happens to produce
query plans for parts of the query.

== Solution ==

We will have our own tree of SELECTs/UNIONs (the other option was to introduce
a navigation API over existing SELECT_LEX trees).

The structure will be a tree of SELECTs/UNIONs.

A select is defined as
select {
select_id,
select_type
ordered_list < table_access > tables;
}

table_access {

All the fields that a row has in tabular output form
}

table_access fields will not be string representation. They will be pointers, or
"source data"

table_name will point to table's name
"key_length" will be a list of integers (index_merge has a list of
lengths)
"key" will be a key number
"Extra" will be a set of tags for "Using where", "Using index condition",
"Using index", etc, etc.

SJM nests will not have a separate "select" element. Instead, table_access is
permitted to have extra fields pointing (or storing indexes of) other
table_access elements.

Sergei Petrunia added a comment - 2013-05-30 22:34 Had another discussion about the issue. Conclusions are as follows: == Goals == The goals of the data structure are it should allow to generate both tabular and JSON-like output. it should survive JOIN::cleanup() The structure should be cheap to generate. Most likely, the structure will be generated at the end of optimization phase. One needs to note that "end of optimization phase" may be tricky for SELECTs (which optimize the subqueries lazily). Because of that, one must be prepared that the data structure is constructed incrementally in whatever order the optimizer happens to produce query plans for parts of the query. == Solution == We will have our own tree of SELECTs/UNIONs (the other option was to introduce a navigation API over existing SELECT_LEX trees). The structure will be a tree of SELECTs/UNIONs. A select is defined as select { select_id, select_type ordered_list < table_access > tables; } table_access { All the fields that a row has in tabular output form } table_access fields will not be string representation. They will be pointers, or "source data" table_name will point to table's name "key_length" will be a list of integers (index_merge has a list of lengths) "key" will be a key number "Extra" will be a set of tags for "Using where", "Using index condition", "Using index", etc, etc. SJM nests will not have a separate "select" element. Instead, table_access is permitted to have extra fields pointing (or storing indexes of) other table_access elements.

Sergei Petrunia added a comment - 2013-08-26 12:40

After merge with 10.0-base, got conflict with DELETE ... RETURNING code

Sergei Petrunia added a comment - 2013-08-26 12:40 After merge with 10.0-base, got conflict with DELETE ... RETURNING code

Sergei Petrunia added a comment - 2013-09-18 09:54

The second patch passes the tests. Submitted it for review.

Sergei Petrunia added a comment - 2013-09-18 09:54 The second patch passes the tests. Submitted it for review.

Sergei Petrunia added a comment - 2013-11-01 13:48

Pushed into 10.0-base and 10.0

Sergei Petrunia added a comment - 2013-11-01 13:48 Pushed into 10.0-base and 10.0

People

Assignee:: Sergei Petrunia

Reporter:: Igor Babaev (Inactive)

Votes:: 0 Vote for this issue

Watchers:: 6 Start watching this issue

Dates

Created:: 2012-10-05 20:20

Updated:: 2020-08-14 06:18

Resolved:: 2013-11-01 13:48

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server