[MDEV-37271] Optimizer Trace Replay: dump and load statistics part 2 - Jira

XML

Word

Printable

Details

Type: Task
Status: Open (View Workflow)
Priority: Major
Resolution: Unresolved
Fix Version/s: 13.0
Component/s: Optimizer
Labels:
None

Sprint:
Q4/2025 Server Development

Description

This task covers saving of optimization context not saved in MDEV-36511.

Optimizer context capture

Possible options for user interface:
Variant 1: dump to file:

  set optimizer_context_capture='/path/to/capture_file';

  explain select ... from view1 .... where ...;

Variant 2: dump to an INFORMATION_SCHEMA tables:

  set optimizer_context_capture=1;

  explain select ... from view1 .... where ...;

  select * from information_schema.optimizer_context_capture;

Variant 3: Write the context into a log table

  set optimizer_context_capture=1;

  explain select ... from view1 .... where ...;

  select * from mysql.context_capture_logs;

In all variants, everything needed to replay the execution is stored as one blob that can be used to replay the file.

Capture file data format

It seems, the best format for optimizer context file would be an SQL script. Let me list the contents of the file to
justify this point.

# Section 1: Set optimizer settings

set optimizer_variable1= ...;

set optimizer_variable2= ...;

...

# # Section 2: Histograms and other engine-independent statistics

insert into mysql.column_stats values

('db1', 'table1', 'column1', 'histogram-data'), ...;

# Section 3: create tables with captured statistics

# Put the JSON data describing table parameters into a user variable

set @table1_stats = '

  "records": 123,

  "scan_cost": 123.45,

  "indexes" : [

      "index_name": "IDX1",

      "rec_per_key": [123, 1230, 123000],

      "records_in_range" : [

          "min": "keypart1_value1, keypart2_value",

          "min_flag" : 1,

          "max": "keypart1_value",

          "max_flag" : 0,

          "records" : 12300

},

        ....

...

}';

# Now, create the table. Note the parameters:

#  ENGINE=mimic_replay

#  mimic_target_engine=innodb

#  mimic_parameters=@table1_stats

#  I am not sure if this is the most elegant way to pass

#  the needed data to the storage engine but it does what

#  is needed and doesn't require changes to the parser.

CREATE TABLE db1.table1 (

  <here goes table definition>

) ENGINE=mimic_replay mimic_target_engine=innodb mimic_parameters=@table1_stats;

...

# Section 3.1: needed table data

Data for constant tables and tables using MIN/MAX rewrite

This can be just a list of rows that the optimizer has read:

INSERT INTO db1.table2 VALUES

  ('row1', 'row1-data'),

  ...;

If the optimizer has tried to read a row but didn't find it, it is not listed.
TODO: Is this really fine or we should also track the row reads
that have returned "row not found", too?

# Section 4: Create VIEWs if they are needed

CREATE VIEW view1 AS SELECT ...;

# Section 5: Text of the query for which the optimizer

#   context was recorded:

EXPLAIN

SELECT ... FROM view1 ... WHERE ....;

Replaying the capture

Just source the above capture file into mariadb client.
The last command should produce the same query plan that was observed when capturing.

Attachments

Issue Links

is part of

MDEV-27189 Optimizer Trace: save all info needed for replay

Open

relates to

MDEV-36511 Optimizer Trace replay step #2: dump basic statistics

Stalled

Activity

People

Assignee:: Srikanth Bondalapati

Reporter:: Sergei Petrunia

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 2025-07-18 06:23

Updated:: 2025-12-16 18:23

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.