Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-37271

Optimizer Trace Replay: dump statistics part 2

    XMLWordPrintable

Details

    • Task
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 12.3
    • Optimizer
    • None
    • Q4/2025 Server Development

    Description

      This task covers saving of optimization context not saved in MDEV-36511.

      Optimizer context capture

      Possible options for user interface:
      Variant 1: dump to file:

        set optimizer_context_capture='/path/to/capture_file';
        explain select ... from view1 .... where ...; 
      

      Variant 2: dump to an INFORMATION_SCHEMA tables:

        set optimizer_context_capture=1;
        explain select ... from view1 .... where ...; 
        select * from information_schema.optimizer_context_capture;
      

      Variant 3: Write the context into a log table

        set optimizer_context_capture=1;
        explain select ... from view1 .... where ...; 
        select * from mysql.context_capture_logs;
      

      In all variants, everything needed to replay the execution is stored as one blob that can be used to replay the file.

      Capture file data format

      It seems, the best format for optimizer context file would be an SQL script. Let me list the contents of the file to
      justify this point.

      # Section 1: Set optimizer settings
      set optimizer_variable1= ...;
      set optimizer_variable2= ...;
      ...
      

      # # Section 2: Histograms and other engine-independent statistics
      insert into mysql.column_stats values
      ('db1', 'table1', 'column1', 'histogram-data'), ...;
      

      # Section 3: create tables with captured statistics 
      # Put the JSON data describing table parameters into a user variable
      set @table1_stats = '
      {
        "records": 123,
        "scan_cost": 123.45,
        "indexes" : [
          {
            "index_name": "IDX1",
            "rec_per_key": [123, 1230, 123000],
            "records_in_range" : [
              {
                "min": "keypart1_value1, keypart2_value",
                "min_flag" : 1,
                "max": "keypart1_value",
                "max_flag" : 0,
                "records" : 12300
              },
              ....
            ]
          }
          ...
        ]
      }';
      

      # Now, create the table. Note the parameters:
      #  ENGINE=mimic_replay 
      #  mimic_target_engine=innodb
      #  mimic_parameters=@table1_stats
      # 
      #  I am not sure if this is the most elegant way to pass 
      #  the needed data to the storage engine but it does what 
      #  is needed and doesn't require changes to the parser.
      # 
      

      CREATE TABLE db1.table1 (
        <here goes table definition>
      ) ENGINE=mimic_replay mimic_target_engine=innodb mimic_parameters=@table1_stats;
      ...
      

      #
      # Section 3.1: needed table data 
      # 
      Data for constant tables and tables using MIN/MAX rewrite
      This can be just a list of rows that the optimizer has read:
      INSERT INTO db1.table2 VALUES
        ('row1', 'row1-data'),
        ...;
      

      If the optimizer has tried to read a row but didn't find it, it is not listed.
      TODO: Is this really fine or we should also track the row reads
      that have returned "row not found", too?

      # Section 4: Create VIEWs if they are needed
      CREATE VIEW view1 AS SELECT ...;
      

      # Section 5: Text of the query for which the optimizer
      #   context was recorded:
      EXPLAIN
      SELECT ... FROM view1 ... WHERE ....;
      

      Replaying the capture

      Just source the above capture file into mariadb client.
      The last command should produce the same query plan that was observed when capturing.

      Attachments

        Issue Links

          Activity

            People

              bsrikanth Srikanth Bondalapati
              psergei Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.