Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-38805

Optimizer Trace Replay: Q1 2026 Dev Sprint Work 2

    XMLWordPrintable

Details

    • Task
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 13.0
    • Optimizer
    • None

    Description

      Store optimizer context separately

      Currently, optimizer context is written as JSON sub-element in the Optimizer Trace.

      This task is about storing it separately.

      Possible choices

      A. Store the captured context in the Optimizer Trace like it's done now.
      B. Store the captured context in a separate JSON document.

      Good: These make it easy to use JSON functions to extract interesting parts of the capture.
      Bad:

      • you need to take special steps to replay the trace.
      • the table DDLs may get quoted and not be easily readable.

      C. Store the capture as SQL file
      Good: Replaying the CREATE TABLE and INSERT (i.e. REPLACE INTO) statements is easy.
      BAD: It's hard to automatically extract something from the trace or modify it.

      The decision

      The captured context should be stored in INFORMATION_SCHEMA table.
      Similar to how INFORMATION_SCHEMA.OPTIMIZER_TRACE is stored.

      Tentative table name: INFORMATION_SCHEMA.OPTIMIZER_CONTEXT .

      Columns:
      INFORMATION_SCHEMA.OPTIMIZER_TRACE has these:

      ST_FIELD_INFO optimizer_trace_info[]=
      {
        Column("QUERY",                             Longtext(65535), NOT_NULL),
        Column("TRACE",                             Longtext(65535), NOT_NULL),
        Column("MISSING_BYTES_BEYOND_MAX_MEM_SIZE", SLong(20),       NOT_NULL),
        Column("INSUFFICIENT_PRIVILEGES",           STiny(1),        NOT_NULL),
        CEnd()
      };
      

      Let OPTIMIZER_CONTEXT have:

        Column("QUERY",                             Longtext(65535), NOT_NULL),
        Column("CONTEXT",                       Longtext(65535), NOT_NULL),
      

      The context field should have SQL statements.

      The OPTIMIZER_CONTEXT.CONTEXT column will have the SQL script with all the context:

      CREATE TABLE t1 ( ... );
      -- in case it was a constant table
      REPLACE INTO t1 VALUES (...);
       
      CREATE TABLE t2 ( ... );
      ...
       
      -- Put the 
      set @context='{ JSON with all the captured calls }';
      set @optimizer_context='context';
       
      EXPLAIN SELECT ... the original query; 
      

      User interface

      The user experience will be like this:

      set optimizer_record_context=ON;
      explain select ... ;
      select context into dumpfile '/tmp/capture.sql' 
      from information_schema.optimizer_context;
      

      And then /tmp/capture.sql can be analyzed and sent to the MariaDB support team.

      (Q: does recording context need to enable recording the trace also?)
      ( For now, can rely on trace being enabled).

      Notes

      The above uses Longtext(65535) for OPTIMIZER_CONTEXT.CONTEXT. This was copied from OPTIMIZER_TRACE.TRACE. This may look like a limit of 64K for the size but apparently we could produce optimizer trace that was larger than that.

      Currently we use Json_writer objects to write into optimizer trace. Now, we'll need to write to a string that will be dumped as

      set @context='{ JSON with all the captured calls }';
      

      Histogram code (in sql/opt_histogram_json.*) uses Json_writer object to write JSON histograms to a string. It uses calls like this:

          writer.start_object();
          writer.add_member("target_histogram_size").add_ull(hist_width);
          writer.add_member("collected_at").add_str(str.ptr());
      

      So these can already write to a string.
      I am not sure if RAII objects support writing somewhere else than optimzier trace, but I suppose this could be easily added .

      Assume that we can keep all the context we're saving in memory as one big string. Like it's done for Optimizer Trace.

      • Note that for Range-checked-for-each record plans, this may not be the case. This is because Range-checked-for-each-record will make O(examined_records) calls to multi_range_read_info_const(). We will need to solve this eventually by stopping capture of calls once the query starts executing.

      Attachments

        Issue Links

          Activity

            People

              bsrikanth Srikanth Bondalapati
              psergei Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.