Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-18144

ANALYZE for statement support for PK filters

    XMLWordPrintable

    Details

      Description

      This task is about adding ANALYZE [FORMAT=JSON] support for PK filtering
      (MDEV-16188).

      User interface

      Examples of PK filtering for EXPLAIN and EXPLAIN FORMAT=JSON:

      id     1
      select_type    SIMPLE
      table  lineitem
      type   range|filter
      possible_keys  PRIMARY,i_l_shipdate,i_l_receiptdate,i_l_orderkey,i_l_orderkey_quantity
      key    i_l_receiptdate|i_l_shipdate
      key_len        4|4
      ref    NULL
      rows   6 (0%)
      Extra  Using index condition; Using where; Using filter
      

      The filter shows rows="6 (0%)"

      in ANALYZE, there will be "r_rows" column.
      It should match the rows column, that is, display "nnn (mmm%)" where mmm% - is
      the observed selectivity of the filter. If the filter was discared, we should
      indicate that in some way (exactly what to print is to be determined).

      With FORMAT=JSON we get more date:

      +      "rowid_filter": {
      +        "range": {
      +          "key": "i_l_shipdate",
      +          "used_key_parts": ["l_shipDATE"]
      +        },
      +        "rows": 8,
      +        "selectivity_pct": 0.1332
      +      },
      

      ANALYZE FORMAT=JSON output should include:

      • r_rows for the quick select that was used to populate the filter.
      • if the filter was discarded (because it was too big), this should be indicated.
      • The total time to populate (i.e. build) the filter should be included as well.
      • selectivity_pct should get "r_selectivity_pct" which will show the observed filter selectivity.
      • (I am not sure if we should include the time it took to make checks in the filter).

      Implementation

      class Explain_rowid_filter should get a pointer to a "Rowid filter tracker
      object" (which will exist for all queries, both ANALYZE or not). (just like
      e.g. Explain_update has a pointer to Filesort_tracker object).

      Filesort_tracker is a good example of a complex tracker object.
      There should be a Rowid_filter_tracker object with information about the query
      execution. It should have:

      • a member to track the time to populate the filter.
      • a member to track the time/#rows for table access while populating the
        filter.
      • A member which indicates whether filter construction finished or was aborted.
      • (Optionally) A member which indicates how big the buffer size was.
      • Two counters: one counts how many times the filter was invoked, the other
        counts how many times the filter has filter out the record. Dividing one by
        another produces r_selectivity_pct.

      Tracking the time to build the filter

      See Explain_select::time_tracker and its use in JOIN::exec.

      Tracking table access and r_rows for building the filter.

      At the moment, storage engine objects (like ha_innobase) have "tracker" objects
      inside them (see handler::tracker).

      The question is: does MDEV-16188 use the same handler object to build the
      filter (phase #1) and to perform the "primary" table access by the main query
      (phase #2) ?

      If yes: we will need to switch handler::tracker objects after building the
      filter has been finished (as we want time/#rows for phase #1 and phase #2 to be
      accounted separately)

      If no: then, we'll need to set the secondary handler object's h->tracker to
      point to our tracker.

      Printing

      void Explain_rowid_filter::print_explain_json(Explain_query *query, 
                                                    Json_writer *writer,
                                                    bool is_analyze)
      

      has "is_analyze" argument. This controls whether analyze data should be
      printed.

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              shagalla Galina Shalygina (Inactive)
              Reporter:
              psergei Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.