[MDEV-18478] ANALYZE for statement should show selectivity of pushed index condition Created: 2019-02-05  Updated: 2024-02-08

Status: In Testing
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.2, 10.3, 10.4
Fix Version/s: 11.4

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Lena Startseva
Resolution: Unresolved Votes: 0
Labels: optimizer-easy, optimizer-feature

Issue Links:
Relates
relates to MDEV-33054 ANALYZE FORMAT=JSON: rowid_filter.r_... In Progress

 Description   

(filing his is a followup for the optimizer call on January, 29th)

The EXPLAIN output has:

  • rows - expected # of total rows enumerated
  • filtered - expected condition selectivity

Additional members from ANALYZE:

  • r_rows - shows how many rows were returned to the SQL layer (that is, after pushed index condition was applied (the value will be lower))
  • r_filtered - shows the selectivity of the condition that is applied at the SQL layer (that is, the selectivity of pushed index condition is not counted, the value will be higher)

Considerations for fixing this:

  • ANALYZE FORMAT=JSON could show #rows enumerated, and both selectivities
  • For tabular ANALYZE, we are limited to two numbers.
  • rows vs r_rows and filtered vs r_filtered should be apples-to-apples comparisons.

Implementation

Current handler statistics/counters

There are multiple ways different statistics about actions inside handler make it to ANALYZE output:

Option 1: ha_handler_stats.h:class ha_handler_stats

This was introduced recently to produce r_engine_stats.
It seems, there is no issue if this structure had another member which handler_index_cond_check would increment...

Option 2: handler->tracker approach

This approach is used by a lot of ANALYZE FORMAT=JSON counters.

Read access time is counted using this member of class handler:

  /* ANALYZE time tracker, if present */
  Exec_time_tracker *tracker;

It points to a member of class Explain_table_access:

  Table_access_tracker tracker;

This works, but do we want to add another member like 'handler::icp_tracker' ?

Option 3: Rowid Filter's approach

Storage engine code calls handler_rowid_filter_check() which calls Range_rowid_filter::check() through

h->pushed_rowid_filter->check(...) 

call. Range_rowid_filter::check increments Rowid_filter::tracker.

The key part is that class handler has

  Rowid_filter *pushed_rowid_filter;

which is the context and has the counter.

For ICP, we have instead:

  Item *pushed_idx_cond;
  uint pushed_idx_cond_keyno;  /* The index which the above condition is for */

messing with cleaning/restoring pushed_idx_cond might be hard.

Which approach to use

*Option1* seems the best choice...

Output

TODO.
Considerations:

One should be able to compare rows and r_rows and it should be apples-to-apples comparison.
rows is number-of-rows-to-enumerate: before rowid filtering, before ICP.
so, r_rows should be the same.

There is also filtered which is... not as clearly defined? Should it be:

  • %of rows left after any kind of filtering?
  • %of rows left after checking the attached_condition?
    (TODO: check what happens when we have an extra pushed-index-condition which selectivity is known to the optimizer).
    (TODO: check what happens when using a rowid filter. Is its selectivity included in filtered% or not?)

In any case, we need something like r_icp_filtered.
The question is, should r_filtered include r_icp_filtered / 100% as a multiplier or not.



 Comments   
Comment by Sergei Petrunia [ 2023-10-22 ]

Notes:

There is another similar optimization, Rowid Filtering (https://mariadb.com/kb/en/rowid-filtering-optimization/) . It also works by pushing a check into the storage engine.

It does print its selectivity into ANALYZE output, example:
https://github.com/MariaDB/server/blob/11.3/mysql-test/main/rowid_filter.result#L145

It's possible that both Rowid Filtering and Index Condition Pushdown are used at the same time.
Need to make sure the output numbers have the "right" values in all cases.

Comment by Sergei Petrunia [ 2023-11-30 ]

Take-aways from Slack discussion: proceeding to

Milestone#1:

  • Add two counters, "number of rows before ICP check" and "number of rows after ICP check" into ha_handler_stats.
  • Make handler_index_cond_check() increment them.
  • Make sql_explain.cc: Expl*::print_explain_json() functions print r_icp_filtered percentage, next to r_filtered.
Comment by Jason (Inactive) [ 2023-12-05 ]

For Milestone 1, proposed patch in https://github.com/MariaDB/server/tree/bb-11.4-MDEV-18478.

This is an example of the new output added to ANALYZE FORMAT=JSON (which can be seen in context in the test cases in the link above):

...
          "r_index_condition": {
            "r_rows_idx": 3,
            "r_icp_filtered": 66.66666667
          },
...

The meanings are:

  • r_rows_idx: number of rows for which the index condition is tried
  • r_icp_filtered: percentage of r_rows_idx that survive the pushed index condition

I'm open to suggestions for better names for the counters.

PR coming shortly for psergei.
fyi, adding these runtime counters affected several testcase .result files. When/if we want to change the meaning of the optimizer estimate counters, it will affect many more tests.

Comment by Sergei Petrunia [ 2023-12-07 ]

Some review notes:

  • The output form needs to be changed but we need to make up our mind what it should be.
  • The patch doesn't have code to print ICP stats for single-table UPDATE/DELETE ... but UPDATE/DELETE don't use ICP?
  • Question about r_rows_idx : it's an "overall total", doesn't divide by r_loops (like r_rows or rows do)
Comment by Sergei Petrunia [ 2023-12-07 ]

How does ANALYZE output look for Rowid Filters?

    "table": {
      "table_name": "lineitem",
      "access_type": "range",
      ...
      "rowid_filter": {
        ...
        "r_lookups": 349,
        "r_selectivity_pct": 9.742120344,
      },

r_lookups is how many lookups were made (think "rows before the filter")
r_selectivity_pct is the percentage of rows left after the filter.

Comment by Sergei Petrunia [ 2023-12-07 ]

an example from rowid_filter.result:

        "table": {
          "table_name": "lineitem",
          "access_type": "range",
          "rowid_filter": {
            ...
            "r_lookups": 349,
            "r_selectivity_pct": 9.742120344,
          },
          "r_rows": 34,
 
          "r_index_condition": {
            "r_rows_idx": 349,
            "r_icp_filtered": 100
          },
 
          "filtered": 8.476269722,
          "r_filtered": 100,
        }

This shows that

  • we start from 349 index records
  • make index condition check, r_icp_filtered=100
  • apply rowid filter (r_lookups=349, r_selectivity_pct=9.74). 349*0.0974=34
  • This gives us r_rows=34 ...
Comment by Jason (Inactive) [ 2023-12-07 ]

For:

The patch doesn't have code to print ICP stats for single-table UPDATE/DELETE ... but UPDATE/DELETE don't use ICP?

It seems that pushed conditions are supported in sql_update.cc and sql_delete.cc under some code related to the spider storage engine.

  if ((table->file->ha_table_flags() & HA_CAN_DIRECT_UPDATE_AND_DELETE) &&
      !has_triggers && !binlog_is_row && !returning &&
      !table_list->has_period())
  {
    table->mark_columns_needed_for_delete();
    if (!table->check_virtual_columns_marked_for_read())
    {
      DBUG_PRINT("info", ("Trying direct delete"));
      bool use_direct_delete= !select || !select->cond;
      if (!use_direct_delete &&
          (select->cond->used_tables() & ~RAND_TABLE_BIT) == table->map)
      {
        DBUG_ASSERT(!table->file->pushed_cond);
        if (!table->file->cond_push(select->cond))
        {
          use_direct_delete= TRUE;
          table->file->pushed_cond= select->cond;
        }
      }

It seems to be not generally possible. I tested it by adding DBUG_ASSERT(1) and running mtr, which did not hit at all.

We can choose to:
1. support pushed conditions for update/delete in ANALYZE, but we will likely not be able to add a unit test for it.
2. don't support it in ANALYZE, but if the case happens for a user of spider, then the pushed condition will not be explained

Comment by Jason (Inactive) [ 2023-12-07 ]

Question about r_rows_idx : it's an "overall total", doesn't divide by r_loops (like r_rows or rows do)

This is a good question. It appears to me that the rowid_filter r_lookups is also a overall total and is also not divided by r_loops.
It's easy enough to divide r_rows_idx by r_loops to make it consistent with r_rows, but it would be inconsistent with r_lookups. Which way provides more useful information?

Comment by Jason (Inactive) [ 2023-12-08 ]

For

How does ANALYZE output look for Rowid Filters?

and

an example from rowid_filter.result:

I'm not sure if you are suggesting a some kind of change? Or are you just confirming that the output provides a useful analysis for how many rows are filtered at each step? If it is the latter, then I agree with the analysis, and there is a similar example with ICP and rowid filter in mysql-test/main/analyze_format_json.test. The comments in the added test indicate that there is a case with both ICP and rowid filter, and indicates how much filtering happened for each step.

Comment by Sergei Petrunia [ 2023-12-08 ]

It seems that pushed conditions are supported in sql_update.cc and sql_delete.cc under some code related to the spider storage engine.

This is "Table condition pushdown" which is a different feature from Index Condition Pushdown. It's outside of the scope of this MDEV.

Question about r_rows_idx : it's an "overall total", doesn't divide by r_loops (like r_rows or rows do)
This is a good question. It appears to me that the rowid_filter r_lookups is also a overall total and is also not divided by r_loops.

I think both should be fixed to be comparable with r_rows, then. Let me continue to look at this to make sure the opinion doesn't change...

...
I'm not sure if you are suggesting a some kind of change? Or are you just confirming that the output provides a useful analysis for how many rows are filtered at each step?

Sorry hit submit before finishing the comment. I was just checking that the output numbers make sense.

Comment by Sergei Petrunia [ 2023-12-08 ]

Note: the patch can do division-by-zero if there were not ICP checks.
We see 0 in the output, because Json_writer::add_double(double val) uses my_snprintf("%lg") which prints -NAN as 0....

IIRC, division by zero is not harmless, it used to caused errors on some platforms in BB.

Comment by Sergei Petrunia [ 2023-12-08 ]

Tried to do a sketch of desired output:
https://gist.github.com/spetrunia/c6bea04bb76dc7e8c4dba953e8e637db

Comment by Jason (Inactive) [ 2023-12-08 ]

I've updated the patch to address divide-by-zero and dividing r_rows_idx and r_lookups by r_loops.

I didn't do it in this patch, but you may want to consider making r_rows_idx and r_lookups (and r_rows, too?) of type double instead of an integer type, because the result of the division cast to an integer truncates the digits after the decimal point, which for small numbers can look misleading (especially when the result becomes 0).

Comment by Sergei Petrunia [ 2023-12-11 ]

jasoncu, thanks for the updated patch.. I'll continue from here...

Comment by Sergei Petrunia [ 2023-12-11 ]

Difference of the above sketch from the current code:

  • r_rows shows "number of rows before any kind of filtering was applied".
  • r_icp_filtered is added.
  • rowid_filter.r_lookups is not a "total" it is "per each access method invoication"
  • r_cond_filtered is added. This is observed selectivity of the "attached_condition".
  • r_filtered is now "combined selectivity".

Plus, members are re-ordered so members describing earlier phases come next.
(This makes no difference as far as JSON data model is concerned - JSON object members
are not ordered, but it's nicer to read).

Comment by Sergei Petrunia [ 2023-12-18 ]

Implemented in three patches: bb-11.4-MDEV-18478-v2:

EDIT:

commit ea03218f359bd18f142edfbcc4f94efd2f0c096f (HEAD -> bb-11.4-MDEV-18478-v2, origin/bb-11.4-MDEV-18478-v2, origin/HEAD)

commit d0160974ae1668f8a08e34cd9c29c86aff1d573a (HEAD -> bb-11.4-MDEV-18478-v2, origin/bb-11.4-MDEV-18478-v2, origin/HEAD)
Author: Sergei Petrunia <sergey@mariadb.com>
Date:   Mon Dec 18 16:01:02 2023 +0300
 
    MDEV-33054: ANALYZE FORMAT=JSON: rowid_filter.r_lookups should be average, not total
    
    Make Explain_rowid_filter::print_explain_json() print "r_lookups" as
    per-loop average, not total.
    
    Per-loop average is comparable with parent JSON node's (that is, table's)
    "rows" and "r_rows", which are per-loop averages.

commit a3a334772275820510d28948213a19442e54e15e
Author: Sergei Petrunia <sergey@mariadb.com>
Date:   Mon Dec 18 13:56:14 2023 +0300
 
    MDEV-18478 ANALYZE for statement should show selectivity of pushed index condition
    
    Part#2: Make the printed r_ values in JSON output consistent. After this
    patch, ANALYZE output has:
    
    - r_index_rows (NEW) - Observed number of rows before ICP or Rowid Filtering
      checks. This is a per-scan average. like r_rows and "rows" are.
    
    - r_rows (AS BEFORE) - Observed number of rows after ICP and Rowid Filtering.
    
    - r_icp_filtered (NEW in MDEV-18478) - Observed selectivity of ICP condition.
    - (observed selectivity of Rowid Filter is in $.rowid_filter.r_selectivity_pct)
    
    - r_condition_filtered (NEW) - Observed selectivity of "attached_condition".
      This number was previously printed in r_filtered.
    
    - r_filtered - Observed combined selectivity: fraction of rows left after applying
      ICP condition, Rowid Filter, and attached_condition. This is now comparable
      with "filtered".

commit 16726ea7b1311a1ee551a498d68f3e4e8adee54d
Author: Sergei Petrunia <sergey@mariadb.com>
Date:   Mon Dec 11 16:06:59 2023 +0300
 
    MDEV-18478 ANALYZE for statement should show selectivity of pushed index condition
    
    (Based on the original patch by Jason Cu)
    
    Part #1:
    - Add ha_handler_stats::{icp_attempts,icp_match}, make
      handler_index_cond_check() increment them.
    - ANALYZE FORMAT=JSON now prints r_icp_filtered based on these counters.

Comment by Oleg Smirnov [ 2023-12-26 ]

1. Comments on the 1st commit are left at GitHub.

2. Comments on the 2nd commit are also left at GitHub.

In my opinion, attribute names are not quite descriptive, one can't read the ANALYZE output without documentation. I'd suggest renaming to more verbose names, like these:
r_index_rows -> r_index_rows_before filtering
r_condition_filtered -> r_filtered_after_attached_condition
r_filtered -> r_selectivity_after_filters_and_conds or r_filtered_final
r_icp_filtered -> r_filtered_after_icp

Also I'd explicitly indicate that values are expressed in percents either by adding "_pct" suffix (like it's done for "$.rowid_filter.r_selectivity_pct") or the "%" sign after values (which is more natural and short)

3. Comments on the 3rd commit are here.

Comment by Sergei Petrunia [ 2023-12-28 ]

oleg.smirnov, we can't easily rename r_filtered as it has been that way for many years already. I don't think it's realistic to expect ANALYZE output to be comprehensible without documentation, there is just too much detail. (And if one doesn't read documentation, how do they know what is ICP or Rowid Filter anyway?)

On the question of typecasts: some (most?) of them are actually necessary:

https://buildbot.mariadb.org/#/builders/234/builds/25143/steps/8/logs/stdio

    93>D:\Buildbot\amd64-windows\build\sql\sql_explain.cc(2090,17): error C2220: the following warning is treated as an error [D:\Buildbot\amd64-windows\build\sql\sql.vcxproj]
D:\Buildbot\amd64-windows\build\sql\sql_explain.cc(2090,17): error C2220:     double loops= tracker.get_loops(); [D:\Buildbot\amd64-windows\build\sql\sql.vcxproj]
D:\Buildbot\amd64-windows\build\sql\sql_explain.cc(2090,17): error C2220:                 ^ [D:\Buildbot\amd64-windows\build\sql\sql.vcxproj]
    93>D:\Buildbot\amd64-windows\build\sql\sql_explain.cc(2090,17): warning C4244: 'initializing': conversion from 'ha_rows' to 'double', possible loss of data [D:\Buildbot\amd64-windows\build\sql\sql.vcxproj]

Comment by Sergei Petrunia [ 2023-12-28 ]

oleg.smirnov, I've addressed all input except the request for renames: https://github.com/MariaDB/server/pull/2970

Comment by Oleg Smirnov [ 2023-12-30 ]

Added some more comments to the PR.

Comment by Sergei Petrunia [ 2024-01-04 ]

(Got approval on the pull request).

Comment by Sergei Petrunia [ 2024-01-05 ]

(lstartseva, we need testing for this, but let me provide at least documentation first)

Comment by Sergei Petrunia [ 2024-01-10 ]

Documentation:
https://mariadb.com/kb/en/analyze-interpreting-execution-stats-for-index-based-access-methods/

Generated at Thu Feb 08 08:44:23 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.