[MDEV-36332] Disable rowid filter + ICP for scans in either direction - Jira

Details

Type: New Feature
Status: Open (View Workflow)
Priority: Major
Resolution: Unresolved
Fix Version/s: 12.1
Component/s: Optimizer
Labels:
None

Description

(All file and line references below are with respect to git sha 38e420ba6e84ef6efc9c54b04ebd4dbc95dc3cae which was 11.4 at the time of writing.)

Range filtering is cost-based. The decision to filter is made exclusively on the cost by using selectivity and it's made during optimization, to choose the best join ordering. However, at this point in optimization, ICP is not taken into account. First, we choose some join order and then later we figure out if an index condition can be pushed. If yes, then we push it, but that is downstream of join ordering. At the same time, the selectivity of ICP can be large, so using the rowid range filters together with a large ICP selectivity might cause a slowdown: we spend time building the filter but we may not gain anything or just a little. The original approach was that if ICP is used, and although we decide we use a filter, we cancel this usage. So filtering can never be used together with ICP to access the table. Since the original implementation, other changes over time have now permitted this usage, which is not desirable. At the same time, we are not certain what is more optimal at this point: to cancel the ICP or to cancel the rowid filter.

Here is an example scenario in which rowid filtering and ICP both are used:

create table t1 (a int, b int);
insert into t1 (a, b) values
(1, 1000),
(2, 2000),
(3, 3000),
(4, 4000),
(5, 5000),
(6, 6000),
(7, 7000),
(8, 8000),
(9, 9000),
(10, 10000);
create index t1_a on t1 (a);
create index t1_b on t1 (b);
set session optimizer_where_cost=10;
analyze format=json select * from t1 where t1.a between 3 and 8 and t1.b between 4000 and 4100 order by t1.b\G

If we place a breakpoint at handler.cc:7144, just before we check the pushed rowid filter, then we can see the following abriged stack trace (we can also see that h->pushed_idx_cond and h->pushed_rowid_filter both are set at this point):

(lldb) bt

* thread #8, stop reason = breakpoint 1.1

  * frame #0: 0x0000000102ab3404 mariadbd`handler_rowid_filter_check(h_arg=0x0000000156139ca8) at handler.cc:7146:3

    frame #1: 0x000000010382bcc4 mariadbd`row_search_idx_cond_check(mysql_rec="\xfb", prebuilt=0x000000015613d498, rec="\x80", offsets=0x000000016d9da3b0) at row0sel.cc:4104:34

    frame #2: 0x00000001038296ac mariadbd`row_search_mvcc(buf="\xfb", mode=PAGE_CUR_GE, prebuilt=0x000000015613d498, match_mode=0, direction=0) at row0sel.cc:5493:10

    frame #3: 0x000000010360aa5c mariadbd`ha_innobase::index_read(this=0x0000000156139ca8, buf="\xfb", key_ptr="", key_len=5, find_flag=HA_READ_KEY_OR_NEXT) at ha_innodb.cc:9121:3

    frame #4: 0x000000010338dec4 mariadbd`handler::index_read_map(this=0x0000000156139ca8, buf="\xfb", key="", keypart_map=1, find_flag=HA_READ_KEY_OR_NEXT) at handler.h:4190:12

    frame #5: 0x0000000102aa6230 mariadbd`handler::ha_index_read_map(this=0x0000000156139ca8, buf="\xfb", key="", keypart_map=1, find_flag=HA_READ_KEY_OR_NEXT) at handler.cc:3792:3

    frame #6: 0x0000000102ab2c9c mariadbd`handler::read_range_first(this=0x0000000156139ca8, start_key=0x0000000156139df8, end_key=0x0000000156139e18, eq_range_arg=false, sorted=true) at handler.cc:6949:13

    frame #7: 0x0000000103140a08 mariadbd`handler::multi_range_read_next(this=0x0000000156139ca8, range_info=0x000000016d9dae48) at multi_range_read.cc:590:15

    frame #8: 0x0000000103140ba8 mariadbd`Mrr_simple_index_reader::get_next(this=0x000000015613a340, range_info=0x000000016d9dae48) at multi_range_read.cc:627:32

    frame #9: 0x0000000103143f30 mariadbd`DsMrr_impl::dsmrr_next(this=0x000000015613a1f0, range_info=0x000000016d9dae48) at multi_range_read.cc:1731:26

    frame #10: 0x00000001036242cc mariadbd`ha_innobase::multi_range_read_next(this=0x0000000156139ca8, range_info=0x000000016d9dae48) at ha_innodb.cc:20301:18

    frame #11: 0x0000000102d0f4ec mariadbd`QUICK_RANGE_SELECT::get_next(this=0x0000000155e14760) at opt_range.cc:13276:5

    frame #12: 0x0000000102d3cc64 mariadbd`rr_quick(info=0x000000015bcc4b10) at records.cc:402:34

    frame #13: 0x0000000102d0e4c0 mariadbd`READ_RECORD::read_record(this=0x000000015bcc4b10) at records.h:81:30

    frame #14: 0x0000000102f42398 mariadbd`join_init_read_record(tab=0x000000015bcc4a40) at sql_select.cc:25221:27

    frame #15: 0x0000000102f1e2a0 mariadbd`sub_select(join=0x000000015bcc2278, join_tab=0x000000015bcc4a40, end_of_records=false) at sql_select.cc:24153:12

    frame #16: 0x0000000102f45e28 mariadbd`do_select(join=0x000000015bcc2278, procedure=0x0000000000000000) at sql_select.cc:23667:14

And the corresponding analyze output to confirm:

MariaDB [test]> analyze format=json select * from t1 where t1.a between 3 and 8 and t1.b between 4000 and 4100 order by t1.b\G

*************************** 1. row ***************************

ANALYZE: {

  "query_optimization": {

    "r_total_time_ms": 0.471028634

},

  "query_block": {

    "select_id": 1,

    "cost": 0.012382207,

    "r_loops": 1,

    "r_total_time_ms": 0.194622362,

    "nested_loop": [

        "table": {

          "table_name": "t1",

          "access_type": "range",

          "possible_keys": ["t1_a", "t1_b"],

          "key": "t1_b",

          "key_length": "5",

          "used_key_parts": ["b"],

          "rowid_filter": {

            "range": {

              "key": "t1_a",

              "used_key_parts": ["a"]

},

            "rows": 6,

            "selectivity_pct": 60,

            "r_rows": 6,

            "r_lookups": 1,

            "r_selectivity_pct": 100,

            "r_buffer_size": 4,

            "r_filling_time_ms": 0.082075697

},

          "loops": 1,

          "r_loops": 1,

          "rows": 1,

          "r_rows": 1,

          "cost": 0.012005717,

          "r_table_time_ms": 0.115547909,

          "r_other_time_ms": 0.028011614,

          "r_engine_stats": {

            "pages_accessed": 3

},

          "filtered": 60,

          "r_filtered": 100,

          "index_condition": "t1.b between 4000 and 4100",

          "attached_condition": "t1.a between 3 and 8"

1 row in set (0.001 sec)

Objection

Consider a basic example with join (https://gist.github.com/spetrunia/a2f28cd0573afba20179c2a828ac8d49):

explain

select * from t1, t2

where

  t1.b+1<10 and

  t2.key2=t1.a and

  t2.key1 between 100 and 200;

*************************** 1. row ***************************

           id: 1

  select_type: SIMPLE

        table: t1

         type: ALL

possible_keys: NULL

          key: NULL

      key_len: NULL

          ref: NULL

         rows: 100

        Extra: Using where

*************************** 2. row ***************************

           id: 1

  select_type: SIMPLE

        table: t2

         type: ref|filter

possible_keys: key1,key2

          key: key2|key1

      key_len: 5|5

          ref: j1.t1.a

         rows: 19 (1%)

        Extra: Using where; Using rowid filter

Here, we build the rowid filter once per join execution, and then it is used during all index lookups to table t2. We expect 100 lookups, each scanning 19 rows).

Now, table t1 has "Using where", the condition t1.b+1<10. We don't know its selectivity. If this condition is selective, it will reduce the number of index lookups made to table t2, to the point where using Rowid Filter for table t2 will no longer bring benefit.

This MDEV requests that Rowid Filter is not used with Pushed Index condition. Should Rowid Filter also be not used when it is used with ref access and the join plan has "Using where" conditions which may reduce the number of times the Rowid Filter is used?

Attachments

Activity

Dave Gosselin added a comment - 3 days ago

Relevant follow-up question from psergei: "I'm wondering if the logic that ICP should not work with rowid filter can be also extended that joins with "using where" should not work with rowid filter?"

Dave Gosselin added a comment - 3 days ago Relevant follow-up question from psergei : "I'm wondering if the logic that ICP should not work with rowid filter can be also extended that joins with "using where" should not work with rowid filter?"

People

Assignee:: Unassigned

Reporter:: Dave Gosselin

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 4 days ago 14:17

Updated:: 3 days ago 14:40

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server