Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-36332

Disable rowid filter + ICP for scans in either direction

Details

    • New Feature
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 12.1
    • Optimizer
    • None

    Description

      (All file and line references below are with respect to git sha 38e420ba6e84ef6efc9c54b04ebd4dbc95dc3cae which was 11.4 at the time of writing.)

      Range filtering is cost-based. The decision to filter is made exclusively on the cost by using selectivity and it's made during optimization, to choose the best join ordering. However, at this point in optimization, ICP is not taken into account. First, we choose some join order and then later we figure out if an index condition can be pushed. If yes, then we push it, but that is downstream of join ordering. At the same time, the selectivity of ICP can be large, so using the rowid range filters together with a large ICP selectivity might cause a slowdown: we spend time building the filter but we may not gain anything or just a little. The original approach was that if ICP is used, and although we decide we use a filter, we cancel this usage. So filtering can never be used together with ICP to access the table. Since the original implementation, other changes over time have now permitted this usage, which is not desirable. At the same time, we are not certain what is more optimal at this point: to cancel the ICP or to cancel the rowid filter.

      Here is an example scenario in which rowid filtering and ICP both are used:

      create table t1 (a int, b int);
      insert into t1 (a, b) values
         (1, 1000),
         (2, 2000),
         (3, 3000),
         (4, 4000),
         (5, 5000),
         (6, 6000),
         (7, 7000),
         (8, 8000),
         (9, 9000),
         (10, 10000);
      create index t1_a on t1 (a);
      create index t1_b on t1 (b);
      set session optimizer_where_cost=10;
      analyze format=json select * from t1 where t1.a between 3 and 8 and t1.b between 4000 and 4100 order by t1.b\G
      

      If we place a breakpoint at handler.cc:7144, just before we check the pushed rowid filter, then we can see the following abriged stack trace (we can also see that h->pushed_idx_cond and h->pushed_rowid_filter both are set at this point):

      (lldb) bt
      * thread #8, stop reason = breakpoint 1.1
        * frame #0: 0x0000000102ab3404 mariadbd`handler_rowid_filter_check(h_arg=0x0000000156139ca8) at handler.cc:7146:3
          frame #1: 0x000000010382bcc4 mariadbd`row_search_idx_cond_check(mysql_rec="\xfb", prebuilt=0x000000015613d498, rec="\x80", offsets=0x000000016d9da3b0) at row0sel.cc:4104:34
          frame #2: 0x00000001038296ac mariadbd`row_search_mvcc(buf="\xfb", mode=PAGE_CUR_GE, prebuilt=0x000000015613d498, match_mode=0, direction=0) at row0sel.cc:5493:10
          frame #3: 0x000000010360aa5c mariadbd`ha_innobase::index_read(this=0x0000000156139ca8, buf="\xfb", key_ptr="", key_len=5, find_flag=HA_READ_KEY_OR_NEXT) at ha_innodb.cc:9121:3
          frame #4: 0x000000010338dec4 mariadbd`handler::index_read_map(this=0x0000000156139ca8, buf="\xfb", key="", keypart_map=1, find_flag=HA_READ_KEY_OR_NEXT) at handler.h:4190:12
          frame #5: 0x0000000102aa6230 mariadbd`handler::ha_index_read_map(this=0x0000000156139ca8, buf="\xfb", key="", keypart_map=1, find_flag=HA_READ_KEY_OR_NEXT) at handler.cc:3792:3
          frame #6: 0x0000000102ab2c9c mariadbd`handler::read_range_first(this=0x0000000156139ca8, start_key=0x0000000156139df8, end_key=0x0000000156139e18, eq_range_arg=false, sorted=true) at handler.cc:6949:13
          frame #7: 0x0000000103140a08 mariadbd`handler::multi_range_read_next(this=0x0000000156139ca8, range_info=0x000000016d9dae48) at multi_range_read.cc:590:15
          frame #8: 0x0000000103140ba8 mariadbd`Mrr_simple_index_reader::get_next(this=0x000000015613a340, range_info=0x000000016d9dae48) at multi_range_read.cc:627:32
          frame #9: 0x0000000103143f30 mariadbd`DsMrr_impl::dsmrr_next(this=0x000000015613a1f0, range_info=0x000000016d9dae48) at multi_range_read.cc:1731:26
          frame #10: 0x00000001036242cc mariadbd`ha_innobase::multi_range_read_next(this=0x0000000156139ca8, range_info=0x000000016d9dae48) at ha_innodb.cc:20301:18
          frame #11: 0x0000000102d0f4ec mariadbd`QUICK_RANGE_SELECT::get_next(this=0x0000000155e14760) at opt_range.cc:13276:5
          frame #12: 0x0000000102d3cc64 mariadbd`rr_quick(info=0x000000015bcc4b10) at records.cc:402:34
          frame #13: 0x0000000102d0e4c0 mariadbd`READ_RECORD::read_record(this=0x000000015bcc4b10) at records.h:81:30
          frame #14: 0x0000000102f42398 mariadbd`join_init_read_record(tab=0x000000015bcc4a40) at sql_select.cc:25221:27
          frame #15: 0x0000000102f1e2a0 mariadbd`sub_select(join=0x000000015bcc2278, join_tab=0x000000015bcc4a40, end_of_records=false) at sql_select.cc:24153:12
          frame #16: 0x0000000102f45e28 mariadbd`do_select(join=0x000000015bcc2278, procedure=0x0000000000000000) at sql_select.cc:23667:14
      

      And the corresponding analyze output to confirm:

      MariaDB [test]> analyze format=json select * from t1 where t1.a between 3 and 8 and t1.b between 4000 and 4100 order by t1.b\G
      *************************** 1. row ***************************
      ANALYZE: {
        "query_optimization": {
          "r_total_time_ms": 0.471028634
        },
        "query_block": {
          "select_id": 1,
          "cost": 0.012382207,
          "r_loops": 1,
          "r_total_time_ms": 0.194622362,
          "nested_loop": [
            {
              "table": {
                "table_name": "t1",
                "access_type": "range",
                "possible_keys": ["t1_a", "t1_b"],
                "key": "t1_b",
                "key_length": "5",
                "used_key_parts": ["b"],
                "rowid_filter": {
                  "range": {
                    "key": "t1_a",
                    "used_key_parts": ["a"]
                  },
                  "rows": 6,
                  "selectivity_pct": 60,
                  "r_rows": 6,
                  "r_lookups": 1,
                  "r_selectivity_pct": 100,
                  "r_buffer_size": 4,
                  "r_filling_time_ms": 0.082075697
                },
                "loops": 1,
                "r_loops": 1,
                "rows": 1,
                "r_rows": 1,
                "cost": 0.012005717,
                "r_table_time_ms": 0.115547909,
                "r_other_time_ms": 0.028011614,
                "r_engine_stats": {
                  "pages_accessed": 3
                },
                "filtered": 60,
                "r_filtered": 100,
                "index_condition": "t1.b between 4000 and 4100",
                "attached_condition": "t1.a between 3 and 8"
              }
            }
          ]
        }
      }
      1 row in set (0.001 sec)
      

      Objection

      Consider a basic example with join (https://gist.github.com/spetrunia/a2f28cd0573afba20179c2a828ac8d49):

      explain
      select * from t1, t2 
      where
        t1.b+1<10 and
        t2.key2=t1.a and 
        t2.key1 between 100 and 200;
      

      *************************** 1. row ***************************
                 id: 1
        select_type: SIMPLE
              table: t1
               type: ALL
      possible_keys: NULL
                key: NULL
            key_len: NULL
                ref: NULL
               rows: 100
              Extra: Using where
      *************************** 2. row ***************************
                 id: 1
        select_type: SIMPLE
              table: t2
               type: ref|filter
      possible_keys: key1,key2
                key: key2|key1
            key_len: 5|5
                ref: j1.t1.a
               rows: 19 (1%)
              Extra: Using where; Using rowid filter
       
      
      

      Here, we build the rowid filter once per join execution, and then it is used during all index lookups to table t2. We expect 100 lookups, each scanning 19 rows).

      Now, table t1 has "Using where", the condition t1.b+1<10. We don't know its selectivity. If this condition is selective, it will reduce the number of index lookups made to table t2, to the point where using Rowid Filter for table t2 will no longer bring benefit.

      This MDEV requests that Rowid Filter is not used with Pushed Index condition. Should Rowid Filter also be not used when it is used with ref access and the join plan has "Using where" conditions which may reduce the number of times the Rowid Filter is used?

      Attachments

        Activity

          Gosselin Dave Gosselin added a comment -

          Relevant follow-up question from psergei: "I'm wondering if the logic that ICP should not work with rowid filter can be also extended that joins with "using where" should not work with rowid filter?"

          Gosselin Dave Gosselin added a comment - Relevant follow-up question from psergei : "I'm wondering if the logic that ICP should not work with rowid filter can be also extended that joins with "using where" should not work with rowid filter?"

          People

            Unassigned Unassigned
            Gosselin Dave Gosselin
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.