Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33792

ANALYZE for statement shows incorrect selectivity of pushed index condition for table with partitions

Details

    Description

      It is expected that for the table without partitions and the table with partitions with the same data set, the values like "r_total_filtered" and "r_icp_filtered" from "aanalyze format=json select ..." will be the same. But in this case this is not so:
      Table without partitions:

      create table t1 (a int, b varchar(10), c int, index(a,b));
      insert into t1(a,b,c) values (1,"abc",100),
                                    (2,"abd",200),(2,"bfd",300),(2,"efg",400),
                                    (3,"abc",410),(3,"def",420),(3,"ghi",430),(3,"jkl",440),
                                    (4,"abc",600),(4,"def",700),(4,"ghi",800),(4,"jkl",900);
       
      analyze table t1;
      analyze format=json select * from t1 where t1.a=2 and t1.b like '%f%';
      drop table t1;
      

      Result:

      ANALYZE
      {
        "query_optimization": {
          "r_total_time_ms": 0.077098789
        },
        "query_block": {
          "select_id": 1,
          "cost": 0.004492149,
          "r_loops": 1,
          "r_total_time_ms": 0.017491363,
          "nested_loop": [
            {
              "table": {
                "table_name": "t1",
                "access_type": "ref",
                "possible_keys": ["a"],
                "key": "a",
                "key_length": "5",
                "used_key_parts": ["a"],
                "ref": ["const"],
                "loops": 1,
                "r_loops": 1,
                "rows": 3,
                "r_index_rows": 3,
                "r_rows": 2,
                "cost": 0.004492149,
                "r_table_time_ms": 0.010259051,
                "r_other_time_ms": 0.005597638,
                "r_engine_stats": {},
                "filtered": 100,
                "r_total_filtered": 66.66666667,
                "index_condition": "t1.b like '%f%'",
                "r_icp_filtered": 66.66666667,
                "r_filtered": 100
              }
            }
          ]
        }
      }
      

      Table with partitions:

      create table t1 (a int, b varchar(10), c int, index(a,b))
      partition by hash(a) partitions 4;
       
      insert into t1(a,b,c) values (1,"abc",100),
                                    (2,"abd",200),(2,"bfd",300),(2,"efg",400),
                                    (3,"abc",410),(3,"def",420),(3,"ghi",430),(3,"jkl",440),
                                    (4,"abc",600),(4,"def",700),(4,"ghi",800),(4,"jkl",900);
      analyze table t1;
      analyze format=json select * from t1 where t1.a=2 and t1.b like '%f%';
      drop table t1;
      

      Result:

      ANALYZE
      {
        "query_optimization": {
          "r_total_time_ms": 0.100843591
        },
        "query_block": {
          "select_id": 1,
          "cost": 0.004492149,
          "r_loops": 1,
          "r_total_time_ms": 0.03108695,
          "nested_loop": [
            {
              "table": {
                "table_name": "t1",
                "partitions": ["p2"],
                "access_type": "ref",
                "possible_keys": ["a"],
                "key": "a",
                "key_length": "5",
                "used_key_parts": ["a"],
                "ref": ["const"],
                "loops": 1,
                "r_loops": 1,
                "rows": 3,
                "r_index_rows": 0,
                "r_rows": 2,
                "cost": 0.004492149,
                "r_table_time_ms": 0.01505605,
                "r_other_time_ms": 0.009645094,
                "r_engine_stats": {},
                "filtered": 100,
                "r_total_filtered": 100,
                "index_condition": "t1.b like '%f%'",
                "r_icp_filtered": 100,
                "r_filtered": 100
              }
            }
          ]
        }
      }
      

      The case was executed on the branch "bb-11.4-MDEV-18478v4MDEV-12404". This branch has changes from MDEV-12404 and MDEV-18478

      Attachments

        Issue Links

          Activity

            Note that in the last example the partitioned and non-partitioned cases use different indexes: diff of ANALYZE outputs:

            I did not check what ICP counters should be in either of the cases (or whether ICP should be used) but I don't immediately see a reason why they would be the same...

            psergei Sergei Petrunia added a comment - Note that in the last example the partitioned and non-partitioned cases use different indexes: diff of ANALYZE outputs: I did not check what ICP counters should be in either of the cases (or whether ICP should be used) but I don't immediately see a reason why they would be the same...

            Is is a problem that the optimizer picked different query plans in partitioned vs non-partitioned case?
            I diff'ed optimizer trace outputs:

            @@ -83,6 +84,12 @@
                         }
                       },
                       {
            +            "prune_partitions": {
            +              "table": "t1",
            +              "used_partitions": "p2"
            +            }
            +          },
            +          {
                         "table_dependencies": [
                           {
                             "table": "t1",
            

            ...

            @@ -161,44 +167,29 @@
                               "chosen_range_access_summary": {
                                 "range_access_plan": {
                                   "type": "range_scan",
            -                      "index": "a",
            -                      "rows": 3,
            -                      "ranges": ["(2) <= (a) <= (2)"]
            +                      "index": "c",
            +                      "rows": 1,
            +                      "ranges": ["(400) <= (c) <= (500)"]
                                 },
            -                    "rows_for_plan": 3,
            -                    "cost_for_plan": 0.00623656,
            +                    "rows_for_plan": 1,
            +                    "cost_for_plan": 0.00424968,
                                 "chosen": true
                               }
                             }
            

            It seems like there's a valid reason for difference - in partitioned case, Partition Pruning optimization has pruned away all partitions except one, which had an effect on the query plan choice.

            psergei Sergei Petrunia added a comment - Is is a problem that the optimizer picked different query plans in partitioned vs non-partitioned case? I diff'ed optimizer trace outputs: @@ -83,6 +84,12 @@ } }, { + "prune_partitions": { + "table": "t1", + "used_partitions": "p2" + } + }, + { "table_dependencies": [ { "table": "t1", ... @@ -161,44 +167,29 @@ "chosen_range_access_summary": { "range_access_plan": { "type": "range_scan", - "index": "a", - "rows": 3, - "ranges": ["(2) <= (a) <= (2)"] + "index": "c", + "rows": 1, + "ranges": ["(400) <= (c) <= (500)"] }, - "rows_for_plan": 3, - "cost_for_plan": 0.00623656, + "rows_for_plan": 1, + "cost_for_plan": 0.00424968, "chosen": true } } It seems like there's a valid reason for difference - in partitioned case, Partition Pruning optimization has pruned away all partitions except one, which had an effect on the query plan choice.
            Gosselin Dave Gosselin added a comment -

            psergei great observations, I didn't account for Partition Pruning optimization. It seems to me that there isn't a bug in this particular case after all.

            Gosselin Dave Gosselin added a comment - psergei great observations, I didn't account for Partition Pruning optimization. It seems to me that there isn't a bug in this particular case after all.

            Ok to push

            lstartseva Lena Startseva added a comment - Ok to push

            Fixed by patches for MDEV-18478.

            psergei Sergei Petrunia added a comment - Fixed by patches for MDEV-18478 .

            People

              psergei Sergei Petrunia
              lstartseva Lena Startseva
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.