Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-16374

filtered shows 0 for materilization scan for a semi join, which makes optimizer always pick materialization scan over materialization lookup

    Details

      Description

      The dataset:

      create table t0(a int);
      insert into t0 values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
      create table t1 (a int, b int);
      insert into t1 select A.a + B.a* 10 + C.a * 100,  A.a + B.a* 10 + C.a * 100 from t0 A, t0 B, t0 C;
      

      MariaDB [test]> set @@optimizer_use_condition_selectivity=2;
      Query OK, 0 rows affected (0.000 sec)
       
      MariaDB [test]> explain format=json select * from t1 where a in (select max(a) from t1 group by b);
       
      EXPLAIN                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 | {
        "query_block": {
          "select_id": 1,
          "table": {
            "table_name": "<subquery2>",
            "access_type": "ALL",
            "possible_keys": ["distinct_key"],
            "rows": 1000,
            "filtered": 0,
            "materialized": {
              "unique": 1,
              "query_block": {
                "select_id": 2,
                "temporary_table": {
                  "table": {
                    "table_name": "t1",
                    "access_type": "ALL",
                    "rows": 1000,
                    "filtered": 100
                  }
                }
              }
            }
          },
          "block-nl-join": {
            "table": {
              "table_name": "t1",
              "access_type": "ALL",
              "rows": 1000,
              "filtered": 100
            },
            "buffer_type": "flat",
            "buffer_size": "256Kb",
            "join_type": "BNL",
            "attached_condition": "t1.a = `<subquery2>`.`max(a)`"
          }
        }
      } |
      
      

      If you look closely filtered=0 for the <subquery2> which means the optimizer predicted that no rows would be read. Due to this , the cost for such a plan would always be less than any other plan.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                varun Varun Gupta
                Reporter:
                varun Varun Gupta
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: