Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-15777

Use inferred IS NOT NULL predicates in the range optimizer

    XMLWordPrintable

    Details

      Description

      Consider an example (inspired by queries generated by Hibernate):

      SELECT * 
      FROM 
        t1
      WHERE 
        t1.subset_id IN (SELECT t1_subsets.id
                         FROM t1_subsets)
      

      The subquery is converted into an inner join (because t1_subsets.id is a primary key).
      ANALYZE output:

      +------+-------------+------------+--------+---------------+---------+---------+------------------+-------+----------+----------+------------+--------------------------+
      | id   | select_type | table      | type   | possible_keys | key     | key_len | ref              | rows  | r_rows   | filtered | r_filtered | Extra                    |
      +------+-------------+------------+--------+---------------+---------+---------+------------------+-------+----------+----------+------------+--------------------------+
      |    1 | PRIMARY     | t1         | ALL    | t1_subset_id  | NULL    | NULL    | NULL             | 50396 | 49999.00 |   100.00 |       0.00 | Using where              |
      |    1 | PRIMARY     | t1_subsets | eq_ref | PRIMARY       | PRIMARY | 4       | j22.t1.subset_id |     1 |     NULL |   100.00 |       NULL | Using where; Using index |
      +------+-------------+------------+--------+---------------+---------+---------+------------------+-------+----------+----------+------------+--------------------------+
      

      t1 has only a few rows which have a non-NULL value for t1.subset_id.
      This is visible in r_filtered column: Early NULLs filtering optimization adds a NOT NULL condition which filters out rows with NULLs.

      A piece of ANALYZE FORMAT=JSON output to confirm this:

          "table": {
            "table_name": "t1",
            "access_type": "ALL",
            "possible_keys": ["t1_subset_id"],
            "r_loops": 1,
            "rows": 50396,
            "r_rows": 49999,
            "r_total_time_ms": 290.22,
            "filtered": 100,
            "r_filtered": 0,
            "attached_condition": "t1.subset_id is not null"
          },
      

      However, range optimizer is not able to make use of "Early NULLs filtering". We could have used t1_subset_id index to construct range access, but we dont

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              igor Igor Babaev
              Reporter:
              psergey Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: