Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-20109

Optimizer ignores distinct key created for materialized semi-join subquery when searching for best execution plan

    Details

      Description

      Consider the following database tables:

      create table t0 (a int); 
      insert into t0 values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);
      create table t1 (a int);
      insert into t1 select (t0.a-1)*10 + (t.a-1) + 1 from t0 t, t0;
      create table t2 (a int);
      insert into t2 values (12), (88), (47), (33), (28);
      

      The following query uses very inefficient execution plan without look-ups into the materialized subquery:

      select * from t1 where a in (select a from t2 group by a); 
      

      MariaDB [test]> explain extended select * from t1 where a in (select a from t2 group by a);
      +------+--------------+-------------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
      | id   | select_type  | table       | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                           |
      +------+--------------+-------------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
      |    1 | PRIMARY      | <subquery2> | ALL  | distinct_key  | NULL | NULL    | NULL |    5 |   100.00 |                                                 |
      |    1 | PRIMARY      | t1          | ALL  | NULL          | NULL | NULL    | NULL |  100 |   100.00 | Using where; Using join buffer (flat, BNL join) |
      |    2 | MATERIALIZED | t2          | ALL  | NULL          | NULL | NULL    | NULL |    5 |   100.00 |                                                 |
      +------+--------------+-------------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
      3 rows in set, 1 warning (0.00 sec)
       
      MariaDB [test]> show warnings;
      +-------+------+------------------------------------------------------------------------------------------------------------------+
      | Level | Code | Message                                                                                                          |
      +-------+------+------------------------------------------------------------------------------------------------------------------+
      | Note  | 1003 | select `test`.`t1`.`a` AS `a` from `test`.`t1` semi join (`test`.`t2`) where (`test`.`t1`.`a` = `test`.`t2`.`a`) |
      +-------+------+------------------------------------------------------------------------------------------------------------------+
      1 row in set (0.00 sec)
      

      The same problem we observe for the mergeable semi-join subquery

      MariaDB [test]> explain extended select * from t1 where a in (select a from t2);
      +------+--------------+-------------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
      | id   | select_type  | table       | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                           |
      +------+--------------+-------------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
      |    1 | PRIMARY      | <subquery2> | ALL  | distinct_key  | NULL | NULL    | NULL |    5 |   100.00 |                                                 |
      |    1 | PRIMARY      | t1          | ALL  | NULL          | NULL | NULL    | NULL |  100 |   100.00 | Using where; Using join buffer (flat, BNL join) |
      |    2 | MATERIALIZED | t2          | ALL  | NULL          | NULL | NULL    | NULL |    5 |   100.00 |                                                 |
      +------+--------------+-------------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
      

      This becomes a real problem when t1 and t2 are 100 times bigger.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                igor Igor Babaev
              • Votes:
                0 Vote for this issue
                Watchers:
                7 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: