[MDEV-12845] subquery with group by not using index Created: 2017-05-19  Updated: 2020-08-25  Resolved: 2017-06-26

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 5.5, 10.0, 10.1, 10.1.23, 10.2.5, 10.2
Fix Version/s: 10.2.7

Type: Bug Priority: Major
Reporter: Richard Stracke Assignee: Igor Babaev
Resolution: Fixed Votes: 0
Labels: Compatibility, optimizer

Attachments: File db2.sql.gz    
Issue Links:
Duplicate
duplicates MDEV-13091 "Reference not supported" with nested... Closed
Problem/Incident
causes MDEV-13166 Server crashes in Item_ref::used_tabl... Closed
Relates
relates to MDEV-13162 Condition pushdown into derived table... Open
Sprint: 10.2.7-1

 Description   

Hi,

MariaDB executes subquery with group by without filter
before the derived table is ready for using.

Executing large subselects with group by can take some time
and oracle can filter this type of selects.

Migrations from oracle to mariadb will be easier, if the optimizer can manage this.

CREATE TABLE `t1` (
	`id` INT(11) NOT NULL,
	`amt` INT(11) NULL DEFAULT NULL,
	PRIMARY KEY (`id`),
	INDEX `amt` (`amt`)
);
 
CREATE TABLE `t2` (
	`amt` INT(11) NOT NULL,
	`somestring` VARCHAR(50) NULL DEFAULT NULL COLLATE 'utf8_unicode_ci',
	INDEX `amt_ind` (`amt`)
);

Inserting some values

INSERT INTO `t2` (`amt`, `somestring`) VALUES (4, 'AtcsBD');
INSERT INTO `t2` (`amt`, `somestring`) VALUES (3, 'Wqjovg');
INSERT INTO `t2` (`amt`, `somestring`) VALUES (2, 'iWjoMy');
INSERT INTO `t2` (`amt`, `somestring`) VALUES (3, 'eoAkem');
INSERT INTO `t2` (`amt`, `somestring`) VALUES (4, '23Dj9r');
INSERT INTO `t2` (`amt`, `somestring`) VALUES (1, 'Im42UO');
INSERT INTO `t2` (`amt`, `somestring`) VALUES (1, 'qbjV4X');
INSERT INTO `t2` (`amt`, `somestring`) VALUES (1, 'BuO8oI');
INSERT INTO `t2` (`amt`, `somestring`) VALUES (2, '6Lf5Fv');
INSERT INTO `t2` (`amt`, `somestring`) VALUES (1, 'udFOm8');
 
 
INSERT INTO `t1` (`id`, `amt`) VALUES (2, 1);
INSERT INTO `t1` (`id`, `amt`) VALUES (3, 1);
INSERT INTO `t1` (`id`, `amt`) VALUES (7, 1);
INSERT INTO `t1` (`id`, `amt`) VALUES (10, 1);
INSERT INTO `t1` (`id`, `amt`) VALUES (1, 2);
INSERT INTO `t1` (`id`, `amt`) VALUES (4, 2);
INSERT INTO `t1` (`id`, `amt`) VALUES (5, 2);
INSERT INTO `t1` (`id`, `amt`) VALUES (9, 2);
INSERT INTO `t1` (`id`, `amt`) VALUES (6, 3);
INSERT INTO `t1` (`id`, `amt`) VALUES (8, 3);

The explain

explain
select *
from t1 a
inner join 
(select * from t2 group by amt) b
 on a.amt = b.amt 
where b.amt = 1;

Output shows no index is using for creating the derived table.

MariaDB [bughunt]> explain select * from t1 a inner join  (select * from t2 group by amt) b  on a.amt = b.amt  where a.amt = 1 \G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: a
         type: ref
possible_keys: amt
          key: amt
      key_len: 5
          ref: const
         rows: 4
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4
        Extra: Using where; Using join buffer (flat, BNL join)
*************************** 3. row ***************************
           id: 2
  select_type: DERIVED
        table: t2
         type: ALL
possible_keys: amt_ind
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 10
        Extra: Using where
3 rows in set (0.00 sec)
 
}

I don't know, if it is a bug or feature request, if initially create as a bug,
you can change it, if you want to classify it as a feature request.



 Comments   
Comment by Elena Stepanova [ 2017-05-19 ]

It appears to be fixed in 10.2, even in the mentioned 10.2.5, but the test case does not have enough data to trigger using the index:

-- create the tables and data as described above
 
-- Index is not used:
 
MariaDB [test]> explain
    -> select *
    -> from t1 a
    -> inner join
    -> (select * from t2 group by amt) b
    ->  on a.amt = b.amt
    -> where b.amt = 1;
+------+-------------+------------+------+---------------+------+---------+-------+------+-------------------------------------------------+
| id   | select_type | table      | type | possible_keys | key  | key_len | ref   | rows | Extra                                           |
+------+-------------+------------+------+---------------+------+---------+-------+------+-------------------------------------------------+
|    1 | PRIMARY     | a          | ref  | amt           | amt  | 5       | const |    4 | Using index                                     |
|    1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL | NULL    | NULL  |    4 | Using where; Using join buffer (flat, BNL join) |
|    2 | DERIVED     | t2         | ALL  | amt_ind       | NULL | NULL    | NULL  |   10 | Using where                                     |
+------+-------------+------------+------+---------------+------+---------+-------+------+-------------------------------------------------+
3 rows in set (0.00 sec)

-- Add some more data to t2
 
MariaDB [test]> INSERT INTO t2 SELECT * FROM t2;
Query OK, 10 rows affected (0.03 sec)
Records: 10  Duplicates: 0  Warnings: 0
 
MariaDB [test]> explain select * from t1 a inner join (select * from t2 group by amt) b  on a.amt = b.amt where b.amt = 1;
+------+-------------+------------+------+---------------+---------+---------+-------+------+-------------------------------------------------+
| id   | select_type | table      | type | possible_keys | key     | key_len | ref   | rows | Extra                                           |
+------+-------------+------------+------+---------------+---------+---------+-------+------+-------------------------------------------------+
|    1 | PRIMARY     | a          | ref  | amt           | amt     | 5       | const |    4 | Using index                                     |
|    1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL    | NULL    | NULL  |    8 | Using where; Using join buffer (flat, BNL join) |
|    2 | DERIVED     | t2         | ref  | amt_ind       | amt_ind | 4       | const |    8 |                                                 |
+------+-------------+------------+------+---------------+---------+---------+-------+------+-------------------------------------------------+
3 rows in set (0.00 sec)

Does it look good enough?

Comment by Richard Stracke [ 2017-05-29 ]

Thx Elena for pointing this out.

This means, I don't simplified the issue good enough.

I add a mysqldump with 4 tables (a 250000 rows)

MariaDB [dbs2]> explain          
    -> SELECT `A`.`ID` AS `ID`,`A`.`REF_NO` AS `REF_NO` FROM
    -> 
    -> 
    -> (
    -> SELECT `RD`.`ID` AS `ID`,`RD`.`REF_NO` AS `REF_NO` FROM (((`t4` `RD`
    -> inner JOIN `t3` `RB` ON(`RD`.`ID` = `RB`.`ID` AND `RD`.`REF_NO` = `RB`.`REF_NO`))
    -> inner JOIN `t2` `RRB` ON(`RD`.`ID` = `RRB`.`ID` AND `RD`.`REF_NO` = `RRB`.`REF_NO`))
    -> inner JOIN (
    -> SELECT * FROM `t1` 
    -> GROUP BY `t1`.`REF_NO`) `RS` ON(`RD`.`ID` = `RS`.`ID` AND `RD`.`REF_NO` = `RS`.`REF_NO`))) `A`
    -> 
    -> where ref_no = 'Jn5tEw';
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+
| id   | select_type | table      | type   | possible_keys | key     | key_len | ref                       | rows   | Extra                    |
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+
|    1 | PRIMARY     | RD         | ref    | PRIMARY,t4_I1 | t4_I1   | 434     | const                     |      1 | Using where; Using index |
|    1 | PRIMARY     | RB         | eq_ref | PRIMARY,t3_I1 | PRIMARY | 496     | dbs2.RD.ID,dbs2.RD.REF_NO |      1 | Using index              |
|    1 | PRIMARY     | RRB        | ref    | t2_I1         | t2_I1   | 174     | dbs2.RD.ID,const          |      1 | Using where; Using index |
|    1 | PRIMARY     | <derived3> | ref    | key0          | key0    | 496     | dbs2.RD.REF_NO,dbs2.RD.ID |     10 |                          |
|    3 | DERIVED     | t1         | index  | NULL          | t1_I1   | 498     | NULL                      | 241890 | Using index              |
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+

But the optimizer should do something like adding the key in the group by clause like this:(if I not oversee something)

MariaDB [dbs2]> explain
    -> SELECT `A`.`ID` AS `ID`,`A`.`REF_NO` AS `REF_NO` FROM
    -> 
    -> 
    -> (
    -> SELECT `RD`.`ID` AS `ID`,`RD`.`REF_NO` AS `REF_NO` FROM (((`t4` `RD`
    -> inner JOIN `t3` `RB` ON(`RD`.`ID` = `RB`.`ID` AND `RD`.`REF_NO` = `RB`.`REF_NO`))
    -> inner JOIN `t2` `RRB` ON(`RD`.`ID` = `RRB`.`ID` AND `RD`.`REF_NO` = `RRB`.`REF_NO`))
    -> inner JOIN (
    -> SELECT * FROM `t1` where ref_no = 'Jn5tEw'
    -> GROUP BY `t1`.`REF_NO`) `RS` ON(`RD`.`ID` = `RS`.`ID` AND `RD`.`REF_NO` = `RS`.`REF_NO`))) `A`
    -> 
    -> where ref_no = 'Jn5tEw';
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+--------------------------+
| id   | select_type | table      | type   | possible_keys | key     | key_len | ref                       | rows | Extra                    |
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+--------------------------+
|    1 | PRIMARY     | RD         | ref    | PRIMARY,t4_I1 | t4_I1   | 434     | const                     |    1 | Using where; Using index |
|    1 | PRIMARY     | RB         | eq_ref | PRIMARY,t3_I1 | PRIMARY | 496     | dbs2.RD.ID,dbs2.RD.REF_NO |    1 | Using index              |
|    1 | PRIMARY     | RRB        | ref    | t2_I1         | t2_I1   | 174     | dbs2.RD.ID,const          |    1 | Using where; Using index |
|    1 | PRIMARY     | <derived3> | ref    | key0          | key0    | 496     | dbs2.RD.REF_NO,dbs2.RD.ID |    2 |                          |
|    3 | DERIVED     | t1         | ref    | t1_I1         | t1_I1   | 434     | const                     |    1 | Using where; Using index |
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+--------------------------+

Comment by Elena Stepanova [ 2017-06-05 ]

Reproducible as described with the attached data sample and with these queries (the difference is in the highlighted line):

explain          
 SELECT `A`.`ID` AS `ID`,`A`.`REF_NO` AS `REF_NO` FROM
 (
   SELECT `RD`.`ID` AS `ID`,`RD`.`REF_NO` AS `REF_NO` FROM (((`t4` `RD`
   inner JOIN `t3` `RB` ON(`RD`.`ID` = `RB`.`ID` AND `RD`.`REF_NO` = `RB`.`REF_NO`))
   inner JOIN `t2` `RRB` ON(`RD`.`ID` = `RRB`.`ID` AND `RD`.`REF_NO` = `RRB`.`REF_NO`))
   inner JOIN (
   SELECT * FROM `t1` 
 GROUP BY `t1`.`REF_NO`) `RS` ON(`RD`.`ID` = `RS`.`ID` AND `RD`.`REF_NO` = `RS`.`REF_NO`))) `A`
 where ref_no = 'Jn5tEw';

explain
 SELECT `A`.`ID` AS `ID`,`A`.`REF_NO` AS `REF_NO` FROM
 (
   SELECT `RD`.`ID` AS `ID`,`RD`.`REF_NO` AS `REF_NO` FROM (((`t4` `RD`
   inner JOIN `t3` `RB` ON(`RD`.`ID` = `RB`.`ID` AND `RD`.`REF_NO` = `RB`.`REF_NO`))
   inner JOIN `t2` `RRB` ON(`RD`.`ID` = `RRB`.`ID` AND `RD`.`REF_NO` = `RRB`.`REF_NO`))
   inner JOIN (
   SELECT * FROM `t1` where ref_no = 'Jn5tEw'
   GROUP BY `t1`.`REF_NO`) `RS` ON(`RD`.`ID` = `RS`.`ID` AND `RD`.`REF_NO` = `RS`.`REF_NO`))) `A`
 where ref_no = 'Jn5tEw';

Plans (note sufficiently different number of rows):

First plan

+------+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+
| id   | select_type | table      | type   | possible_keys | key     | key_len | ref                       | rows   | Extra                    |
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+
|    1 | PRIMARY     | RD         | ref    | PRIMARY,t4_I1 | t4_I1   | 434     | const                     |      1 | Using where; Using index |
|    1 | PRIMARY     | RB         | eq_ref | PRIMARY,t3_I1 | PRIMARY | 496     | dbs2.RD.ID,dbs2.RD.REF_NO |      1 | Using index              |
|    1 | PRIMARY     | RRB        | ref    | t2_I1         | t2_I1   | 174     | dbs2.RD.ID,dbs2.RD.REF_NO |      1 | Using where; Using index |
|    1 | PRIMARY     | <derived3> | ref    | key0          | key0    | 496     | dbs2.RD.REF_NO,dbs2.RD.ID |     10 |                          |
|    3 | DERIVED     | t1         | index  | NULL          | t1_I1   | 498     | NULL                      | 249742 | Using index              |
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+

Second plan

+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+-----------------------------------------------------------+
| id   | select_type | table      | type   | possible_keys | key     | key_len | ref                       | rows | Extra                                                     |
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+-----------------------------------------------------------+
|    1 | PRIMARY     | RD         | ref    | PRIMARY,t4_I1 | t4_I1   | 434     | const                     |    1 | Using where; Using index                                  |
|    1 | PRIMARY     | RB         | eq_ref | PRIMARY,t3_I1 | PRIMARY | 496     | dbs2.RD.ID,dbs2.RD.REF_NO |    1 | Using index                                               |
|    1 | PRIMARY     | RRB        | ref    | t2_I1         | t2_I1   | 174     | dbs2.RD.ID,dbs2.RD.REF_NO |    1 | Using where; Using index                                  |
|    1 | PRIMARY     | <derived3> | ref    | key0          | key0    | 496     | dbs2.RD.REF_NO,dbs2.RD.ID |    2 |                                                           |
|    3 | DERIVED     | t1         | ref    | t1_I1         | t1_I1   | 434     | const                     |    1 | Using where; Using index; Using temporary; Using filesort |
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+-----------------------------------------------------------+

I'll leave it to psergey or other optimizer experts to decide whether it's actually supposed to work identically or not.

Comment by Sergei Petrunia [ 2017-06-06 ]

Re-formatted the query a bit:

explain          
SELECT 
  A.ID AS ID,
  A.REF_NO AS REF_NO 
FROM
(
  SELECT 
    RD.ID AS ID,
    RD.REF_NO AS REF_NO 
  FROM 
    ((
      (t4 RD inner JOIN t3 RB ON RD.ID = RB.ID AND RD.REF_NO = RB.REF_NO)
      inner JOIN t2 RRB ON  RD.ID = RRB.ID AND RD.REF_NO = RRB.REF_NO)
      inner JOIN ( SELECT * 
                   FROM t1 
                   GROUP BY t1.REF_NO
                  ) RS ON (RD.ID = RS.ID AND RD.REF_NO = RS.REF_NO)
     )
) A
where ref_no = 'Jn5tEw';

I get the bad plan:

+------+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+
| id   | select_type | table      | type   | possible_keys | key     | key_len | ref                       | rows   | Extra                    |
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+
|    1 | PRIMARY     | RD         | ref    | PRIMARY,t4_I1 | t4_I1   | 434     | const                     |      1 | Using where; Using index |
|    1 | PRIMARY     | RB         | eq_ref | PRIMARY,t3_I1 | PRIMARY | 496     | dbs2.RD.ID,dbs2.RD.REF_NO |      1 | Using index              |
|    1 | PRIMARY     | RRB        | ref    | t2_I1         | t2_I1   | 174     | dbs2.RD.ID,const          |      1 | Using where; Using index |
|    1 | PRIMARY     | <derived3> | ref    | key0          | key0    | 496     | dbs2.RD.REF_NO,dbs2.RD.ID |     10 |                          |
|    3 | DERIVED     | t1         | index  | NULL          | t1_I1   | 498     | NULL                      | 247255 | Using index              |
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+

Comment by Sergei Petrunia [ 2017-06-06 ]

If I just take the child subquery and attach the WHERE clause to it:

explain
  SELECT 
    RD.ID AS ID,
    RD.REF_NO AS REF_NO 
  FROM 
    ((
      (t4 RD inner JOIN t3 RB ON RD.ID = RB.ID AND RD.REF_NO = RB.REF_NO)
      inner JOIN t2 RRB ON  RD.ID = RRB.ID AND RD.REF_NO = RRB.REF_NO)
      inner JOIN ( SELECT * 
                   FROM t1 
                   GROUP BY t1.REF_NO
                  ) RS ON (RD.ID = RS.ID AND RD.REF_NO = RS.REF_NO)
     )
  WHERE RD.REF_NO='Jn5tEw';

I get the good query plan (condition pushdown is working):

+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+--------------------------+
| id   | select_type | table      | type   | possible_keys | key     | key_len | ref                       | rows | Extra                    |
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+--------------------------+
|    1 | PRIMARY     | RD         | ref    | PRIMARY,t4_I1 | t4_I1   | 434     | const                     |    1 | Using where; Using index |
|    1 | PRIMARY     | RB         | eq_ref | PRIMARY,t3_I1 | PRIMARY | 496     | dbs2.RD.ID,const          |    1 | Using where; Using index |
|    1 | PRIMARY     | RRB        | ref    | t2_I1         | t2_I1   | 174     | dbs2.RD.ID,const          |    1 | Using where; Using index |
|    1 | PRIMARY     | <derived2> | ref    | key0          | key0    | 496     | dbs2.RD.REF_NO,dbs2.RD.ID |    2 |                          |
|    2 | DERIVED     | t1         | ref    | t1_I1         | t1_I1   | 434     | const                     |    1 | Using where; Using index |
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+--------------------------+

Comment by Sergei Petrunia [ 2017-06-06 ]

Looks like some deficiency in condition pushdown, although I do not understand where.

Comment by Igor Babaev [ 2017-06-12 ]

I can reproduce the problem on the following simple tables:

 create table t1 (a int);
insert into t1 values
  (4), (8), (5), (3), (10), (2), (7);
 
create table t2 (b int, c int);
insert into t2 values
  (2,1), (5,2), (2,2), (4,1), (4,3),
  (5,3), (2,4), (4,6), (2,1);

Exclusively for readability of the queries demonstrating the problem I use a view instead of a derived table.

create view v1 as
select b, sum(c) as s from t2 group by b;

The problem appears with the query

select a
from 
( select t1.a, v1.b, v1.s from t1, v1 where t1.a = v1.b ) as t
where a > 2;
MariaDB [test]> explain format=json
    -> select a
    -> from 
    -> ( select t1.a, v1.b, v1.s from t1, v1 where t1.a = v1.b ) as t
    -> where a > 2;
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {
  "query_block": {
    "select_id": 1,
    "table": {
      "table_name": "t1",
      "access_type": "ALL",
      "rows": 7,
      "filtered": 100,
      "attached_condition": "t1.a > 2 and t1.a is not null"
    },
    "table": {
      "table_name": "<derived3>",
      "access_type": "ref",
      "possible_keys": ["key0"],
      "key": "key0",
      "key_length": "5",
      "used_key_parts": ["b"],
      "ref": ["test.t1.a"],
      "rows": 2,
      "filtered": 100,
      "materialized": {
        "query_block": {
          "select_id": 3,
          "filesort": {
            "sort_key": "t2.b",
            "temporary_table": {
              "table": {
                "table_name": "t2",
                "access_type": "ALL",
                "rows": 9,
                "filtered": 100
              }
            }
          }
        }
      }
    }
  }
} |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

We see that no condition is pushed into the view.
At the same time if I change the query for

select b
from 
( select t1.a, v1.b, v1.s from t1, v1 where t1.a = v1.b ) as t
where b > 2;

I see that the condition b > 2 is pushed into the view:

MariaDB [test]> explain format=json
    -> select b
    -> from 
    -> ( select t1.a, v1.b, v1.s from t1, v1 where t1.a = v1.b ) as t
    -> where b > 2;
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {
  "query_block": {
    "select_id": 1,
    "table": {
      "table_name": "t1",
      "access_type": "ALL",
      "rows": 7,
      "filtered": 100,
      "attached_condition": "t1.a > 2 and t1.a is not null"
    },
    "table": {
      "table_name": "<derived3>",
      "access_type": "ref",
      "possible_keys": ["key0"],
      "key": "key0",
      "key_length": "5",
      "used_key_parts": ["b"],
      "ref": ["test.t1.a"],
      "rows": 2,
      "filtered": 100,
      "materialized": {
        "query_block": {
          "select_id": 3,
          "filesort": {
            "sort_key": "t2.b",
            "temporary_table": {
              "table": {
                "table_name": "t2",
                "access_type": "ALL",
                "rows": 9,
                "filtered": 100,
                "attached_condition": "t2.b > 2"
              }
            }
          }
        }
      }
    }
  }
} |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

The difference for these two queries is that for the second query the condition b > 2 can be pushed as it is, while for the second query the condition a > 2 has to be first transformed into the condition b > 2 using the equality t1.a = v1.b.
Our code is supposed to use equalities when pushing conditions into materialized derived tables / views and the results of derived_cond_pushdown.test show that equalities are really used for the queries from this test.
Apparently there is a flaw in the code that prevents using equalities when pushing conditions of a mergeable derived table.

Comment by Igor Babaev [ 2017-06-17 ]

Let's see in debugger what's going on with the query

select a
from 
( select t1.a, v1.b, v1.s from t1, v1 where t1.a = v1.b ) as t
where a > 2;

After the merge of the derived table t into the main query we have the query

select `test`.`t1`.`a` AS `a` from `test`.`t1` join `test`.`v1` where `v1`.`b` = `test`.`t1`.`a` and `test`.`t1`.`a` > 2

The function pushdown_cond_for_derived() takes the condition

 `v1`.`b` = `test`.`t1`.`a` and `test`.`t1`.`a` > 2

and first it tries extract from it a new condition that could be pushed into the materialized view v1. Then when having succeeded the function tries to extract from this condition the one that can be pushed into the where clause of v1.
Before extracting a condition that can be pushed into v1 the function checks whether there is such a condition. For this check the function calls
the method check_pushable_cond_for_table() for v1.
The method traverse the condition and says that there is no pushable condition. When coming to the left operand of the conjunct

`test`.`t1`.`a` > 2

it sees that this is an item of the class Item_direct_view_ref. As the function traverses the operands the condition formulas with the method walk() it processes the field item to which the Item_direct_view_ref item refers to. Reaching this field item the check method sees that the field `a` does not belong to v1 and there is no equivalence class that would include the field item. From this it is concludes that `test`.`t1`.`a` > 2 cannot be pushed into v1.
In fact the Item_direct_view_ref item t.a belongs to the equivalence class containing t.a and `v1`.`b`. So the failure return code was premature.
Apparently the walk() method of traversal does not suit when we want to check whether a predicate is pusheable.

Comment by Igor Babaev [ 2017-06-22 ]

A patch to resolve the problem was sent for review.
With this patch I have the following good execution plan:

MariaDB [dbs2]> explain          
    -> SELECT 
    ->   A.ID AS ID,
    ->   A.REF_NO AS REF_NO 
    -> FROM
    -> (
    ->   SELECT 
    ->     RD.ID AS ID,
    ->     RD.REF_NO AS REF_NO 
    ->   FROM 
    ->     ((
    ->       (t4 RD inner JOIN t3 RB ON RD.ID = RB.ID AND RD.REF_NO = RB.REF_NO)
    ->       inner JOIN t2 RRB ON  RD.ID = RRB.ID AND RD.REF_NO = RRB.REF_NO)
    ->       inner JOIN ( SELECT * 
    ->                    FROM t1 
    ->                    GROUP BY t1.REF_NO
    ->                   ) RS ON (RD.ID = RS.ID AND RD.REF_NO = RS.REF_NO)
    ->      )
    -> ) A
    -> where ref_no = 'Jn5tEw';
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+-----------------------------------------------------------+
| id   | select_type | table      | type   | possible_keys | key     | key_len | ref                       | rows | Extra                                                     |
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+-----------------------------------------------------------+
|    1 | PRIMARY     | RD         | ref    | PRIMARY,t4_I1 | t4_I1   | 434     | const                     |    1 | Using where; Using index                                  |
|    1 | PRIMARY     | RRB        | ref    | t2_I1         | t2_I1   | 174     | dbs2.RD.ID,dbs2.RD.REF_NO |    1 | Using where; Using index                                  |
|    1 | PRIMARY     | RB         | eq_ref | PRIMARY,t3_I1 | PRIMARY | 496     | dbs2.RD.ID,dbs2.RD.REF_NO |    1 | Using index                                               |
|    1 | PRIMARY     | <derived3> | ref    | key0          | key0    | 496     | dbs2.RD.REF_NO,dbs2.RD.ID |    2 |                                                           |
|    3 | DERIVED     | t1         | ref    | t1_I1         | t1_I1   | 434     | const                     |    1 | Using where; Using index; Using temporary; Using filesort |
+------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+-----------------------------------------------------------+

Comment by Sergei Petrunia [ 2017-06-22 ]

Found MDEV-13156. It's a similar but different issue.

Comment by Sergei Petrunia [ 2017-06-22 ]

Ok to push

Comment by Igor Babaev [ 2017-06-26 ]

The fix for the bug was pushed into the 10.2 tree.

Generated at Thu Feb 08 08:00:55 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.