Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-12845

subquery with group by not using index

Details

    • 10.2.7-1

    Description

      Hi,

      MariaDB executes subquery with group by without filter
      before the derived table is ready for using.

      Executing large subselects with group by can take some time
      and oracle can filter this type of selects.

      Migrations from oracle to mariadb will be easier, if the optimizer can manage this.

      CREATE TABLE `t1` (
      	`id` INT(11) NOT NULL,
      	`amt` INT(11) NULL DEFAULT NULL,
      	PRIMARY KEY (`id`),
      	INDEX `amt` (`amt`)
      );
       
      CREATE TABLE `t2` (
      	`amt` INT(11) NOT NULL,
      	`somestring` VARCHAR(50) NULL DEFAULT NULL COLLATE 'utf8_unicode_ci',
      	INDEX `amt_ind` (`amt`)
      );
      

      Inserting some values

      INSERT INTO `t2` (`amt`, `somestring`) VALUES (4, 'AtcsBD');
      INSERT INTO `t2` (`amt`, `somestring`) VALUES (3, 'Wqjovg');
      INSERT INTO `t2` (`amt`, `somestring`) VALUES (2, 'iWjoMy');
      INSERT INTO `t2` (`amt`, `somestring`) VALUES (3, 'eoAkem');
      INSERT INTO `t2` (`amt`, `somestring`) VALUES (4, '23Dj9r');
      INSERT INTO `t2` (`amt`, `somestring`) VALUES (1, 'Im42UO');
      INSERT INTO `t2` (`amt`, `somestring`) VALUES (1, 'qbjV4X');
      INSERT INTO `t2` (`amt`, `somestring`) VALUES (1, 'BuO8oI');
      INSERT INTO `t2` (`amt`, `somestring`) VALUES (2, '6Lf5Fv');
      INSERT INTO `t2` (`amt`, `somestring`) VALUES (1, 'udFOm8');
       
       
      INSERT INTO `t1` (`id`, `amt`) VALUES (2, 1);
      INSERT INTO `t1` (`id`, `amt`) VALUES (3, 1);
      INSERT INTO `t1` (`id`, `amt`) VALUES (7, 1);
      INSERT INTO `t1` (`id`, `amt`) VALUES (10, 1);
      INSERT INTO `t1` (`id`, `amt`) VALUES (1, 2);
      INSERT INTO `t1` (`id`, `amt`) VALUES (4, 2);
      INSERT INTO `t1` (`id`, `amt`) VALUES (5, 2);
      INSERT INTO `t1` (`id`, `amt`) VALUES (9, 2);
      INSERT INTO `t1` (`id`, `amt`) VALUES (6, 3);
      INSERT INTO `t1` (`id`, `amt`) VALUES (8, 3);
      

      The explain

      explain
      select *
      from t1 a
      inner join 
      (select * from t2 group by amt) b
       on a.amt = b.amt 
      where b.amt = 1;
      

      Output shows no index is using for creating the derived table.

      MariaDB [bughunt]> explain select * from t1 a inner join  (select * from t2 group by amt) b  on a.amt = b.amt  where a.amt = 1 \G
      *************************** 1. row ***************************
                 id: 1
        select_type: PRIMARY
              table: a
               type: ref
      possible_keys: amt
                key: amt
            key_len: 5
                ref: const
               rows: 4
              Extra: Using index
      *************************** 2. row ***************************
                 id: 1
        select_type: PRIMARY
              table: <derived2>
               type: ALL
      possible_keys: NULL
                key: NULL
            key_len: NULL
                ref: NULL
               rows: 4
              Extra: Using where; Using join buffer (flat, BNL join)
      *************************** 3. row ***************************
                 id: 2
        select_type: DERIVED
              table: t2
               type: ALL
      possible_keys: amt_ind
                key: NULL
            key_len: NULL
                ref: NULL
               rows: 10
              Extra: Using where
      3 rows in set (0.00 sec)
       
      }
      

      I don't know, if it is a bug or feature request, if initially create as a bug,
      you can change it, if you want to classify it as a feature request.

      Attachments

        Issue Links

          Activity

            igor Igor Babaev (Inactive) added a comment - - edited

            Let's see in debugger what's going on with the query

            select a
            from 
            ( select t1.a, v1.b, v1.s from t1, v1 where t1.a = v1.b ) as t
            where a > 2;
            

            After the merge of the derived table t into the main query we have the query

            select `test`.`t1`.`a` AS `a` from `test`.`t1` join `test`.`v1` where `v1`.`b` = `test`.`t1`.`a` and `test`.`t1`.`a` > 2
            

            The function pushdown_cond_for_derived() takes the condition

             `v1`.`b` = `test`.`t1`.`a` and `test`.`t1`.`a` > 2
            

            and first it tries extract from it a new condition that could be pushed into the materialized view v1. Then when having succeeded the function tries to extract from this condition the one that can be pushed into the where clause of v1.
            Before extracting a condition that can be pushed into v1 the function checks whether there is such a condition. For this check the function calls
            the method check_pushable_cond_for_table() for v1.
            The method traverse the condition and says that there is no pushable condition. When coming to the left operand of the conjunct

            `test`.`t1`.`a` > 2
            

            it sees that this is an item of the class Item_direct_view_ref. As the function traverses the operands the condition formulas with the method walk() it processes the field item to which the Item_direct_view_ref item refers to. Reaching this field item the check method sees that the field `a` does not belong to v1 and there is no equivalence class that would include the field item. From this it is concludes that `test`.`t1`.`a` > 2 cannot be pushed into v1.
            In fact the Item_direct_view_ref item t.a belongs to the equivalence class containing t.a and `v1`.`b`. So the failure return code was premature.
            Apparently the walk() method of traversal does not suit when we want to check whether a predicate is pusheable.

            igor Igor Babaev (Inactive) added a comment - - edited Let's see in debugger what's going on with the query select a from ( select t1.a, v1.b, v1.s from t1, v1 where t1.a = v1.b ) as t where a > 2; After the merge of the derived table t into the main query we have the query select `test`.`t1`.`a` AS `a` from `test`.`t1` join `test`.`v1` where `v1`.`b` = `test`.`t1`.`a` and `test`.`t1`.`a` > 2 The function pushdown_cond_for_derived() takes the condition `v1`.`b` = `test`.`t1`.`a` and `test`.`t1`.`a` > 2 and first it tries extract from it a new condition that could be pushed into the materialized view v1. Then when having succeeded the function tries to extract from this condition the one that can be pushed into the where clause of v1. Before extracting a condition that can be pushed into v1 the function checks whether there is such a condition. For this check the function calls the method check_pushable_cond_for_table() for v1. The method traverse the condition and says that there is no pushable condition. When coming to the left operand of the conjunct `test`.`t1`.`a` > 2 it sees that this is an item of the class Item_direct_view_ref. As the function traverses the operands the condition formulas with the method walk() it processes the field item to which the Item_direct_view_ref item refers to. Reaching this field item the check method sees that the field `a` does not belong to v1 and there is no equivalence class that would include the field item. From this it is concludes that `test`.`t1`.`a` > 2 cannot be pushed into v1. In fact the Item_direct_view_ref item t.a belongs to the equivalence class containing t.a and `v1`.`b`. So the failure return code was premature. Apparently the walk() method of traversal does not suit when we want to check whether a predicate is pusheable.

            A patch to resolve the problem was sent for review.
            With this patch I have the following good execution plan:

            MariaDB [dbs2]> explain          
                -> SELECT 
                ->   A.ID AS ID,
                ->   A.REF_NO AS REF_NO 
                -> FROM
                -> (
                ->   SELECT 
                ->     RD.ID AS ID,
                ->     RD.REF_NO AS REF_NO 
                ->   FROM 
                ->     ((
                ->       (t4 RD inner JOIN t3 RB ON RD.ID = RB.ID AND RD.REF_NO = RB.REF_NO)
                ->       inner JOIN t2 RRB ON  RD.ID = RRB.ID AND RD.REF_NO = RRB.REF_NO)
                ->       inner JOIN ( SELECT * 
                ->                    FROM t1 
                ->                    GROUP BY t1.REF_NO
                ->                   ) RS ON (RD.ID = RS.ID AND RD.REF_NO = RS.REF_NO)
                ->      )
                -> ) A
                -> where ref_no = 'Jn5tEw';
            +------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+-----------------------------------------------------------+
            | id   | select_type | table      | type   | possible_keys | key     | key_len | ref                       | rows | Extra                                                     |
            +------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+-----------------------------------------------------------+
            |    1 | PRIMARY     | RD         | ref    | PRIMARY,t4_I1 | t4_I1   | 434     | const                     |    1 | Using where; Using index                                  |
            |    1 | PRIMARY     | RRB        | ref    | t2_I1         | t2_I1   | 174     | dbs2.RD.ID,dbs2.RD.REF_NO |    1 | Using where; Using index                                  |
            |    1 | PRIMARY     | RB         | eq_ref | PRIMARY,t3_I1 | PRIMARY | 496     | dbs2.RD.ID,dbs2.RD.REF_NO |    1 | Using index                                               |
            |    1 | PRIMARY     | <derived3> | ref    | key0          | key0    | 496     | dbs2.RD.REF_NO,dbs2.RD.ID |    2 |                                                           |
            |    3 | DERIVED     | t1         | ref    | t1_I1         | t1_I1   | 434     | const                     |    1 | Using where; Using index; Using temporary; Using filesort |
            +------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+-----------------------------------------------------------+
            

            igor Igor Babaev (Inactive) added a comment - A patch to resolve the problem was sent for review. With this patch I have the following good execution plan: MariaDB [dbs2]> explain -> SELECT -> A.ID AS ID, -> A.REF_NO AS REF_NO -> FROM -> ( -> SELECT -> RD.ID AS ID, -> RD.REF_NO AS REF_NO -> FROM -> (( -> (t4 RD inner JOIN t3 RB ON RD.ID = RB.ID AND RD.REF_NO = RB.REF_NO) -> inner JOIN t2 RRB ON RD.ID = RRB.ID AND RD.REF_NO = RRB.REF_NO) -> inner JOIN ( SELECT * -> FROM t1 -> GROUP BY t1.REF_NO -> ) RS ON (RD.ID = RS.ID AND RD.REF_NO = RS.REF_NO) -> ) -> ) A -> where ref_no = 'Jn5tEw'; +------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+-----------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+-----------------------------------------------------------+ | 1 | PRIMARY | RD | ref | PRIMARY,t4_I1 | t4_I1 | 434 | const | 1 | Using where; Using index | | 1 | PRIMARY | RRB | ref | t2_I1 | t2_I1 | 174 | dbs2.RD.ID,dbs2.RD.REF_NO | 1 | Using where; Using index | | 1 | PRIMARY | RB | eq_ref | PRIMARY,t3_I1 | PRIMARY | 496 | dbs2.RD.ID,dbs2.RD.REF_NO | 1 | Using index | | 1 | PRIMARY | <derived3> | ref | key0 | key0 | 496 | dbs2.RD.REF_NO,dbs2.RD.ID | 2 | | | 3 | DERIVED | t1 | ref | t1_I1 | t1_I1 | 434 | const | 1 | Using where; Using index; Using temporary; Using filesort | +------+-------------+------------+--------+---------------+---------+---------+---------------------------+------+-----------------------------------------------------------+

            Found MDEV-13156. It's a similar but different issue.

            psergei Sergei Petrunia added a comment - Found MDEV-13156 . It's a similar but different issue.

            Ok to push

            psergei Sergei Petrunia added a comment - Ok to push

            The fix for the bug was pushed into the 10.2 tree.

            igor Igor Babaev (Inactive) added a comment - The fix for the bug was pushed into the 10.2 tree.

            People

              igor Igor Babaev (Inactive)
              Richard Richard Stracke
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.