[MDEV-610] LP:898747 - Join optimizer pruning seems to work poorly for semi-joins Created: 2011-12-01  Updated: 2014-03-26  Resolved: 2014-03-26

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: Launchpad

Attachments: XML File LPexportBug898747.xml    

 Description   

Join optimizer pruning seems to be too aggressive in pruning query plans with semi-joins. Quick investigation in debugger hints at that it is not comparing apples-to-apples when comparing record counts.

As a result, one can observe effects like this:

create table ten (a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
 
create table one_k (a int);
insert into one_k select A.a + 10*B.a + 100*C.a from ten A, ten B, ten C;
 
MariaDB [test]> set optimizer_prune_level=0;
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [test]>  explain select * from one_k A where a in (select B.a from  ten B, ten C where C.a < A.a);
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                      |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------+
|  1 | PRIMARY     | A     | ALL  | NULL          | NULL | NULL    | NULL | 1000 |                            |
|  1 | PRIMARY     | C     | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using where                |
|  1 | PRIMARY     | B     | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using where; FirstMatch(A) |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------+
3 rows in set (0.01 sec)

MariaDB [test]> set optimizer_prune_level=1;
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [test]>  explain select * from one_k A where a in (select B.a from  ten B, ten C where C.a < A.a);
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                                          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------------------+
|  1 | PRIMARY     | B     | ALL  | NULL          | NULL | NULL    | NULL |   10 | Start temporary                                                |
|  1 | PRIMARY     | C     | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using where                                                    |
|  1 | PRIMARY     | A     | ALL  | NULL          | NULL | NULL    | NULL | 1000 | Using where; End temporary; Using join buffer (flat, BNL join) |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------------------+
3 rows in set (0.00 sec)



 Comments   
Comment by Sergei Petrunia [ 2011-12-01 ]

Re: Join optimizer pruning seems to work poorly for semi-joins
Last_query_cost values:

  • The plan with A,C,B and FirstMatch: 22391.696266
  • The plan with B,C,A and Start/End Temorary: 50707.742164
Comment by Rasmus Johansson (Inactive) [ 2011-12-01 ]

Launchpad bug id: 898747

Comment by Elena Stepanova [ 2014-03-18 ]

The problem revealed by this test case was gone between 5.3.5 and 5.3.6, particularly with the following revision:

        ------------------------------------------------------------
        revno: 3479.1.1
        revision-id: psergey@askmonty.org-20120402174154-8y0lzcwc0qycoj3n
        parent: psergey@askmonty.org-20120327104326-uezxq6pz8ulfl3fn
        committer: Sergey Petrunya <psergey@askmonty.org>
        branch nick: 5.3-look2
        timestamp: Mon 2012-04-02 21:41:54 +0400
        message:
          BUG#913030: Optimizer chooses a suboptimal excution plan for Q18 from DBT-3
          - When doing join optimization, pre-sort the tables so that they mimic the execution
            order we've had with 'semijoin=off'. 
          - That way, we will not get regressions when there are two query plans (the old and the
            new) that have indentical costs but different execution times (because of factors that
            the optimizer was not able to take into account).

If it's just a special case, then probably another example is needed. Otherwise, maybe the bug should be closed?

Comment by Elena Stepanova [ 2014-03-26 ]

Closing for now on the reason given above, please re-open if needed.

Generated at Thu Feb 08 06:30:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.