[MDEV-20129] Equality propagation for ORDER BY items do not work with expressions Created: 2019-07-23  Updated: 2023-11-28

Status: Stalled
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.2, 10.3, 10.4, 10.5, 10.6, 10.9, 10.10, 10.11, 11.0, 11.1, 11.2
Fix Version/s: 10.6, 10.11, 11.0, 11.1, 11.2

Type: Bug Priority: Major
Reporter: Varun Gupta (Inactive) Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-8306 Complete cost-based optimization for ... Stalled

 Description   

Test data

create table ten(a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
 
create table one_k(a int);
insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
 
create table t1(a int, b int);
insert into t1 select A.a + B.a* 10, A.a + B.a* 10 from ten A, ten B;
create table t2(a int, b int, key(a));
insert into t2 select A.a + B.a* 10, A.a+B.a*10 from ten A, ten B;
create table t3(a int, b int);
insert into t3 select A.a + B.a* 10 + C.a * 100, A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C

MariaDB [test]>  explain select * from t1,t2 where t1.a > 95  and t1.a=t2.a order by t2.a;
+------+-------------+-------+------+---------------+------+---------+-----------+------+-----------------------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref       | rows | Extra                       |
+------+-------------+-------+------+---------------+------+---------+-----------+------+-----------------------------+
|    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL      | 100  | Using where; Using filesort |
|    1 | SIMPLE      | t2    | ref  | a             | a    | 5       | test.t1.a | 1    |                             |
+------+-------------+-------+------+---------------+------+---------+-----------+------+-----------------------------+
2 rows in set (0.008 sec)
 
MariaDB [test]>  explain select * from t1,t2 where t1.a > 95  and t1.a=t2.a order by t2.a+1;
+------+-------------+-------+------+---------------+------+---------+-----------+------+----------------------------------------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref       | rows | Extra                                        |
+------+-------------+-------+------+---------------+------+---------+-----------+------+----------------------------------------------+
|    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL      | 100  | Using where; Using temporary; Using filesort |
|    1 | SIMPLE      | t2    | ref  | a             | a    | 5       | test.t1.a | 1    |                                              |
+------+-------------+-------+------+---------------+------+---------+-----------+------+----------------------------------------------+
2 rows in set (0.004 sec)

So for expressions we don't propagate equal fields and so we are not able to use the optimization where we can sort by the first table



 Comments   
Comment by Varun Gupta (Inactive) [ 2019-07-29 ]

Patch
http://lists.askmonty.org/pipermail/commits/2019-July/013914.html

Comment by Sergei Petrunia [ 2020-06-06 ]

Review notes:

The idea behind the new function equality_propagation_for_order_items doesn't look good to me.

The effect of the function is similar (but exactly the same) as propagate_equal_fields call followed by substitute_for_best_equal_field call.

Here is an alternative variant of the fix that uses those two calls: https://gist.github.com/spetrunia/f48251cc87f07cc8e4a8fa9910bbe6c7

Comment by Sergei Petrunia [ 2020-06-06 ]

Another interesting observation:

Before the patch, the optimizer only handles "ORDER BY table.col".

The code in remove_const() checks if it can be substituted with other_table.col, and whether that will allow to get rid of "Using temporary". No substitution is performed here. The substitution is done in Filesort::make_sortorder():

      /*
        It is possible that the query plan is to read table t1, while the
        sort criteria actually has "ORDER BY t2.col" and the WHERE clause has
        a multi-equality(t1.col, t2.col, ...).
        The optimizer detects such cases (grep for
        UseMultipleEqualitiesToRemoveTempTable to see where), but doesn't
        perform equality substitution in the order->item. We need to do the
        substitution here ourselves.
      */
9     table_map item_map= first->used_tables();                                                                                     
      if (join && (item_map & ~join->const_table_map) &&
          !(item_map & first_table_bit) && join->cond_equal &&
           first->get_item_equal())
      {

(item->get_item_equal() returns non-NULL only for Item_field and Item_direct_view_ref. )
Let's call the above "Late substitution".

The patch for this MDEv adds "early substitution": if the Item_field that we need to replace is not at the top level of the ORDER BY clause, it will be replaced in remove_const().

Can we switch the code to do the substitution in only one place and do it "early"?

I've tried to do that and got an assertion failure in create_tmp_table, not sure why.

Generated at Thu Feb 08 08:57:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.