Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-8989

ORDER BY optimizer ignores equality propagation

Details

    • 10.1.10, 10.2.0-4, 10.1.11, 10.1.12, 10.2.1-1, 10.2.1-3, 10.2.1-4, 10.2.1-5

    Description

      Re-filing this here from https://bugzilla.suse.com/show_bug.cgi?id=949520 :

      Consider a query:

      SELECT * FROM Super su
        JOIN SubA sa on sa.id = su.id
      ORDER BY
        sa.id desc
      LIMIT 10

      The join optimizer picks the join order of sa, su.
      Table sa has an index which allows to satisfy ORDER BY LIMIT clause:

      explain SELECT * FROM Super su JOIN SubA sa on su.id = sa.id ORDER BY sa.id desc LIMIT 10;
      +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+-------+
      | id   | select_type | table | type   | possible_keys | key     | key_len | ref                 | rows | Extra |
      +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+-------+
      |    1 | SIMPLE      | sa    | index  | PRIMARY       | PRIMARY | 4       | NULL                |   10 |       |
      |    1 | SIMPLE      | su    | eq_ref | PRIMARY       | PRIMARY | 4       | Test_Database.sa.id |    1 |       |
      +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+-------+

      Good so far.

      Now, let's try to change ORDER BY sa.id into ORDER BY su.id. The query
      has sa.id = su.id, both columns have identical data types, so there should
      be no difference.

      However, there is:

      explain SELECT * FROM Super su JOIN SubA sa on su.id = sa.id ORDER BY su.id desc LIMIT 10;
      +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+---------------------------------+
      | id   | select_type | table | type   | possible_keys | key     | key_len | ref                 | rows | Extra                           |
      +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+---------------------------------+
      |    1 | SIMPLE      | sa    | ALL    | PRIMARY       | NULL    | NULL    | NULL                | 4000 | Using temporary; Using filesort |
      |    1 | SIMPLE      | su    | eq_ref | PRIMARY       | PRIMARY | 4       | Test_Database.sa.id |    1 |                                 |
      +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+---------------------------------+

      ORDER BY optimizer no longer recognizes that index on sa.id produces the desired ordering.

      Attachments

        Issue Links

          Activity

            psergei Sergei Petrunia created issue -
            psergei Sergei Petrunia made changes -
            Field Original Value New Value
            Description Re-filing this here from https://bugzilla.suse.com/show_bug.cgi?id=949520 :

            Consider a query:
            {{noformat}}
            SELECT * FROM Super su
              JOIN SubA sa on sa.id = su.id
            ORDER BY
              sa.id desc
            LIMIT 10
            {{noformat}}

            The join optimizer picks the join order of {{sa, su}}.
            Table {{sa}} has an index which allows to satisfy ORDER BY LIMIT clause:

            {noformat}
            explain SELECT * FROM Super su JOIN SubA sa on su.id = sa.id ORDER BY sa.id desc LIMIT 10;
            +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+-------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
            +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+-------+
            | 1 | SIMPLE | sa | index | PRIMARY | PRIMARY | 4 | NULL | 10 | |
            | 1 | SIMPLE | su | eq_ref | PRIMARY | PRIMARY | 4 | Test_Database.sa.id | 1 | |
            +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+-------+
            {noformat}

            Good so far.

            Now, let's try to change {{ORDER BY sa.id}} into {{ORDER BY su.id}}. The query
            has {{sa.id = su.id}}, both columns have identical data types, so there should
            be no difference.

            However, there is:

            {noformat}
            explain SELECT * FROM Super su JOIN SubA sa on su.id = sa.id ORDER BY su.id desc LIMIT 10;
            +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+---------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
            +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+---------------------------------+
            | 1 | SIMPLE | sa | ALL | PRIMARY | NULL | NULL | NULL | 4000 | Using temporary; Using filesort |
            | 1 | SIMPLE | su | eq_ref | PRIMARY | PRIMARY | 4 | Test_Database.sa.id | 1 | |
            +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+---------------------------------+
            {noformat}

            ORDER BY optimizer no longer recognizes that index on sa.id produces the desired ordering.
            Re-filing this here from https://bugzilla.suse.com/show_bug.cgi?id=949520 :

            Consider a query:
            {noformat}
            SELECT * FROM Super su
              JOIN SubA sa on sa.id = su.id
            ORDER BY
              sa.id desc
            LIMIT 10
            {noformat}

            The join optimizer picks the join order of {{sa, su}}.
            Table {{sa}} has an index which allows to satisfy ORDER BY LIMIT clause:

            {noformat}
            explain SELECT * FROM Super su JOIN SubA sa on su.id = sa.id ORDER BY sa.id desc LIMIT 10;
            +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+-------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
            +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+-------+
            | 1 | SIMPLE | sa | index | PRIMARY | PRIMARY | 4 | NULL | 10 | |
            | 1 | SIMPLE | su | eq_ref | PRIMARY | PRIMARY | 4 | Test_Database.sa.id | 1 | |
            +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+-------+
            {noformat}

            Good so far.

            Now, let's try to change {{ORDER BY sa.id}} into {{ORDER BY su.id}}. The query
            has {{sa.id = su.id}}, both columns have identical data types, so there should
            be no difference.

            However, there is:

            {noformat}
            explain SELECT * FROM Super su JOIN SubA sa on su.id = sa.id ORDER BY su.id desc LIMIT 10;
            +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+---------------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
            +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+---------------------------------+
            | 1 | SIMPLE | sa | ALL | PRIMARY | NULL | NULL | NULL | 4000 | Using temporary; Using filesort |
            | 1 | SIMPLE | su | eq_ref | PRIMARY | PRIMARY | 4 | Test_Database.sa.id | 1 | |
            +------+-------------+-------+--------+---------------+---------+---------+---------------------+------+---------------------------------+
            {noformat}

            ORDER BY optimizer no longer recognizes that index on sa.id produces the desired ordering.
            psergei Sergei Petrunia added a comment - - edited

            The problem seems to be much easier than MDEV-4205 or MDEV-8306. The presence of sa.id=su.id equality (and indexes on both of these columns) means that either join order is good for optimizing ORDER BY ... LIMIT.

            psergei Sergei Petrunia added a comment - - edited The problem seems to be much easier than MDEV-4205 or MDEV-8306 . The presence of sa.id=su.id equality (and indexes on both of these columns) means that either join order is good for optimizing ORDER BY ... LIMIT.

            Debugging, I see that the opportunity is missed in test_if_skip_sort_order().

            The optimizer calls build_equal_items() and substitute_for_best_equal() for HAVING, WHERE, and ON expressions but not for ORDER BY.

            I've tried adding substitute_for_best_equal() call

            --- a/sql/sql_select.cc
            +++ b/sql/sql_select.cc
            @@ -1566,6 +1566,16 @@ int JOIN::optimize()
             
               /* Optimize distinct away if possible */
               {
            +    // psergey-add: Substitute for order
            +    
            +    for (ORDER *cur_ord= order; cur_ord; cur_ord=cur_ord->next)
            +    {
            +      cur_ord->item[0]= 
            +        substitute_for_best_equal_field(thd, NO_PARTICULAR_TAB,
            +                                        cur_ord->item[0], cond_equal, 
            +                                        map2table);
            +    }
            +    // :psergey-add
                 ORDER *org_order= order;
                 order=remove_const(this, order,conds,1, &simple_order);
                 if (thd->is_error())

            but this didn't work. substitute_for_best_equal relies on build_equal_items() to have been done before, e.g. it sets Item_field::item_equal which substitute_for_best_equal() relies on.

            We don't want to call build_equal_items() for ORDER BY expression - there is no point in building equality sets, etc. There needs to be a simpler function.

            psergei Sergei Petrunia added a comment - Debugging, I see that the opportunity is missed in test_if_skip_sort_order(). The optimizer calls build_equal_items() and substitute_for_best_equal() for HAVING, WHERE, and ON expressions but not for ORDER BY. I've tried adding substitute_for_best_equal() call --- a/sql/sql_select.cc +++ b/sql/sql_select.cc @@ -1566,6 +1566,16 @@ int JOIN::optimize() /* Optimize distinct away if possible */ { + // psergey-add: Substitute for order + + for (ORDER *cur_ord= order; cur_ord; cur_ord=cur_ord->next) + { + cur_ord->item[0]= + substitute_for_best_equal_field(thd, NO_PARTICULAR_TAB, + cur_ord->item[0], cond_equal, + map2table); + } + // :psergey-add ORDER *org_order= order; order=remove_const(this, order,conds,1, &simple_order); if (thd->is_error()) but this didn't work. substitute_for_best_equal relies on build_equal_items() to have been done before, e.g. it sets Item_field::item_equal which substitute_for_best_equal() relies on. We don't want to call build_equal_items() for ORDER BY expression - there is no point in building equality sets, etc. There needs to be a simpler function.

            Probably need to use Item_field::find_item_equal. (Q: does this function check that it is really ok to do the substitution? considering possible different datatypes/collations, etc?)

            Also would like to look at this: there are actually two kinds of equality substitution. The one I'm talking about here is where t1.col1 is substituted for t2.col2 in arbitrary context.

            There is also substitution where t1.col1 is substituted for t2.col2 only when used in range-style comparisons, e.g. "t1.col1 < foo" is substuted with "t2.col2 < foo" while func(t1.col1) is not substituted with func(t2.col2). I can't find it the code. WIll need to discuss with other team members.

            psergei Sergei Petrunia added a comment - Probably need to use Item_field::find_item_equal. (Q: does this function check that it is really ok to do the substitution? considering possible different datatypes/collations, etc?) Also would like to look at this: there are actually two kinds of equality substitution. The one I'm talking about here is where t1.col1 is substituted for t2.col2 in arbitrary context. There is also substitution where t1.col1 is substituted for t2.col2 only when used in range-style comparisons, e.g. "t1.col1 < foo" is substuted with "t2.col2 < foo" while func(t1.col1) is not substituted with func(t2.col2). I can't find it the code. WIll need to discuss with other team members.

            Yet another question: can substitute ORDER::item member with something else? What happens if this is a PS and we will need another execution?

            psergei Sergei Petrunia added a comment - Yet another question: can substitute ORDER::item member with something else? What happens if this is a PS and we will need another execution?
            Werkov Michal Koutny added a comment -

            Hi Sergei. What is progress with this issue? Did you discussed the suggested solutions within your team (I hope the the questions didn't fail to reach them)? Thanks, for any update.

            Werkov Michal Koutny added a comment - Hi Sergei. What is progress with this issue? Did you discussed the suggested solutions within your team (I hope the the questions didn't fail to reach them)? Thanks, for any update.
            serg Sergei Golubchik made changes -
            Priority Major [ 3 ] Critical [ 2 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 10.1 [ 16100 ]
            serg Sergei Golubchik made changes -
            Sprint 10.1.10 [ 24 ]

            Investigation notes:

            field substitution in ORDER BY field should be done before
            the remove_const() call in JOIN::optimize().
            If we don't do it by then, remove_const() will join->simple_order=0
            which will cause temp. table to be used.

            We should use multi-equalities from WHERE's top level. They are in
            JOIN::cond_equal->current_level.

            I cannot yet understand what is the right way to do substitutions.
            struct ORDER has:

              Item	 **item;			/* Point at item in select fields */
              Item	 *item_ptr;			/* Storage for initial item */

            initially

            item[0]=item_ptr = {pointer to item}

            , but tthen setup_order()/find_item_order_in_list() changes order->item to point into ref_pointer_array.

            If the ORDER BY item is not present in ref_pointer_array, it adds it there, and also adds it into all_fields:

              all_fields.push_front(order_item, thd->mem_root);
              ref_pointer_array[el]= order_item;

            psergei Sergei Petrunia added a comment - Investigation notes: field substitution in ORDER BY field should be done before the remove_const() call in JOIN::optimize(). If we don't do it by then, remove_const() will join->simple_order=0 which will cause temp. table to be used. We should use multi-equalities from WHERE's top level. They are in JOIN::cond_equal->current_level. I cannot yet understand what is the right way to do substitutions. struct ORDER has: Item **item; /* Point at item in select fields */ Item *item_ptr; /* Storage for initial item */ initially item[0]=item_ptr = {pointer to item} , but tthen setup_order()/find_item_order_in_list() changes order->item to point into ref_pointer_array . If the ORDER BY item is not present in ref_pointer_array , it adds it there, and also adds it into all_fields : all_fields.push_front(order_item, thd->mem_root); ref_pointer_array[el]= order_item;

            Notes from discussion with sanja:

            ORDER BY elements are put into ref_pointer_array because the optimizer
            might need to put them into a temporary table (when {{Using temporary; Using
            filesort}} plan is used).

            It should be ok to replace the element in ref_pointer_array and
            all_fields if it was added there by setup_order/find_order_in_list().

            We can tell whether this happened by checking ORDER::in_field_list.

            If we are about to replace t1.colX with t2.colY in the ORDER BY clause,
            and

            • t1.colX was originally present in ref_pointer_array/all_fields
            • t2.colY is not present in ref_pointer_array/all_fields

            then we should add t2.colY into these structures, just like
            setup_order/find_order_in_list does.

            This is safe to do. There is an upper bound on the number of elements in
            ref_pointer_array, but that bound already includes what we're doing.

            For prepared statements, ORDER BY list is restored on every execution (check
            out this code in reinit_stmt():

                  for (order= sl->group_list.first; order; order= order->next)
                    order->item= &order->item_ptr;

            psergei Sergei Petrunia added a comment - Notes from discussion with sanja : ORDER BY elements are put into ref_pointer_array because the optimizer might need to put them into a temporary table (when {{Using temporary; Using filesort}} plan is used). It should be ok to replace the element in ref_pointer_array and all_fields if it was added there by setup_order/find_order_in_list(). We can tell whether this happened by checking ORDER::in_field_list. If we are about to replace t1.colX with t2.colY in the ORDER BY clause, and t1.colX was originally present in ref_pointer_array / all_fields t2.colY is not present in ref_pointer_array / all_fields then we should add t2.colY into these structures, just like setup_order/find_order_in_list does. This is safe to do. There is an upper bound on the number of elements in ref_pointer_array , but that bound already includes what we're doing. For prepared statements, ORDER BY list is restored on every execution (check out this code in reinit_stmt(): for (order= sl->group_list.first; order; order= order->next) order->item= &order->item_ptr;
            psergei Sergei Petrunia made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            serg Sergei Golubchik made changes -
            Rank Ranked higher
            Werkov Michal Koutny added a comment -

            Hi, I'm just wondering how close/far is this to getting solved? It seemed rather promising to me previously. Thanks.

            Werkov Michal Koutny added a comment - Hi, I'm just wondering how close/far is this to getting solved? It seemed rather promising to me previously. Thanks.
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.10 [ 24 ] 10.1.10, 10.2.0-4 [ 24, 29 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked higher

            Werkov, I now have a "candidate" patch. It needs to be tested and it needs to pass a code review. I'll be posting updates here.

            psergei Sergei Petrunia added a comment - Werkov , I now have a "candidate" patch. It needs to be tested and it needs to pass a code review. I'll be posting updates here.

            When implementing the fix, I've hit test failures like this:

            main.subselect                           w1 [ fail ]
                    Test ended at 2016-01-14 11:57:38
             
            CURRENT_TEST: main.subselect
            --- /home/psergey/dev-git/10.1-dbg3/mysql-test/r/subselect.result
            +++ /home/psergey/dev-git/10.1-dbg3/mysql-test/r/subselect.reject
            @@ -4220,9 +4220,6 @@
             WHERE st IN ('GA','FL') AND EXISTS(SELECT 1 FROM t2 WHERE t2.id=t1.id)
             GROUP BY id;
             id     st
            -1      GA
            -3      FL
            -7      FL
             SELECT id, st FROM t1
             WHERE st IN ('GA','FL') AND NOT EXISTS(SELECT 1 FROM t2 WHERE t2.id=t1.id);
             id     st
             
            mysqltest: Result length mismatch

            Investigation shows that ref access uses an incorrect key when constructing key lookup value:

            (gdb) wher
              #0  field_conv (to=0x7fff58087d20, from=0x7fff5808c6f8) at /home/psergey/dev-git/10.1-dbg3/sql/field_conv.cc:873
              #1  0x0000555555ca360d in save_field_in_field (from=0x7fff5808c6f8, null_value=0x7fff5800722a, to=0x7fff58087d20, no_conversions=true) at /home/psergey/dev-git/10.1-dbg3/sql/item.cc:5766
              #2  0x0000555555ca387d in Item_field::save_in_field (this=0x7fff580071b8, to=0x7fff58087d20, no_conversions=true) at /home/psergey/dev-git/10.1-dbg3/sql/item.cc:5822
              #3  0x0000555555acac44 in store_key_item::copy_inner (this=0x7fff58087ce8) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.h:1757
              #4  0x0000555555aca833 in store_key::copy (this=0x7fff58087ce8) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.h:1647
              #5  0x0000555555abafa7 in cp_buffer_from_ref (thd=0x55555aacfd80, table=0x7fff58083278, ref=0x7fff58087170) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:21938
              #6  0x0000555555abaee0 in cmp_buffer_with_ref (thd=0x55555aacfd80, table=0x7fff58083278, tab_ref=0x7fff58087170) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:21920
              #7  0x0000555555ab3ca9 in join_read_key2 (thd=0x55555aacfd80, tab=0x7fff58086f48, table=0x7fff58083278, table_ref=0x7fff58087170) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:19057
              #8  0x0000555555ab3c0a in join_read_key (tab=0x7fff58086f48) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:19035
              #9  0x0000555555ab25d9 in sub_select (join=0x7fff58007910, join_tab=0x7fff58086f48, end_of_records=false) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:18404
              #10 0x0000555555ab2d46 in evaluate_join_record (join=0x7fff58007910, join_tab=0x7fff58086c00, error=0) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:18632
              #11 0x0000555555ab263b in sub_select (join=0x7fff58007910, join_tab=0x7fff58086c00, end_of_records=false) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:18407
              #12 0x0000555555ab1e9b in do_select (join=0x7fff58007910, fields=0x0, table=0x7fff5808b238, procedure=0x0) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:18062
              #13 0x0000555555a8c37a in JOIN::exec_inner (this=0x7fff58007910) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:2898
              #14 0x0000555555a8b3ed in JOIN::exec (this=0x7fff58007910) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:2644
              #15 0x0000555555a8e991 in mysql_select (thd=0x55555aacfd80, rref_pointer_array=0x55555aad40f0, tables=0x7fff580055e8, wild_num=0, fields=..., conds=0x7fff580075a0, og_num=1, order=0x0, group=0x7fff580077c0, having=0x0, proc_param=0x0, select_options=2147748608, result=0x7fff580078f0, unit=0x55555aad3788, select_lex=0x55555aad3e78) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:3582
              #16 0x0000555555a84071 in handle_select (thd=0x55555aacfd80, lex=0x55555aad36c0, result=0x7fff580078f0, setup_tables_done_option=0) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:384
              #17 0x0000555555a54e94 in execute_sqlcom_select (thd=0x55555aacfd80, all_tables=0x7fff580055e8) at /home/psergey/dev-git/10.1-dbg3/sql/sql_parse.cc:5902
              #18 0x0000555555a4ad54 in mysql_execute_command (thd=0x55555aacfd80) at /home/psergey/dev-git/10.1-dbg3/sql/sql_parse.cc:2961
              #19 0x0000555555a58513 in mysql_parse (thd=0x55555aacfd80, rawbuf=0x7fff58005258 "SELECT id, st FROM t1 WHERE st IN ('GA','FL') AND EXISTS(SELECT 1 FROM t2 WHERE t2.id=t1.id) GROUP BY id", length=104, parser_state=0x7ffff4301100) at /home/psergey/dev-git/10.1-dbg3/sql/sql_parse.cc:7302

            store_key_item::copy_inner calls

                  res= item->save_in_field(to_field, 1);

            which does:

              return save_field_in_field(result_field, &null_value, to, no_conversions);

            That is it copies data from Item_field::result_field. Normally, Item_field objects have result_field==field, and it doesn't matter.

            However, my patch puts the Item_field object into GROUP BY clause. create_tmp_table() sets item_field->result_field to point into the GROUP-BY temporary table.

            As a result, we get key values from the temporary table. This is wrong, because we're still at doing-the-join phase of the query and should get the data from Item_field->field.

            psergei Sergei Petrunia added a comment - When implementing the fix, I've hit test failures like this: main.subselect w1 [ fail ] Test ended at 2016-01-14 11:57:38   CURRENT_TEST: main.subselect --- /home/psergey/dev-git/10.1-dbg3/mysql-test/r/subselect.result +++ /home/psergey/dev-git/10.1-dbg3/mysql-test/r/subselect.reject @@ -4220,9 +4220,6 @@ WHERE st IN ('GA','FL') AND EXISTS(SELECT 1 FROM t2 WHERE t2.id=t1.id) GROUP BY id; id st -1 GA -3 FL -7 FL SELECT id, st FROM t1 WHERE st IN ('GA','FL') AND NOT EXISTS(SELECT 1 FROM t2 WHERE t2.id=t1.id); id st   mysqltest: Result length mismatch Investigation shows that ref access uses an incorrect key when constructing key lookup value: (gdb) wher #0 field_conv (to=0x7fff58087d20, from=0x7fff5808c6f8) at /home/psergey/dev-git/10.1-dbg3/sql/field_conv.cc:873 #1 0x0000555555ca360d in save_field_in_field (from=0x7fff5808c6f8, null_value=0x7fff5800722a, to=0x7fff58087d20, no_conversions=true) at /home/psergey/dev-git/10.1-dbg3/sql/item.cc:5766 #2 0x0000555555ca387d in Item_field::save_in_field (this=0x7fff580071b8, to=0x7fff58087d20, no_conversions=true) at /home/psergey/dev-git/10.1-dbg3/sql/item.cc:5822 #3 0x0000555555acac44 in store_key_item::copy_inner (this=0x7fff58087ce8) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.h:1757 #4 0x0000555555aca833 in store_key::copy (this=0x7fff58087ce8) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.h:1647 #5 0x0000555555abafa7 in cp_buffer_from_ref (thd=0x55555aacfd80, table=0x7fff58083278, ref=0x7fff58087170) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:21938 #6 0x0000555555abaee0 in cmp_buffer_with_ref (thd=0x55555aacfd80, table=0x7fff58083278, tab_ref=0x7fff58087170) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:21920 #7 0x0000555555ab3ca9 in join_read_key2 (thd=0x55555aacfd80, tab=0x7fff58086f48, table=0x7fff58083278, table_ref=0x7fff58087170) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:19057 #8 0x0000555555ab3c0a in join_read_key (tab=0x7fff58086f48) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:19035 #9 0x0000555555ab25d9 in sub_select (join=0x7fff58007910, join_tab=0x7fff58086f48, end_of_records=false) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:18404 #10 0x0000555555ab2d46 in evaluate_join_record (join=0x7fff58007910, join_tab=0x7fff58086c00, error=0) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:18632 #11 0x0000555555ab263b in sub_select (join=0x7fff58007910, join_tab=0x7fff58086c00, end_of_records=false) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:18407 #12 0x0000555555ab1e9b in do_select (join=0x7fff58007910, fields=0x0, table=0x7fff5808b238, procedure=0x0) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:18062 #13 0x0000555555a8c37a in JOIN::exec_inner (this=0x7fff58007910) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:2898 #14 0x0000555555a8b3ed in JOIN::exec (this=0x7fff58007910) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:2644 #15 0x0000555555a8e991 in mysql_select (thd=0x55555aacfd80, rref_pointer_array=0x55555aad40f0, tables=0x7fff580055e8, wild_num=0, fields=..., conds=0x7fff580075a0, og_num=1, order=0x0, group=0x7fff580077c0, having=0x0, proc_param=0x0, select_options=2147748608, result=0x7fff580078f0, unit=0x55555aad3788, select_lex=0x55555aad3e78) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:3582 #16 0x0000555555a84071 in handle_select (thd=0x55555aacfd80, lex=0x55555aad36c0, result=0x7fff580078f0, setup_tables_done_option=0) at /home/psergey/dev-git/10.1-dbg3/sql/sql_select.cc:384 #17 0x0000555555a54e94 in execute_sqlcom_select (thd=0x55555aacfd80, all_tables=0x7fff580055e8) at /home/psergey/dev-git/10.1-dbg3/sql/sql_parse.cc:5902 #18 0x0000555555a4ad54 in mysql_execute_command (thd=0x55555aacfd80) at /home/psergey/dev-git/10.1-dbg3/sql/sql_parse.cc:2961 #19 0x0000555555a58513 in mysql_parse (thd=0x55555aacfd80, rawbuf=0x7fff58005258 "SELECT id, st FROM t1 WHERE st IN ('GA','FL') AND EXISTS(SELECT 1 FROM t2 WHERE t2.id=t1.id) GROUP BY id", length=104, parser_state=0x7ffff4301100) at /home/psergey/dev-git/10.1-dbg3/sql/sql_parse.cc:7302 store_key_item::copy_inner calls res= item->save_in_field(to_field, 1); which does: return save_field_in_field(result_field, &null_value, to, no_conversions); That is it copies data from Item_field::result_field. Normally, Item_field objects have result_field==field, and it doesn't matter. However, my patch puts the Item_field object into GROUP BY clause. create_tmp_table() sets item_field->result_field to point into the GROUP-BY temporary table. As a result, we get key values from the temporary table. This is wrong, because we're still at doing-the-join phase of the query and should get the data from Item_field->field.

            Fix for the above problem: https://gist.github.com/spetrunia/3edfdd131c103f538030 . sanja, this is what we've discussed on the optimizer call. Can you review it please?

            psergei Sergei Petrunia added a comment - Fix for the above problem: https://gist.github.com/spetrunia/3edfdd131c103f538030 . sanja , this is what we've discussed on the optimizer call. Can you review it please?

            The first version of the patch: http://lists.askmonty.org/pipermail/commits/2016-January/008826.html . igor questioned its approach on the optimizer call, will need to catch with him to discuss it.

            psergei Sergei Petrunia added a comment - The first version of the patch: http://lists.askmonty.org/pipermail/commits/2016-January/008826.html . igor questioned its approach on the optimizer call, will need to catch with him to discuss it.
            sanja Oleksandr Byelkin made changes -
            Assignee Sergei Petrunia [ psergey ] Oleksandr Byelkin [ sanja ]
            sanja Oleksandr Byelkin made changes -
            Status In Progress [ 3 ] In Review [ 10002 ]

            OK to push.

            sanja Oleksandr Byelkin added a comment - OK to push.
            sanja Oleksandr Byelkin made changes -
            Assignee Oleksandr Byelkin [ sanja ] Sergei Petrunia [ psergey ]
            Status In Review [ 10002 ] Stalled [ 10000 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked higher
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.10, 10.2.0-4 [ 24, 29 ] 10.1.10, 10.2.0-4, 10.1.11 [ 24, 29, 30 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked lower
            psergei Sergei Petrunia added a comment - - edited

            Following suggestion by igor, started to develop a fix that makes use of multiple equalities without doing the substitution: https://gist.github.com/spetrunia/8dd6f05759beb06911f1 . The fix passes the bug example. I suspect, it doesnt pass in the cases where Using temporary; Using filesort was changed into Using filesort. Need to discuss this with Igor again.

            psergei Sergei Petrunia added a comment - - edited Following suggestion by igor , started to develop a fix that makes use of multiple equalities without doing the substitution: https://gist.github.com/spetrunia/8dd6f05759beb06911f1 . The fix passes the bug example. I suspect, it doesnt pass in the cases where Using temporary; Using filesort was changed into Using filesort . Need to discuss this with Igor again.

            Discussion results

            • We need to do substitution in the ORDER clause when Using temporary; Using filesort was changed into Using filesort. It can be done in create_sort_index, right before the filesort call.
            • The posted patch does searches through multiple equalities. A better way to do it is to walk through the ORDER/GROUP BY clause once and set Item_field::item_equal. Then, we can use that field.
            psergei Sergei Petrunia added a comment - Discussion results We need to do substitution in the ORDER clause when Using temporary; Using filesort was changed into Using filesort . It can be done in create_sort_index , right before the filesort call. The posted patch does searches through multiple equalities. A better way to do it is to walk through the ORDER/GROUP BY clause once and set Item_field::item_equal. Then, we can use that field.
            Werkov Michal Koutny added a comment -

            Hello. Will the patch from this comment be eventually merged wrt to the following comment? Or is it still waiting for some refactoring? Thanks.

            Werkov Michal Koutny added a comment - Hello. Will the patch from this comment be eventually merged wrt to the following comment ? Or is it still waiting for some refactoring? Thanks.
            nirbhay_c Nirbhay Choubey (Inactive) made changes -
            Labels order-by-optimization SUSE order-by-optimization
            psergei Sergei Petrunia made changes -
            Sprint 10.1.10, 10.2.0-4, 10.1.11 [ 24, 29, 30 ] 10.1.10, 10.2.0-4, 10.1.11, 10.1.12 [ 24, 29, 30, 36 ]

            http://lists.askmonty.org/pipermail/commits/2016-February/009026.html . This is the second variant of the fix. igor, can you please review?

            psergei Sergei Petrunia added a comment - http://lists.askmonty.org/pipermail/commits/2016-February/009026.html . This is the second variant of the fix. igor , can you please review?

            Werkov, Sanja' s approval was for the patch https://gist.github.com/spetrunia/3edfdd131c103f538030. It is just a pre-requisite
            for the first variant of the fix. The first fix is usable, but we don't like the code structure.

            Now, I've committed the second variant. It will need to pass the review, first.

            psergei Sergei Petrunia added a comment - Werkov , Sanja' s approval was for the patch https://gist.github.com/spetrunia/3edfdd131c103f538030 . It is just a pre-requisite for the first variant of the fix. The first fix is usable, but we don't like the code structure. Now, I've committed the second variant. It will need to pass the review, first.
            Werkov Michal Koutny added a comment -

            Sergei, thanks for explanation. Do you have any feedback on the second variant?

            Werkov Michal Koutny added a comment - Sergei, thanks for explanation. Do you have any feedback on the second variant?
            Werkov Michal Koutny added a comment -

            I see the bug is still Stalled and it's unclear what is it blocked on. Originally it was estimated for 10.1.10 and 10.1.12 is out now Could it please get some more attention? Thanks.

            Werkov Michal Koutny added a comment - I see the bug is still Stalled and it's unclear what is it blocked on. Originally it was estimated for 10.1.10 and 10.1.12 is out now Could it please get some more attention? Thanks.
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.10, 10.2.0-4, 10.1.11, 10.1.12 [ 24, 29, 30, 36 ] 10.1.10, 10.2.0-4, 10.1.11, 10.1.12, 10.1.14 [ 24, 29, 30, 36, 51 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked lower
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.10, 10.2.0-4, 10.1.11, 10.1.12, 10.1.14 [ 24, 29, 30, 36, 51 ] 10.1.10, 10.2.0-4, 10.1.11, 10.1.12, 10.2.1-1 [ 24, 29, 30, 36, 56 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked higher
            Werkov Michal Koutny added a comment -

            Hi, in e-mail from Sergei I got on 2016-04-11, today was mentioned as the latest day for this to resolve. However, I can only see the bug got assigned to another sprint. Is there any progress not visible here? What is the next step? Can this please get more attention? Thanks.

            Werkov Michal Koutny added a comment - Hi, in e-mail from Sergei I got on 2016-04-11, today was mentioned as the latest day for this to resolve. However, I can only see the bug got assigned to another sprint. Is there any progress not visible here? What is the next step? Can this please get more attention? Thanks.

            Hi Michal,
            yes, there is progress: the third variant of the patch is committed: http://lists.askmonty.org/pipermail/commits/2016-May/009353.html. It will need to be reviewed, now.

            psergei Sergei Petrunia added a comment - Hi Michal, yes, there is progress: the third variant of the patch is committed: http://lists.askmonty.org/pipermail/commits/2016-May/009353.html . It will need to be reviewed, now.
            Werkov Michal Koutny added a comment -

            Great, thanks for update. Do you have time estimate, will it fit into the current sprint (10.2.1-1)?

            Werkov Michal Koutny added a comment - Great, thanks for update. Do you have time estimate, will it fit into the current sprint (10.2.1-1)?
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.10, 10.2.0-4, 10.1.11, 10.1.12, 10.2.1-1 [ 24, 29, 30, 36, 56 ] 10.1.10, 10.2.0-4, 10.1.11, 10.1.12, 10.2.1-1, 10.2.1-2 [ 24, 29, 30, 36, 56, 63 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.10, 10.2.0-4, 10.1.11, 10.1.12, 10.2.1-1, 10.2.1-2 [ 24, 29, 30, 36, 56, 63 ] 10.1.10, 10.2.0-4, 10.1.11, 10.1.12, 10.2.1-1 [ 24, 29, 30, 36, 56 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked higher
            Werkov Michal Koutny added a comment -

            Oh, me again. What's the state of this bug? AFAICS it's not included in the current sprint, what are plans then?

            Werkov Michal Koutny added a comment - Oh, me again. What's the state of this bug? AFAICS it's not included in the current sprint, what are plans then?
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.10, 10.2.0-4, 10.1.11, 10.1.12, 10.2.1-1 [ 24, 29, 30, 36, 56 ] 10.1.10, 10.2.0-4, 10.1.11, 10.1.12, 10.2.1-1, 10.2.1-3 [ 24, 29, 30, 36, 56, 65 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked lower

            Hi Michal.

            The bug was included in the sprint (and was in the works regardless of that).

            psergei Sergei Petrunia added a comment - Hi Michal. The bug was included in the sprint (and was in the works regardless of that).

            Status update:

            During the review igor has mentioned on the phone call that propagate_equal_fields call should be used. He has mentioned that this will resolve a possible issue with VIEWs.

            I'm now trying to make sense of this comment.

            psergei Sergei Petrunia added a comment - Status update: During the review igor has mentioned on the phone call that propagate_equal_fields call should be used. He has mentioned that this will resolve a possible issue with VIEWs. I'm now trying to make sense of this comment.

            The mentioned issue is described in a comment to this commit:

            https://github.com/MariaDB/server/commit/8d9dd21d85e257051b45b2f779dcd9bf696bf9e1

            When a view column is referenced in the query an Item_direct_view_ref
            object is created that is refers to the Item_field for the column.
            All references to the same view column refer to the same Item_field.
             
            Different references can belong to different AND/OR levels and,
            as a result, can be included in different Item_equal object.
            These Item_equal objects may include different constant objects.
            ...
            

            psergei Sergei Petrunia added a comment - The mentioned issue is described in a comment to this commit: https://github.com/MariaDB/server/commit/8d9dd21d85e257051b45b2f779dcd9bf696bf9e1 When a view column is referenced in the query an Item_direct_view_ref object is created that is refers to the Item_field for the column. All references to the same view column refer to the same Item_field.   Different references can belong to different AND/OR levels and, as a result, can be included in different Item_equal object. These Item_equal objects may include different constant objects. ...
            Werkov Michal Koutny added a comment -

            Thanks for update. Hopefully use of propagate_equal_fields will make the patch eventually acceptable.

            Werkov Michal Koutny added a comment - Thanks for update. Hopefully use of propagate_equal_fields will make the patch eventually acceptable.
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.10, 10.2.0-4, 10.1.11, 10.1.12, 10.2.1-1, 10.2.1-3 [ 24, 29, 30, 36, 56, 65 ] 10.1.10, 10.2.0-4, 10.1.11, 10.1.12, 10.2.1-1, 10.2.1-3, 10.2.1-4 [ 24, 29, 30, 36, 56, 65, 66 ]

            Status update:

            • The updated patch received tentative review approval
            • Requested a testing pass from elenst.
            psergei Sergei Petrunia added a comment - Status update: The updated patch received tentative review approval Requested a testing pass from elenst .

            Testing is in progress. So far no failures revealed, but it's not finished yet.

            elenst Elena Stepanova added a comment - Testing is in progress. So far no failures revealed, but it's not finished yet.

            Update:

            • Got a reply from Igor (the reviewer) with ok to push the patch
            • The patch will be protected by an @@optimizer_switch flag
              = it will be pushed into 10.1, with being off by default
              = and into 10.2, with being ON by default.
            psergei Sergei Petrunia added a comment - Update: Got a reply from Igor (the reviewer) with ok to push the patch The patch will be protected by an @@optimizer_switch flag = it will be pushed into 10.1, with being off by default = and into 10.2, with being ON by default.
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.10, 10.2.0-4, 10.1.11, 10.1.12, 10.2.1-1, 10.2.1-3, 10.2.1-4 [ 24, 29, 30, 36, 56, 65, 66 ] 10.1.10, 10.2.0-4, 10.1.11, 10.1.12, 10.2.1-1, 10.2.1-3, 10.2.1-4, 10.2.1-5 [ 24, 29, 30, 36, 56, 65, 66, 68 ]
            psergei Sergei Petrunia added a comment - - edited

            The patch was pushed into MariaDB-10.1.
            The fix is disabled by default in MariaDB, in order to set it one must use:

            set optimizer_switch='orderby_uses_equalities=on'
            

            The fix will be enabled by default in MariaDB 10.2

            psergei Sergei Petrunia added a comment - - edited The patch was pushed into MariaDB-10.1. The fix is disabled by default in MariaDB, in order to set it one must use: set optimizer_switch='orderby_uses_equalities=on' The fix will be enabled by default in MariaDB 10.2
            psergei Sergei Petrunia made changes -

            We will be able to enable the fix by default in 10.2 after it has been merged to 10.2. Filed MDEV-10174 to track that task.

            psergei Sergei Petrunia added a comment - We will be able to enable the fix by default in 10.2 after it has been merged to 10.2. Filed MDEV-10174 to track that task.
            psergei Sergei Petrunia made changes -
            Fix Version/s 10.1.15 [ 22018 ]
            Fix Version/s 10.1 [ 16100 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 72234 ] MariaDB v4 [ 149741 ]

            People

              psergei Sergei Petrunia
              psergei Sergei Petrunia
              Votes:
              1 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.