Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-8320

Allow index usage for DATE(datetime_column) = const

Details

    • 10.2.2-2, 10.2.2-3, 10.2.2-1, 10.2.2-2, 10.2.2-3, 10.2.2-4, 10.2.3-1, 10.2.3-2

    Description

      Would it be possible to make DATE() on datetime column sargable in some cases?
      Like rewrite "DATE(col) = const" to "col BETWEEN concat(const, ' 00:00:00') AND concat(const, ' 23:59:59')"

      Other similar cases:

      • "YEAR(col) = const" is almost the same
      • "YEAR(col) = c1 AND MONTH(col) = c2" and other such combinations may be too complex to be worth it.

      Attachments

        Issue Links

          Activity

            danblack Daniel Black added a comment -

            any hints where to start looking in the code to implement this appreciated.

            danblack Daniel Black added a comment - any hints where to start looking in the code to implement this appreciated.

            I'd look first at add_key_fields(). But perhaps I would've been wrong

            serg Sergei Golubchik added a comment - I'd look first at add_key_fields() . But perhaps I would've been wrong

            add_key_fields() is an auxiliary function that used for constructing ref access plans (here, ref includes ref_or_null, or eq_ref). ref access plans are based on equalities (or other conditions that can be reduced to equalities), such that the records can be read with handler->index_read()/handler->index_next_same() calls.

            YEAR(date_col) = const specifies a range. So, the right way would be to use range optimizer to handle this.

            psergei Sergei Petrunia added a comment - add_key_fields() is an auxiliary function that used for constructing ref access plans (here, ref includes ref_or_null , or eq_ref ). ref access plans are based on equalities (or other conditions that can be reduced to equalities), such that the records can be read with handler->index_read()/handler->index_next_same() calls. YEAR(date_col) = const specifies a range. So, the right way would be to use range optimizer to handle this.
            psergei Sergei Petrunia added a comment - - edited

            As for using range optimizer... I'll debug a similar query and point places where changes are be made.

            Let the query be:

            explain select * from t100 where a between '2015-10-10' and '2015-10-12';
            +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
            | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                 |
            +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
            |    1 | SIMPLE      | t100  | range | a             | a    | 4       | NULL |    3 | Using index condition |
            +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+

            The first place is:

              #0  add_key_field (join=0x7fff5c0065a0, key_fields=0x7ffff7eb5b18, and_level=0, cond=0x7fff5c005cf8, field=0x7fff5c010a90, eq_func=false, value=0x7fff5c005e48, num_values=2, usable_tables=18446744073709551615, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4482
              #1  0x0000555555ab03be in add_key_equal_fields (join=0x7fff5c0065a0, key_fields=0x7ffff7eb5b18, and_level=0, cond=0x7fff5c005cf8, field_item=0x7fff5c005aa8, eq_func=false, val=0x7fff5c005e48, num_values=2, usable_tables=18446744073709551615, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4568
              #2  0x0000555555ab0a58 in Item_func_between::add_key_fields (this=0x7fff5c005cf8, join=0x7fff5c0065a0, key_fields=0x7ffff7eb5b18, and_level=0x7ffff7eb5b50, usable_tables=18446744073709551615, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4734
              #3  0x0000555555ab2524 in update_ref_and_keys (thd=0x55555aad3700, keyuse=0x7fff5c0068a0, join_tab=0x7fff5c006e18, tables=1, cond=0x7fff5c005cf8, normal_tables=18446744073709551615, select_lex=0x55555aad77b0, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:5302
              #4  0x0000555555aad6a1 in make_join_statistics (join=0x7fff5c0065a0, tables_list=..., keyuse_array=0x7fff5c0068a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:3615
              #5  0x0000555555aa5836 in JOIN::optimize_inner (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1341
              #6  0x0000555555aa478c in JOIN::optimize (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1023

            This is the preparation stage. We collect in TABLE::const_keys a set of indexes for which range optimization is probably useful. Note the frame with Item_func_between::add_key_fields, I guess Item_func_eq will need a similar function that will check if it is comparing a YEAR(key_col) with a constant

            psergei Sergei Petrunia added a comment - - edited As for using range optimizer... I'll debug a similar query and point places where changes are be made. Let the query be: explain select * from t100 where a between '2015-10-10' and '2015-10-12'; +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ | 1 | SIMPLE | t100 | range | a | a | 4 | NULL | 3 | Using index condition | +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ The first place is: #0 add_key_field (join=0x7fff5c0065a0, key_fields=0x7ffff7eb5b18, and_level=0, cond=0x7fff5c005cf8, field=0x7fff5c010a90, eq_func=false, value=0x7fff5c005e48, num_values=2, usable_tables=18446744073709551615, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4482 #1 0x0000555555ab03be in add_key_equal_fields (join=0x7fff5c0065a0, key_fields=0x7ffff7eb5b18, and_level=0, cond=0x7fff5c005cf8, field_item=0x7fff5c005aa8, eq_func=false, val=0x7fff5c005e48, num_values=2, usable_tables=18446744073709551615, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4568 #2 0x0000555555ab0a58 in Item_func_between::add_key_fields (this=0x7fff5c005cf8, join=0x7fff5c0065a0, key_fields=0x7ffff7eb5b18, and_level=0x7ffff7eb5b50, usable_tables=18446744073709551615, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4734 #3 0x0000555555ab2524 in update_ref_and_keys (thd=0x55555aad3700, keyuse=0x7fff5c0068a0, join_tab=0x7fff5c006e18, tables=1, cond=0x7fff5c005cf8, normal_tables=18446744073709551615, select_lex=0x55555aad77b0, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:5302 #4 0x0000555555aad6a1 in make_join_statistics (join=0x7fff5c0065a0, tables_list=..., keyuse_array=0x7fff5c0068a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:3615 #5 0x0000555555aa5836 in JOIN::optimize_inner (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1341 #6 0x0000555555aa478c in JOIN::optimize (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1023 This is the preparation stage. We collect in TABLE::const_keys a set of indexes for which range optimization is probably useful. Note the frame with Item_func_between::add_key_fields , I guess Item_func_eq will need a similar function that will check if it is comparing a YEAR(key_col) with a constant

            Construction of range objects happens here:

              #0  SEL_ARG::SEL_ARG (this=0x7fff5c014a60, f=0x7fff5c010a90, min_value_arg=0x7fff5c014a58 "", max_value_arg=0x7fff5c014a58 "") at /home/psergey/dev-git/10.1/sql/opt_range.cc:2479
              #1  0x0000555555988784 in get_mm_leaf (param=0x7ffff7eb3410, conf_func=0x7fff5c005cf8, field=0x7fff5c010a90, key_part=0x7fff5c0147b8, type=Item_func::GE_FUNC, value=0x7fff5c005bb8) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8598
              #2  0x00005555559878c9 in get_mm_parts (param=0x7ffff7eb3410, cond_func=0x7fff5c005cf8, field=0x7fff5c010a90, type=Item_func::GE_FUNC, value=0x7fff5c005bb8, cmp_type=TIME_RESULT) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8280
              #3  0x0000555555985b0b in get_func_mm_tree (param=0x7ffff7eb3410, cond_func=0x7fff5c005cf8, field=0x7fff5c010a90, value=0x0, cmp_type=TIME_RESULT, inv=false) at /home/psergey/dev-git/10.1/sql/opt_range.cc:7604
              #4  0x000055555598652d in get_full_func_mm_tree (param=0x7ffff7eb3410, cond_func=0x7fff5c005cf8, field_item=0x7fff5c005aa8, value=0x0, inv=false) at /home/psergey/dev-git/10.1/sql/opt_range.cc:7931
              #5  0x0000555555986de0 in Item_func_between::get_mm_tree (this=0x7fff5c005cf8, param=0x7ffff7eb3410, cond_ptr=0x7fff5c007eb0) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8117
              #6  0x000055555597c083 in SQL_SELECT::test_quick_select (this=0x7fff5c007ea8, thd=0x55555aad3700, keys_to_use=..., prev_tables=0, limit=18446744073709551615, force_quick_range=false, ordered_output=false, remove_false_parts_of_where=true) at /home/psergey/dev-git/10.1/sql/opt_range.cc:3133
              #7  0x0000555555aac9a4 in get_quick_record_count (thd=0x55555aad3700, select=0x7fff5c007ea8, table=0x7fff5c00f640, keys=0x7fff5c006f98, limit=18446744073709551615, in_outer_join=false) at /home/psergey/dev-git/10.1/sql/sql_select.cc:3362
              #8  0x0000555555aaed53 in make_join_statistics (join=0x7fff5c0065a0, tables_list=..., keyuse_array=0x7fff5c0068a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4004
              #9  0x0000555555aa5836 in JOIN::optimize_inner (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1341
              #10 0x0000555555aa478c in JOIN::optimize (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1023

            Once SEL_ARG objects are there, the rest of things (getting records_in_range() estimates, handling of AND/OR, multi-component keys) will happen automatically. The tricks is to create the right SEL_ARGs.

            psergei Sergei Petrunia added a comment - Construction of range objects happens here: #0 SEL_ARG::SEL_ARG (this=0x7fff5c014a60, f=0x7fff5c010a90, min_value_arg=0x7fff5c014a58 "", max_value_arg=0x7fff5c014a58 "") at /home/psergey/dev-git/10.1/sql/opt_range.cc:2479 #1 0x0000555555988784 in get_mm_leaf (param=0x7ffff7eb3410, conf_func=0x7fff5c005cf8, field=0x7fff5c010a90, key_part=0x7fff5c0147b8, type=Item_func::GE_FUNC, value=0x7fff5c005bb8) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8598 #2 0x00005555559878c9 in get_mm_parts (param=0x7ffff7eb3410, cond_func=0x7fff5c005cf8, field=0x7fff5c010a90, type=Item_func::GE_FUNC, value=0x7fff5c005bb8, cmp_type=TIME_RESULT) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8280 #3 0x0000555555985b0b in get_func_mm_tree (param=0x7ffff7eb3410, cond_func=0x7fff5c005cf8, field=0x7fff5c010a90, value=0x0, cmp_type=TIME_RESULT, inv=false) at /home/psergey/dev-git/10.1/sql/opt_range.cc:7604 #4 0x000055555598652d in get_full_func_mm_tree (param=0x7ffff7eb3410, cond_func=0x7fff5c005cf8, field_item=0x7fff5c005aa8, value=0x0, inv=false) at /home/psergey/dev-git/10.1/sql/opt_range.cc:7931 #5 0x0000555555986de0 in Item_func_between::get_mm_tree (this=0x7fff5c005cf8, param=0x7ffff7eb3410, cond_ptr=0x7fff5c007eb0) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8117 #6 0x000055555597c083 in SQL_SELECT::test_quick_select (this=0x7fff5c007ea8, thd=0x55555aad3700, keys_to_use=..., prev_tables=0, limit=18446744073709551615, force_quick_range=false, ordered_output=false, remove_false_parts_of_where=true) at /home/psergey/dev-git/10.1/sql/opt_range.cc:3133 #7 0x0000555555aac9a4 in get_quick_record_count (thd=0x55555aad3700, select=0x7fff5c007ea8, table=0x7fff5c00f640, keys=0x7fff5c006f98, limit=18446744073709551615, in_outer_join=false) at /home/psergey/dev-git/10.1/sql/sql_select.cc:3362 #8 0x0000555555aaed53 in make_join_statistics (join=0x7fff5c0065a0, tables_list=..., keyuse_array=0x7fff5c0068a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4004 #9 0x0000555555aa5836 in JOIN::optimize_inner (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1341 #10 0x0000555555aa478c in JOIN::optimize (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1023 Once SEL_ARG objects are there, the rest of things (getting records_in_range() estimates, handling of AND/OR, multi-component keys) will happen automatically. The tricks is to create the right SEL_ARGs.
            psergei Sergei Petrunia added a comment - - edited

            Looking at where SEL_ARG object is made for a query with equality:

             explain select * from t100 where a= '2015-10-10' ;

              #0  SEL_ARG::SEL_ARG (this=0x7fff5c015cb0, f=0x7fff5c010a90, min_value_arg=0x7fff5c015ca8 "", max_value_arg=0x7fff5c015ca8 "") at /home/psergey/dev-git/10.1/sql/opt_range.cc:2479
              #1  0x0000555555988784 in get_mm_leaf (param=0x7ffff7eb3410, conf_func=0x7fff5c006d58, field=0x7fff5c010a90, key_part=0x7fff5c015a08, type=Item_func::EQ_FUNC, value=0x7fff5c005b88) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8598
              #2  0x00005555559878c9 in get_mm_parts (param=0x7ffff7eb3410, cond_func=0x7fff5c006d58, field=0x7fff5c010a90, type=Item_func::EQ_FUNC, value=0x7fff5c005b88, cmp_type=TIME_RESULT) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8280
              #3  0x00005555559872b3 in Item_equal::get_mm_tree (this=0x7fff5c006d58, param=0x7ffff7eb3410, cond_ptr=0x7fff5c008078) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8194
              #4  0x000055555597c083 in SQL_SELECT::test_quick_select (this=0x7fff5c008070, thd=0x55555aad3700, keys_to_use=..., prev_tables=0, limit=18446744073709551615, force_quick_range=false, ordered_output=false, remove_false_parts_of_where=true) at /home/psergey/dev-git/10.1/sql/opt_range.cc:3133
              #5  0x0000555555aac9a4 in get_quick_record_count (thd=0x55555aad3700, select=0x7fff5c008070, table=0x7fff5c00f640, keys=0x7fff5c007128, limit=18446744073709551615, in_outer_join=false) at /home/psergey/dev-git/10.1/sql/sql_select.cc:3362
              #6  0x0000555555aaed53 in make_join_statistics (join=0x7fff5c0064e0, tables_list=..., keyuse_array=0x7fff5c0067e0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4004
              #7  0x0000555555aa5836 in JOIN::optimize_inner (this=0x7fff5c0064e0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1341
              #8  0x0000555555aa478c in JOIN::optimize (this=0x7fff5c0064e0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1023

            It looks like get_mm_leaf is the function which should have the logic that figures that

            YEAR(key_col) = C1  -->   " concat(C1, '-01-01') <=  key_col  <= concat(C1, '-12-31')  " 

            psergei Sergei Petrunia added a comment - - edited Looking at where SEL_ARG object is made for a query with equality: explain select * from t100 where a= '2015-10-10' ; #0 SEL_ARG::SEL_ARG (this=0x7fff5c015cb0, f=0x7fff5c010a90, min_value_arg=0x7fff5c015ca8 "", max_value_arg=0x7fff5c015ca8 "") at /home/psergey/dev-git/10.1/sql/opt_range.cc:2479 #1 0x0000555555988784 in get_mm_leaf (param=0x7ffff7eb3410, conf_func=0x7fff5c006d58, field=0x7fff5c010a90, key_part=0x7fff5c015a08, type=Item_func::EQ_FUNC, value=0x7fff5c005b88) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8598 #2 0x00005555559878c9 in get_mm_parts (param=0x7ffff7eb3410, cond_func=0x7fff5c006d58, field=0x7fff5c010a90, type=Item_func::EQ_FUNC, value=0x7fff5c005b88, cmp_type=TIME_RESULT) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8280 #3 0x00005555559872b3 in Item_equal::get_mm_tree (this=0x7fff5c006d58, param=0x7ffff7eb3410, cond_ptr=0x7fff5c008078) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8194 #4 0x000055555597c083 in SQL_SELECT::test_quick_select (this=0x7fff5c008070, thd=0x55555aad3700, keys_to_use=..., prev_tables=0, limit=18446744073709551615, force_quick_range=false, ordered_output=false, remove_false_parts_of_where=true) at /home/psergey/dev-git/10.1/sql/opt_range.cc:3133 #5 0x0000555555aac9a4 in get_quick_record_count (thd=0x55555aad3700, select=0x7fff5c008070, table=0x7fff5c00f640, keys=0x7fff5c007128, limit=18446744073709551615, in_outer_join=false) at /home/psergey/dev-git/10.1/sql/sql_select.cc:3362 #6 0x0000555555aaed53 in make_join_statistics (join=0x7fff5c0064e0, tables_list=..., keyuse_array=0x7fff5c0067e0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4004 #7 0x0000555555aa5836 in JOIN::optimize_inner (this=0x7fff5c0064e0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1341 #8 0x0000555555aa478c in JOIN::optimize (this=0x7fff5c0064e0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1023 It looks like get_mm_leaf is the function which should have the logic that figures that YEAR(key_col) = C1 --> " concat(C1, '-01-01') <= key_col <= concat(C1, '-12-31') "

            danblack, hopefully these are enough? I hope I didn't miss anything.

            psergei Sergei Petrunia added a comment - danblack , hopefully these are enough? I hope I didn't miss anything.
            danblack Daniel Black added a comment - - edited

            great start, thanks psergei!

            danblack Daniel Black added a comment - - edited great start, thanks psergei !
            jkavalik Jiri Kavalik added a comment -

            Thank you all for hints. I tried to implement this in https://github.com/jkavalik/server/commit/2a66fa9bde198bca78625210f632112586c818e7

            No idea if it is the right way to do such a thing but it seems to work as I imagined. I did not yet manage to run complete test suite (had to put my laptop to sleep and it then crashed on timeout..) but I ran some (innodb, select, explain, type_date, type_datetime directly and then some 1200 from the entire suite before the sleep) without troubles.

            jkavalik Jiri Kavalik added a comment - Thank you all for hints. I tried to implement this in https://github.com/jkavalik/server/commit/2a66fa9bde198bca78625210f632112586c818e7 No idea if it is the right way to do such a thing but it seems to work as I imagined. I did not yet manage to run complete test suite (had to put my laptop to sleep and it then crashed on timeout..) but I ran some (innodb, select, explain, type_date, type_datetime directly and then some 1200 from the entire suite before the sleep) without troubles.

            Commented on the patch in github. Summary: found a number of issues which need to be resolved before this can be pushed.

            psergei Sergei Petrunia added a comment - Commented on the patch in github. Summary: found a number of issues which need to be resolved before this can be pushed.
            jkavalik Jiri Kavalik added a comment -

            Thank you for the comments. I will try to work on those issues.

            jkavalik Jiri Kavalik added a comment - Thank you for the comments. I will try to work on those issues.
            danblack Daniel Black added a comment -

            Minor addition, when these optimisations occur it would be good to see an addition text in the 'Extra' of the explain output like '

            {funct}

            (colref) optimised to range;'

            danblack Daniel Black added a comment - Minor addition, when these optimisations occur it would be good to see an addition text in the 'Extra' of the explain output like ' {funct} (colref) optimised to range;'

            Saw a customer case with conditions like this:

            DATE(table.datetime_col) BETWEEN '2015-08-12' AND '2015-08-12'
             
            DATE(table.date_col) BETWEEN '2015-08-12' AND '2015-08-12'

            The columns have types DATE and DATETIME.

            psergei Sergei Petrunia added a comment - Saw a customer case with conditions like this: DATE(table.datetime_col) BETWEEN '2015-08-12' AND '2015-08-12'   DATE(table.date_col) BETWEEN '2015-08-12' AND '2015-08-12' The columns have types DATE and DATETIME .

            Saw another case where the real-world queries are using conditions in form:

            DATE(date_col) = '2016-06-23'
            

            Sometimes these conditions are OR-ed together. The comparison is always with a constant.
            It could be that date_col is actually a DATETIME.

            psergei Sergei Petrunia added a comment - Saw another case where the real-world queries are using conditions in form: DATE(date_col) = '2016-06-23' Sometimes these conditions are OR-ed together. The comparison is always with a constant. It could be that date_col is actually a DATETIME.

            The patch v0. Makes DATE(date_col) <>= CONST_DATE sargable.
            http://lists.askmonty.org/pipermail/commits/2016-September/009874.html

            holyfoot Alexey Botchkov added a comment - The patch v0. Makes DATE(date_col) <>= CONST_DATE sargable. http://lists.askmonty.org/pipermail/commits/2016-September/009874.html
            holyfoot Alexey Botchkov added a comment - http://lists.askmonty.org/pipermail/commits/2016-September/009874.html

            When this is done, please don't forget to update this kb article: https://mariadb.com/kb/en/mariadb/building-the-best-index-for-a-given-select/

            michaeldg Michaël de groot added a comment - When this is done, please don't forget to update this kb article: https://mariadb.com/kb/en/mariadb/building-the-best-index-for-a-given-select/
            holyfoot Alexey Botchkov added a comment - Test added. http://lists.askmonty.org/pipermail/commits/2016-September/009935.html

            This issue is in review for a while now, any status update?

            Thanks!

            michaeldg Michaël de groot added a comment - This issue is in review for a while now, any status update? Thanks!
            psergei Sergei Petrunia added a comment - - edited

            So the patch works when the WHERE clause has just the condition we are
            targeting:

            explain select * from t1 where date(a) < '2017-01-10' ;
            +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
            | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                 |
            +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
            |    1 | SIMPLE      | t1    | range | a             | a    | 6       | NULL |   90 | Using index condition |
            +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
            

            but doesn't work with equality propagation:

            explain select * from t1 where date(a1) < '2017-01-10' and a1=a;
            +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
            | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
            +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
            |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 10122 | Using where |
            +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
            

            Actually, even irrelevant equalities make the patch not to work. Let's add
            "b=b1" to the first query and we get:

            explain select * from t1 where date(a) < '2017-01-10' and b=b1;
            +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
            | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
            +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
            |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 10122 | Using where |
            +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
            

            Another thing: the patch works only if DATE(col) is on the left side of the equality:

            explain select * from t1 where date(a) = '2017-01-10';
            +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
            | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                 |
            +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
            |    1 | SIMPLE      | t1    | range | a             | a    | 6       | NULL |   10 | Using index condition |
            +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
            

            explain select * from t1 where '2017-01-10'=date(a);
            +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
            | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
            +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
            |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 10122 | Using where |
            +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
            

            psergei Sergei Petrunia added a comment - - edited So the patch works when the WHERE clause has just the condition we are targeting: explain select * from t1 where date(a) < '2017-01-10' ; +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ | 1 | SIMPLE | t1 | range | a | a | 6 | NULL | 90 | Using index condition | +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ but doesn't work with equality propagation: explain select * from t1 where date(a1) < '2017-01-10' and a1=a; +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 10122 | Using where | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ Actually, even irrelevant equalities make the patch not to work. Let's add "b=b1" to the first query and we get: explain select * from t1 where date(a) < '2017-01-10' and b=b1; +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 10122 | Using where | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ Another thing: the patch works only if DATE(col) is on the left side of the equality: explain select * from t1 where date(a) = '2017-01-10'; +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ | 1 | SIMPLE | t1 | range | a | a | 6 | NULL | 10 | Using index condition | +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ explain select * from t1 where '2017-01-10'=date(a); +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 10122 | Using where | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
            psergei Sergei Petrunia added a comment - - edited

            Elaborating on Monty's "do a rewrite at fix_fields" proposal.

            Let's consider YEAR(a), where a is a DATETIME column.

            introduce functions:

            YEAR_START(datetime_col) returns CONCAT(YEAR(datetime_col), '-01-01 00:00:00')
            YEAR_END(datetime_col) returns CONCAT(YEAR(datetime_col), '-31-12 23:59:59 ...')
            

            (add more precision if necessary)

            then

            YEAR(col) <= val  is the same as  col <= YEAR_END(val)
            YEAR(col) < val   is the same as  col < YEAR_START(val)
            

            YEAR(col) >= val  is the same as  col >= YEAR_START(val)
            YEAR(col) > val   is the same as  col > YEAR_END(val)
            

            YEAR(col) = val   is the same as  col>=YEAR_START(val) AND col<=YEAR_END(val)
            

            YEAR(col) BETWEEN val1 AND val2 
            

            is the same as

            YEAR(col) >= val2 AND YEAR(col) <= val2
            

            which reduces it to the cases already handled above.

            If we use the rewrite approach, equality propagation will work automatically.

            psergei Sergei Petrunia added a comment - - edited Elaborating on Monty's "do a rewrite at fix_fields" proposal. Let's consider YEAR(a), where a is a DATETIME column. introduce functions: YEAR_START(datetime_col) returns CONCAT(YEAR(datetime_col), '-01-01 00:00:00') YEAR_END(datetime_col) returns CONCAT(YEAR(datetime_col), '-31-12 23:59:59 ...') (add more precision if necessary) then YEAR(col) <= val is the same as col <= YEAR_END(val) YEAR(col) < val is the same as col < YEAR_START(val) YEAR(col) >= val is the same as col >= YEAR_START(val) YEAR(col) > val is the same as col > YEAR_END(val) YEAR(col) = val is the same as col>=YEAR_START(val) AND col<=YEAR_END(val) YEAR(col) BETWEEN val1 AND val2 is the same as YEAR(col) >= val2 AND YEAR(col) <= val2 which reduces it to the cases already handled above. If we use the rewrite approach, equality propagation will work automatically.

            I'm working on a patch to implement the new proposal.

            psergei Sergei Petrunia added a comment - I'm working on a patch to implement the new proposal.

            A patch: http://lists.askmonty.org/pipermail/commits/2017-November/011674.html
            Still need to figure out if we need to take measures to account for leap seconds.

            psergei Sergei Petrunia added a comment - A patch: http://lists.askmonty.org/pipermail/commits/2017-November/011674.html Still need to figure out if we need to take measures to account for leap seconds.

            Somehow, I am unable to observe this leap second.
            I get Ubuntu 16.04.3 LTS, compile mysql-5.7, get the datadir from mysql-test/var/install.db, with non-empty mysql.time_zone* tables, and try to reproduce what is said at: https://dev.mysql.com/doc/refman/5.7/en/time-zone-leap-seconds.html.

            The docs say:

            mysql> -- values differ internally but display the same
            mysql> SELECT a, ts, UNIX_TIMESTAMP(ts) FROM t1;
            +------+---------------------+--------------------+
            | a    | ts                  | UNIX_TIMESTAMP(ts) |
            +------+---------------------+--------------------+
            |    1 | 2008-12-31 23:59:59 |         1230767999 |
            |    2 | 2008-12-31 23:59:59 |         1230768000 |
            +------+---------------------+--------------------+
            2 rows in set (0.00 sec)
            

            while I get:

            mysql> SELECT a, ts, UNIX_TIMESTAMP(ts) FROM t1;
            +------+---------------------+--------------------+
            | a    | ts                  | UNIX_TIMESTAMP(ts) |
            +------+---------------------+--------------------+
            |    1 | 2008-12-31 23:59:59 |         1230767999 |
            |    2 | 2009-01-01 00:00:00 |         1230768000 |
            +------+---------------------+--------------------+
            2 rows in set (0.00 sec)
            
            

            psergei Sergei Petrunia added a comment - Somehow, I am unable to observe this leap second. I get Ubuntu 16.04.3 LTS, compile mysql-5.7, get the datadir from mysql-test/var/install.db, with non-empty mysql.time_zone* tables, and try to reproduce what is said at: https://dev.mysql.com/doc/refman/5.7/en/time-zone-leap-seconds.html . The docs say: mysql> -- values differ internally but display the same mysql> SELECT a, ts, UNIX_TIMESTAMP(ts) FROM t1; +------+---------------------+--------------------+ | a | ts | UNIX_TIMESTAMP(ts) | +------+---------------------+--------------------+ | 1 | 2008-12-31 23:59:59 | 1230767999 | | 2 | 2008-12-31 23:59:59 | 1230768000 | +------+---------------------+--------------------+ 2 rows in set (0.00 sec) while I get: mysql> SELECT a, ts, UNIX_TIMESTAMP(ts) FROM t1; +------+---------------------+--------------------+ | a | ts | UNIX_TIMESTAMP(ts) | +------+---------------------+--------------------+ | 1 | 2008-12-31 23:59:59 | 1230767999 | | 2 | 2009-01-01 00:00:00 | 1230768000 | +------+---------------------+--------------------+ 2 rows in set (0.00 sec)

            Also discussed with bar.

            Takeaways:

            • one needs to have TZ data loaded into the server in order to observe the leap second.
              • Check mysql_tzinfo_to_sql, and --leap parameter.
              • Note that not all timezone files have leap second information (use "file" to see which do)
            • DATE[TIME] type has no leap seconds. only TIMESTAMP does
            • TIMESTAMP has open bugs like MDEV-13995
              • It's better to not have this patch support TIMESTAMP for now. Let's do it just for DATE TIMEs.
              • This means we dont have to support leap seconds.
            psergei Sergei Petrunia added a comment - Also discussed with bar . Takeaways: one needs to have TZ data loaded into the server in order to observe the leap second. Check mysql_tzinfo_to_sql, and --leap parameter. Note that not all timezone files have leap second information (use "file" to see which do) DATE [TIME] type has no leap seconds. only TIMESTAMP does TIMESTAMP has open bugs like MDEV-13995 It's better to not have this patch support TIMESTAMP for now. Let's do it just for DATE TIMEs. This means we dont have to support leap seconds.

            Adjusted the patch according to the above: http://lists.askmonty.org/pipermail/commits/2017-November/011680.html

            A test patch to also make rewrite function a member of Item: http://lists.askmonty.org/pipermail/commits/2017-November/011681.html

            psergei Sergei Petrunia added a comment - Adjusted the patch according to the above: http://lists.askmonty.org/pipermail/commits/2017-November/011680.html A test patch to also make rewrite function a member of Item: http://lists.askmonty.org/pipermail/commits/2017-November/011681.html

            bar, please review.

            psergei Sergei Petrunia added a comment - bar , please review.
            oleg.smirnov Oleg Smirnov added a comment -

            There are test failures looking like

            main.sargable_date_cond                  w3 [ fail ]
                    Test ended at 2023-03-16 04:35:43
             
            CURRENT_TEST: main.sargable_date_cond
            --- /usr/share/mysql/mysql-test/main/sargable_date_cond.result	2023-03-16 01:36:33.000000000 -0400
            +++ /dev/shm/var/3/log/sargable_date_cond.reject	2023-03-16 04:35:42.953159711 -0400
            @@ -1919,6 +1919,8 @@
             timestampadd(hour, B.a, date_add('2016-01-01', interval A.a*8 day)),
             date_add('2016-01-01', interval A.a*7 day)
             from t1 A, t0 B;
            +Warnings:
            +Warning	1299	Invalid TIMESTAMP value in column 'a' at row 11
            

            that occur only on some machines and not on others. Turns out it depends on whether the system time zone employs daylight saving time.

            If it does then calling timestampadd() can lead to execution of the following code

            my_time.cc

              /*
                Fix that if we are in the non existing daylight saving time hour
                we move the start of the next real hour.
             
                This code doesn't handle such exotical thing as time-gaps whose length
                is more than one hour or non-integer (latter can theoretically happen
                if one of seconds will be removed due leap correction, or because of
                general time correction like it happened for Africa/Monrovia time zone
                in year 1972).
              */
              if (loop == 2 && t->hour != (uint) l_time->tm_hour)
              {
                int days= t->day - l_time->tm_mday;
                if (days < -1)
                  days=1;					/* Month has wrapped */
                else if (days > 1)
                  days= -1;
                diff=(3600L*(long) (days*24+((int) t->hour - (int) l_time->tm_hour))+
            	  (long) (60*((int) t->minute - (int) l_time->tm_min)) +
                      (long) ((int) t->second - (int) l_time->tm_sec));
                if (diff == 3600)
                  tmp+=3600 - t->minute*60 - t->second;	/* Move to next hour */
                else if (diff == -3600)
                  tmp-=t->minute*60 + t->second;		/* Move to previous hour */
             
                *error_code= ER_WARN_INVALID_TIMESTAMP;
              }
            

            oleg.smirnov Oleg Smirnov added a comment - There are test failures looking like main.sargable_date_cond w3 [ fail ] Test ended at 2023-03-16 04:35:43   CURRENT_TEST: main.sargable_date_cond --- /usr/share/mysql/mysql-test/main/sargable_date_cond.result 2023-03-16 01:36:33.000000000 -0400 +++ /dev/shm/var/3/log/sargable_date_cond.reject 2023-03-16 04:35:42.953159711 -0400 @@ -1919,6 +1919,8 @@ timestampadd(hour, B.a, date_add('2016-01-01', interval A.a*8 day)), date_add('2016-01-01', interval A.a*7 day) from t1 A, t0 B; +Warnings: +Warning 1299 Invalid TIMESTAMP value in column 'a' at row 11 that occur only on some machines and not on others. Turns out it depends on whether the system time zone employs daylight saving time. If it does then calling timestampadd() can lead to execution of the following code my_time.cc /* Fix that if we are in the non existing daylight saving time hour we move the start of the next real hour.   This code doesn't handle such exotical thing as time-gaps whose length is more than one hour or non-integer (latter can theoretically happen if one of seconds will be removed due leap correction, or because of general time correction like it happened for Africa/Monrovia time zone in year 1972). */ if (loop == 2 && t->hour != (uint) l_time->tm_hour) { int days= t->day - l_time->tm_mday; if (days < -1) days=1; /* Month has wrapped */ else if (days > 1) days= -1; diff=(3600L*(long) (days*24+((int) t->hour - (int) l_time->tm_hour))+ (long) (60*((int) t->minute - (int) l_time->tm_min)) + (long) ((int) t->second - (int) l_time->tm_sec)); if (diff == 3600) tmp+=3600 - t->minute*60 - t->second; /* Move to next hour */ else if (diff == -3600) tmp-=t->minute*60 + t->second; /* Move to previous hour */   *error_code= ER_WARN_INVALID_TIMESTAMP; }
            psergei Sergei Petrunia added a comment - https://github.com/MariaDB/server/tree/preview-11.1-mdev-8320

            At the moment, the task has been tested with cases for the SELECT, a fix of MDEV-30946 is needed to continue testing

            lstartseva Lena Startseva added a comment - At the moment, the task has been tested with cases for the SELECT, a fix of MDEV-30946 is needed to continue testing

            Testing done. Ok to push

            lstartseva Lena Startseva added a comment - Testing done. Ok to push
            oleg.smirnov Oleg Smirnov added a comment -

            Pushed to 11.1

            oleg.smirnov Oleg Smirnov added a comment - Pushed to 11.1

            People

              oleg.smirnov Oleg Smirnov
              jkavalik Jiri Kavalik
              Votes:
              5 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.