[MDEV-8320] Allow index usage for DATE(datetime_column) = const - Jira

Daniel Black added a comment - 2015-06-16 09:35

any hints where to start looking in the code to implement this appreciated.

Daniel Black added a comment - 2015-06-16 09:35 any hints where to start looking in the code to implement this appreciated.

Sergei Golubchik added a comment - 2015-06-17 13:25

I'd look first at add_key_fields(). But perhaps I would've been wrong

Sergei Golubchik added a comment - 2015-06-17 13:25 I'd look first at add_key_fields() . But perhaps I would've been wrong

Sergei Petrunia added a comment - 2015-06-17 18:46

add_key_fields() is an auxiliary function that used for constructing ref access plans (here, ref includes ref_or_null, or eq_ref). ref access plans are based on equalities (or other conditions that can be reduced to equalities), such that the records can be read with handler->index_read()/handler->index_next_same() calls.

YEAR(date_col) = const specifies a range. So, the right way would be to use range optimizer to handle this.

Sergei Petrunia added a comment - 2015-06-17 18:46 add_key_fields() is an auxiliary function that used for constructing ref access plans (here, ref includes ref_or_null , or eq_ref ). ref access plans are based on equalities (or other conditions that can be reduced to equalities), such that the records can be read with handler->index_read()/handler->index_next_same() calls. YEAR(date_col) = const specifies a range. So, the right way would be to use range optimizer to handle this.

Sergei Petrunia added a comment - 2015-06-17 18:50 - edited

As for using range optimizer... I'll debug a similar query and point places where changes are be made.

Let the query be:

explain select * from t100 where a between '2015-10-10' and '2015-10-12';

+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+

| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                 |

+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+

|    1 | SIMPLE      | t100  | range | a             | a    | 4       | NULL |    3 | Using index condition |

+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+

The first place is:

  #0  add_key_field (join=0x7fff5c0065a0, key_fields=0x7ffff7eb5b18, and_level=0, cond=0x7fff5c005cf8, field=0x7fff5c010a90, eq_func=false, value=0x7fff5c005e48, num_values=2, usable_tables=18446744073709551615, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4482

  #1  0x0000555555ab03be in add_key_equal_fields (join=0x7fff5c0065a0, key_fields=0x7ffff7eb5b18, and_level=0, cond=0x7fff5c005cf8, field_item=0x7fff5c005aa8, eq_func=false, val=0x7fff5c005e48, num_values=2, usable_tables=18446744073709551615, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4568

  #2  0x0000555555ab0a58 in Item_func_between::add_key_fields (this=0x7fff5c005cf8, join=0x7fff5c0065a0, key_fields=0x7ffff7eb5b18, and_level=0x7ffff7eb5b50, usable_tables=18446744073709551615, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4734

  #3  0x0000555555ab2524 in update_ref_and_keys (thd=0x55555aad3700, keyuse=0x7fff5c0068a0, join_tab=0x7fff5c006e18, tables=1, cond=0x7fff5c005cf8, normal_tables=18446744073709551615, select_lex=0x55555aad77b0, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:5302

  #4  0x0000555555aad6a1 in make_join_statistics (join=0x7fff5c0065a0, tables_list=..., keyuse_array=0x7fff5c0068a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:3615

  #5  0x0000555555aa5836 in JOIN::optimize_inner (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1341

  #6  0x0000555555aa478c in JOIN::optimize (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1023

This is the preparation stage. We collect in TABLE::const_keys a set of indexes for which range optimization is probably useful. Note the frame with Item_func_between::add_key_fields, I guess Item_func_eq will need a similar function that will check if it is comparing a YEAR(key_col) with a constant

Sergei Petrunia added a comment - 2015-06-17 18:50 - edited As for using range optimizer... I'll debug a similar query and point places where changes are be made. Let the query be: explain select * from t100 where a between '2015-10-10' and '2015-10-12'; +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ | 1 | SIMPLE | t100 | range | a | a | 4 | NULL | 3 | Using index condition | +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ The first place is: #0 add_key_field (join=0x7fff5c0065a0, key_fields=0x7ffff7eb5b18, and_level=0, cond=0x7fff5c005cf8, field=0x7fff5c010a90, eq_func=false, value=0x7fff5c005e48, num_values=2, usable_tables=18446744073709551615, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4482 #1 0x0000555555ab03be in add_key_equal_fields (join=0x7fff5c0065a0, key_fields=0x7ffff7eb5b18, and_level=0, cond=0x7fff5c005cf8, field_item=0x7fff5c005aa8, eq_func=false, val=0x7fff5c005e48, num_values=2, usable_tables=18446744073709551615, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4568 #2 0x0000555555ab0a58 in Item_func_between::add_key_fields (this=0x7fff5c005cf8, join=0x7fff5c0065a0, key_fields=0x7ffff7eb5b18, and_level=0x7ffff7eb5b50, usable_tables=18446744073709551615, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4734 #3 0x0000555555ab2524 in update_ref_and_keys (thd=0x55555aad3700, keyuse=0x7fff5c0068a0, join_tab=0x7fff5c006e18, tables=1, cond=0x7fff5c005cf8, normal_tables=18446744073709551615, select_lex=0x55555aad77b0, sargables=0x7ffff7eb5c88) at /home/psergey/dev-git/10.1/sql/sql_select.cc:5302 #4 0x0000555555aad6a1 in make_join_statistics (join=0x7fff5c0065a0, tables_list=..., keyuse_array=0x7fff5c0068a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:3615 #5 0x0000555555aa5836 in JOIN::optimize_inner (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1341 #6 0x0000555555aa478c in JOIN::optimize (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1023 This is the preparation stage. We collect in TABLE::const_keys a set of indexes for which range optimization is probably useful. Note the frame with Item_func_between::add_key_fields , I guess Item_func_eq will need a similar function that will check if it is comparing a YEAR(key_col) with a constant

Sergei Petrunia added a comment - 2015-06-17 18:57

Construction of range objects happens here:

  #0  SEL_ARG::SEL_ARG (this=0x7fff5c014a60, f=0x7fff5c010a90, min_value_arg=0x7fff5c014a58 "", max_value_arg=0x7fff5c014a58 "") at /home/psergey/dev-git/10.1/sql/opt_range.cc:2479

  #1  0x0000555555988784 in get_mm_leaf (param=0x7ffff7eb3410, conf_func=0x7fff5c005cf8, field=0x7fff5c010a90, key_part=0x7fff5c0147b8, type=Item_func::GE_FUNC, value=0x7fff5c005bb8) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8598

  #2  0x00005555559878c9 in get_mm_parts (param=0x7ffff7eb3410, cond_func=0x7fff5c005cf8, field=0x7fff5c010a90, type=Item_func::GE_FUNC, value=0x7fff5c005bb8, cmp_type=TIME_RESULT) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8280

  #3  0x0000555555985b0b in get_func_mm_tree (param=0x7ffff7eb3410, cond_func=0x7fff5c005cf8, field=0x7fff5c010a90, value=0x0, cmp_type=TIME_RESULT, inv=false) at /home/psergey/dev-git/10.1/sql/opt_range.cc:7604

  #4  0x000055555598652d in get_full_func_mm_tree (param=0x7ffff7eb3410, cond_func=0x7fff5c005cf8, field_item=0x7fff5c005aa8, value=0x0, inv=false) at /home/psergey/dev-git/10.1/sql/opt_range.cc:7931

  #5  0x0000555555986de0 in Item_func_between::get_mm_tree (this=0x7fff5c005cf8, param=0x7ffff7eb3410, cond_ptr=0x7fff5c007eb0) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8117

  #6  0x000055555597c083 in SQL_SELECT::test_quick_select (this=0x7fff5c007ea8, thd=0x55555aad3700, keys_to_use=..., prev_tables=0, limit=18446744073709551615, force_quick_range=false, ordered_output=false, remove_false_parts_of_where=true) at /home/psergey/dev-git/10.1/sql/opt_range.cc:3133

  #7  0x0000555555aac9a4 in get_quick_record_count (thd=0x55555aad3700, select=0x7fff5c007ea8, table=0x7fff5c00f640, keys=0x7fff5c006f98, limit=18446744073709551615, in_outer_join=false) at /home/psergey/dev-git/10.1/sql/sql_select.cc:3362

  #8  0x0000555555aaed53 in make_join_statistics (join=0x7fff5c0065a0, tables_list=..., keyuse_array=0x7fff5c0068a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4004

  #9  0x0000555555aa5836 in JOIN::optimize_inner (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1341

  #10 0x0000555555aa478c in JOIN::optimize (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1023

Once SEL_ARG objects are there, the rest of things (getting records_in_range() estimates, handling of AND/OR, multi-component keys) will happen automatically. The tricks is to create the right SEL_ARGs.

Sergei Petrunia added a comment - 2015-06-17 18:57 Construction of range objects happens here: #0 SEL_ARG::SEL_ARG (this=0x7fff5c014a60, f=0x7fff5c010a90, min_value_arg=0x7fff5c014a58 "", max_value_arg=0x7fff5c014a58 "") at /home/psergey/dev-git/10.1/sql/opt_range.cc:2479 #1 0x0000555555988784 in get_mm_leaf (param=0x7ffff7eb3410, conf_func=0x7fff5c005cf8, field=0x7fff5c010a90, key_part=0x7fff5c0147b8, type=Item_func::GE_FUNC, value=0x7fff5c005bb8) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8598 #2 0x00005555559878c9 in get_mm_parts (param=0x7ffff7eb3410, cond_func=0x7fff5c005cf8, field=0x7fff5c010a90, type=Item_func::GE_FUNC, value=0x7fff5c005bb8, cmp_type=TIME_RESULT) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8280 #3 0x0000555555985b0b in get_func_mm_tree (param=0x7ffff7eb3410, cond_func=0x7fff5c005cf8, field=0x7fff5c010a90, value=0x0, cmp_type=TIME_RESULT, inv=false) at /home/psergey/dev-git/10.1/sql/opt_range.cc:7604 #4 0x000055555598652d in get_full_func_mm_tree (param=0x7ffff7eb3410, cond_func=0x7fff5c005cf8, field_item=0x7fff5c005aa8, value=0x0, inv=false) at /home/psergey/dev-git/10.1/sql/opt_range.cc:7931 #5 0x0000555555986de0 in Item_func_between::get_mm_tree (this=0x7fff5c005cf8, param=0x7ffff7eb3410, cond_ptr=0x7fff5c007eb0) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8117 #6 0x000055555597c083 in SQL_SELECT::test_quick_select (this=0x7fff5c007ea8, thd=0x55555aad3700, keys_to_use=..., prev_tables=0, limit=18446744073709551615, force_quick_range=false, ordered_output=false, remove_false_parts_of_where=true) at /home/psergey/dev-git/10.1/sql/opt_range.cc:3133 #7 0x0000555555aac9a4 in get_quick_record_count (thd=0x55555aad3700, select=0x7fff5c007ea8, table=0x7fff5c00f640, keys=0x7fff5c006f98, limit=18446744073709551615, in_outer_join=false) at /home/psergey/dev-git/10.1/sql/sql_select.cc:3362 #8 0x0000555555aaed53 in make_join_statistics (join=0x7fff5c0065a0, tables_list=..., keyuse_array=0x7fff5c0068a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4004 #9 0x0000555555aa5836 in JOIN::optimize_inner (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1341 #10 0x0000555555aa478c in JOIN::optimize (this=0x7fff5c0065a0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1023 Once SEL_ARG objects are there, the rest of things (getting records_in_range() estimates, handling of AND/OR, multi-component keys) will happen automatically. The tricks is to create the right SEL_ARGs.

Sergei Petrunia added a comment - 2015-06-17 19:09 - edited

Looking at where SEL_ARG object is made for a query with equality:

 explain select * from t100 where a= '2015-10-10' ;

  #0  SEL_ARG::SEL_ARG (this=0x7fff5c015cb0, f=0x7fff5c010a90, min_value_arg=0x7fff5c015ca8 "", max_value_arg=0x7fff5c015ca8 "") at /home/psergey/dev-git/10.1/sql/opt_range.cc:2479

  #1  0x0000555555988784 in get_mm_leaf (param=0x7ffff7eb3410, conf_func=0x7fff5c006d58, field=0x7fff5c010a90, key_part=0x7fff5c015a08, type=Item_func::EQ_FUNC, value=0x7fff5c005b88) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8598

  #2  0x00005555559878c9 in get_mm_parts (param=0x7ffff7eb3410, cond_func=0x7fff5c006d58, field=0x7fff5c010a90, type=Item_func::EQ_FUNC, value=0x7fff5c005b88, cmp_type=TIME_RESULT) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8280

  #3  0x00005555559872b3 in Item_equal::get_mm_tree (this=0x7fff5c006d58, param=0x7ffff7eb3410, cond_ptr=0x7fff5c008078) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8194

  #4  0x000055555597c083 in SQL_SELECT::test_quick_select (this=0x7fff5c008070, thd=0x55555aad3700, keys_to_use=..., prev_tables=0, limit=18446744073709551615, force_quick_range=false, ordered_output=false, remove_false_parts_of_where=true) at /home/psergey/dev-git/10.1/sql/opt_range.cc:3133

  #5  0x0000555555aac9a4 in get_quick_record_count (thd=0x55555aad3700, select=0x7fff5c008070, table=0x7fff5c00f640, keys=0x7fff5c007128, limit=18446744073709551615, in_outer_join=false) at /home/psergey/dev-git/10.1/sql/sql_select.cc:3362

  #6  0x0000555555aaed53 in make_join_statistics (join=0x7fff5c0064e0, tables_list=..., keyuse_array=0x7fff5c0067e0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4004

  #7  0x0000555555aa5836 in JOIN::optimize_inner (this=0x7fff5c0064e0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1341

  #8  0x0000555555aa478c in JOIN::optimize (this=0x7fff5c0064e0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1023

It looks like get_mm_leaf is the function which should have the logic that figures that

YEAR(key_col) = C1  -->   " concat(C1, '-01-01') <=  key_col  <= concat(C1, '-12-31')  "

Sergei Petrunia added a comment - 2015-06-17 19:09 - edited Looking at where SEL_ARG object is made for a query with equality: explain select * from t100 where a= '2015-10-10' ; #0 SEL_ARG::SEL_ARG (this=0x7fff5c015cb0, f=0x7fff5c010a90, min_value_arg=0x7fff5c015ca8 "", max_value_arg=0x7fff5c015ca8 "") at /home/psergey/dev-git/10.1/sql/opt_range.cc:2479 #1 0x0000555555988784 in get_mm_leaf (param=0x7ffff7eb3410, conf_func=0x7fff5c006d58, field=0x7fff5c010a90, key_part=0x7fff5c015a08, type=Item_func::EQ_FUNC, value=0x7fff5c005b88) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8598 #2 0x00005555559878c9 in get_mm_parts (param=0x7ffff7eb3410, cond_func=0x7fff5c006d58, field=0x7fff5c010a90, type=Item_func::EQ_FUNC, value=0x7fff5c005b88, cmp_type=TIME_RESULT) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8280 #3 0x00005555559872b3 in Item_equal::get_mm_tree (this=0x7fff5c006d58, param=0x7ffff7eb3410, cond_ptr=0x7fff5c008078) at /home/psergey/dev-git/10.1/sql/opt_range.cc:8194 #4 0x000055555597c083 in SQL_SELECT::test_quick_select (this=0x7fff5c008070, thd=0x55555aad3700, keys_to_use=..., prev_tables=0, limit=18446744073709551615, force_quick_range=false, ordered_output=false, remove_false_parts_of_where=true) at /home/psergey/dev-git/10.1/sql/opt_range.cc:3133 #5 0x0000555555aac9a4 in get_quick_record_count (thd=0x55555aad3700, select=0x7fff5c008070, table=0x7fff5c00f640, keys=0x7fff5c007128, limit=18446744073709551615, in_outer_join=false) at /home/psergey/dev-git/10.1/sql/sql_select.cc:3362 #6 0x0000555555aaed53 in make_join_statistics (join=0x7fff5c0064e0, tables_list=..., keyuse_array=0x7fff5c0067e0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:4004 #7 0x0000555555aa5836 in JOIN::optimize_inner (this=0x7fff5c0064e0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1341 #8 0x0000555555aa478c in JOIN::optimize (this=0x7fff5c0064e0) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1023 It looks like get_mm_leaf is the function which should have the logic that figures that YEAR(key_col) = C1 --> " concat(C1, '-01-01') <= key_col <= concat(C1, '-12-31') "

Sergei Petrunia added a comment - 2015-06-17 19:11

danblack, hopefully these are enough? I hope I didn't miss anything.

Sergei Petrunia added a comment - 2015-06-17 19:11 danblack , hopefully these are enough? I hope I didn't miss anything.

Daniel Black added a comment - 2015-06-18 08:41 - edited

great start, thanks psergei!

Daniel Black added a comment - 2015-06-18 08:41 - edited great start, thanks psergei !

Jiri Kavalik added a comment - 2015-07-17 22:58

Thank you all for hints. I tried to implement this in https://github.com/jkavalik/server/commit/2a66fa9bde198bca78625210f632112586c818e7

No idea if it is the right way to do such a thing but it seems to work as I imagined. I did not yet manage to run complete test suite (had to put my laptop to sleep and it then crashed on timeout..) but I ran some (innodb, select, explain, type_date, type_datetime directly and then some 1200 from the entire suite before the sleep) without troubles.

Jiri Kavalik added a comment - 2015-07-17 22:58 Thank you all for hints. I tried to implement this in https://github.com/jkavalik/server/commit/2a66fa9bde198bca78625210f632112586c818e7 No idea if it is the right way to do such a thing but it seems to work as I imagined. I did not yet manage to run complete test suite (had to put my laptop to sleep and it then crashed on timeout..) but I ran some (innodb, select, explain, type_date, type_datetime directly and then some 1200 from the entire suite before the sleep) without troubles.

Sergei Petrunia added a comment - 2015-07-20 21:28

Commented on the patch in github. Summary: found a number of issues which need to be resolved before this can be pushed.

Sergei Petrunia added a comment - 2015-07-20 21:28 Commented on the patch in github. Summary: found a number of issues which need to be resolved before this can be pushed.

Jiri Kavalik added a comment - 2015-07-21 13:11

Thank you for the comments. I will try to work on those issues.

Jiri Kavalik added a comment - 2015-07-21 13:11 Thank you for the comments. I will try to work on those issues.

Daniel Black added a comment - 2015-07-28 10:26

Minor addition, when these optimisations occur it would be good to see an addition text in the 'Extra' of the explain output like '

{funct}

(colref) optimised to range;'

Daniel Black added a comment - 2015-07-28 10:26 Minor addition, when these optimisations occur it would be good to see an addition text in the 'Extra' of the explain output like ' {funct} (colref) optimised to range;'

Sergei Petrunia added a comment - 2015-08-12 20:48

Saw a customer case with conditions like this:

DATE(table.datetime_col) BETWEEN '2015-08-12' AND '2015-08-12'

DATE(table.date_col) BETWEEN '2015-08-12' AND '2015-08-12'

The columns have types DATE and DATETIME.

Sergei Petrunia added a comment - 2015-08-12 20:48 Saw a customer case with conditions like this: DATE(table.datetime_col) BETWEEN '2015-08-12' AND '2015-08-12' DATE(table.date_col) BETWEEN '2015-08-12' AND '2015-08-12' The columns have types DATE and DATETIME .

Sergei Petrunia added a comment - 2016-06-23 21:01

Saw another case where the real-world queries are using conditions in form:

DATE(date_col) = '2016-06-23'

Sometimes these conditions are OR-ed together. The comparison is always with a constant.
It could be that date_col is actually a DATETIME.

Sergei Petrunia added a comment - 2016-06-23 21:01 Saw another case where the real-world queries are using conditions in form: DATE(date_col) = '2016-06-23' Sometimes these conditions are OR-ed together. The comparison is always with a constant. It could be that date_col is actually a DATETIME.

Alexey Botchkov added a comment - 2016-09-20 09:37

The patch v0. Makes DATE(date_col) <>= CONST_DATE sargable.
http://lists.askmonty.org/pipermail/commits/2016-September/009874.html

Alexey Botchkov added a comment - 2016-09-20 09:37 The patch v0. Makes DATE(date_col) <>= CONST_DATE sargable. http://lists.askmonty.org/pipermail/commits/2016-September/009874.html

Alexey Botchkov added a comment - 2016-09-20 09:37

http://lists.askmonty.org/pipermail/commits/2016-September/009874.html

Alexey Botchkov added a comment - 2016-09-20 09:37 http://lists.askmonty.org/pipermail/commits/2016-September/009874.html

Michaël de groot added a comment - 2016-09-22 20:36

When this is done, please don't forget to update this kb article: https://mariadb.com/kb/en/mariadb/building-the-best-index-for-a-given-select/

Michaël de groot added a comment - 2016-09-22 20:36 When this is done, please don't forget to update this kb article: https://mariadb.com/kb/en/mariadb/building-the-best-index-for-a-given-select/

Alexey Botchkov added a comment - 2016-09-28 11:06

Test added.
http://lists.askmonty.org/pipermail/commits/2016-September/009935.html

Alexey Botchkov added a comment - 2016-09-28 11:06 Test added. http://lists.askmonty.org/pipermail/commits/2016-September/009935.html

Michaël de groot added a comment - 2017-05-30 08:17

This issue is in review for a while now, any status update?

Thanks!

Michaël de groot added a comment - 2017-05-30 08:17 This issue is in review for a while now, any status update? Thanks!

Sergei Petrunia added a comment - 2017-11-16 05:23 - edited

So the patch works when the WHERE clause has just the condition we are
targeting:

explain select * from t1 where date(a) < '2017-01-10' ;

+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+

| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                 |

+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+

|    1 | SIMPLE      | t1    | range | a             | a    | 6       | NULL |   90 | Using index condition |

+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+

but doesn't work with equality propagation:

explain select * from t1 where date(a1) < '2017-01-10' and a1=a;

+------+-------------+-------+------+---------------+------+---------+------+-------+-------------+

| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |

+------+-------------+-------+------+---------------+------+---------+------+-------+-------------+

|    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 10122 | Using where |

+------+-------------+-------+------+---------------+------+---------+------+-------+-------------+

Actually, even irrelevant equalities make the patch not to work. Let's add
"b=b1" to the first query and we get:

explain select * from t1 where date(a) < '2017-01-10' and b=b1;

+------+-------------+-------+------+---------------+------+---------+------+-------+-------------+

| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |

+------+-------------+-------+------+---------------+------+---------+------+-------+-------------+

|    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 10122 | Using where |

+------+-------------+-------+------+---------------+------+---------+------+-------+-------------+

Another thing: the patch works only if DATE(col) is on the left side of the equality:

explain select * from t1 where date(a) = '2017-01-10';

+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+

| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                 |

+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+

|    1 | SIMPLE      | t1    | range | a             | a    | 6       | NULL |   10 | Using index condition |

+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+

explain select * from t1 where '2017-01-10'=date(a);

+------+-------------+-------+------+---------------+------+---------+------+-------+-------------+

| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |

+------+-------------+-------+------+---------------+------+---------+------+-------+-------------+

|    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 10122 | Using where |

+------+-------------+-------+------+---------------+------+---------+------+-------+-------------+

Sergei Petrunia added a comment - 2017-11-16 05:23 - edited So the patch works when the WHERE clause has just the condition we are targeting: explain select * from t1 where date(a) < '2017-01-10' ; +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ | 1 | SIMPLE | t1 | range | a | a | 6 | NULL | 90 | Using index condition | +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ but doesn't work with equality propagation: explain select * from t1 where date(a1) < '2017-01-10' and a1=a; +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 10122 | Using where | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ Actually, even irrelevant equalities make the patch not to work. Let's add "b=b1" to the first query and we get: explain select * from t1 where date(a) < '2017-01-10' and b=b1; +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 10122 | Using where | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ Another thing: the patch works only if DATE(col) is on the left side of the equality: explain select * from t1 where date(a) = '2017-01-10'; +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ | 1 | SIMPLE | t1 | range | a | a | 6 | NULL | 10 | Using index condition | +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+ explain select * from t1 where '2017-01-10'=date(a); +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 10122 | Using where | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+

Sergei Petrunia added a comment - 2017-11-16 06:17 - edited

Elaborating on Monty's "do a rewrite at fix_fields" proposal.

Let's consider YEAR(a), where a is a DATETIME column.

introduce functions:

YEAR_START(datetime_col) returns CONCAT(YEAR(datetime_col), '-01-01 00:00:00')

YEAR_END(datetime_col) returns CONCAT(YEAR(datetime_col), '-31-12 23:59:59 ...')

(add more precision if necessary)

then

YEAR(col) <= val  is the same as  col <= YEAR_END(val)

YEAR(col) < val   is the same as  col < YEAR_START(val)

YEAR(col) >= val  is the same as  col >= YEAR_START(val)

YEAR(col) > val   is the same as  col > YEAR_END(val)

YEAR(col) = val   is the same as  col>=YEAR_START(val) AND col<=YEAR_END(val)

YEAR(col) BETWEEN val1 AND val2

is the same as

YEAR(col) >= val2 AND YEAR(col) <= val2

which reduces it to the cases already handled above.

If we use the rewrite approach, equality propagation will work automatically.

Sergei Petrunia added a comment - 2017-11-16 06:17 - edited Elaborating on Monty's "do a rewrite at fix_fields" proposal. Let's consider YEAR(a), where a is a DATETIME column. introduce functions: YEAR_START(datetime_col) returns CONCAT(YEAR(datetime_col), '-01-01 00:00:00') YEAR_END(datetime_col) returns CONCAT(YEAR(datetime_col), '-31-12 23:59:59 ...') (add more precision if necessary) then YEAR(col) <= val is the same as col <= YEAR_END(val) YEAR(col) < val is the same as col < YEAR_START(val) YEAR(col) >= val is the same as col >= YEAR_START(val) YEAR(col) > val is the same as col > YEAR_END(val) YEAR(col) = val is the same as col>=YEAR_START(val) AND col<=YEAR_END(val) YEAR(col) BETWEEN val1 AND val2 is the same as YEAR(col) >= val2 AND YEAR(col) <= val2 which reduces it to the cases already handled above. If we use the rewrite approach, equality propagation will work automatically.

Sergei Petrunia added a comment - 2017-11-20 07:48

I'm working on a patch to implement the new proposal.

Sergei Petrunia added a comment - 2017-11-20 07:48 I'm working on a patch to implement the new proposal.

Sergei Petrunia added a comment - 2017-11-23 14:47

A patch: http://lists.askmonty.org/pipermail/commits/2017-November/011674.html
Still need to figure out if we need to take measures to account for leap seconds.

Sergei Petrunia added a comment - 2017-11-23 14:47 A patch: http://lists.askmonty.org/pipermail/commits/2017-November/011674.html Still need to figure out if we need to take measures to account for leap seconds.

Sergei Petrunia added a comment - 2017-11-23 14:50

Somehow, I am unable to observe this leap second.
I get Ubuntu 16.04.3 LTS, compile mysql-5.7, get the datadir from mysql-test/var/install.db, with non-empty mysql.time_zone* tables, and try to reproduce what is said at: https://dev.mysql.com/doc/refman/5.7/en/time-zone-leap-seconds.html.

The docs say:

mysql> -- values differ internally but display the same

mysql> SELECT a, ts, UNIX_TIMESTAMP(ts) FROM t1;

+------+---------------------+--------------------+

| a    | ts                  | UNIX_TIMESTAMP(ts) |

+------+---------------------+--------------------+

|    1 | 2008-12-31 23:59:59 |         1230767999 |

|    2 | 2008-12-31 23:59:59 |         1230768000 |

+------+---------------------+--------------------+

2 rows in set (0.00 sec)

while I get:

mysql> SELECT a, ts, UNIX_TIMESTAMP(ts) FROM t1;

+------+---------------------+--------------------+

| a    | ts                  | UNIX_TIMESTAMP(ts) |

+------+---------------------+--------------------+

|    1 | 2008-12-31 23:59:59 |         1230767999 |

|    2 | 2009-01-01 00:00:00 |         1230768000 |

+------+---------------------+--------------------+

2 rows in set (0.00 sec)

Sergei Petrunia added a comment - 2017-11-23 14:50 Somehow, I am unable to observe this leap second. I get Ubuntu 16.04.3 LTS, compile mysql-5.7, get the datadir from mysql-test/var/install.db, with non-empty mysql.time_zone* tables, and try to reproduce what is said at: https://dev.mysql.com/doc/refman/5.7/en/time-zone-leap-seconds.html . The docs say: mysql> -- values differ internally but display the same mysql> SELECT a, ts, UNIX_TIMESTAMP(ts) FROM t1; +------+---------------------+--------------------+ | a | ts | UNIX_TIMESTAMP(ts) | +------+---------------------+--------------------+ | 1 | 2008-12-31 23:59:59 | 1230767999 | | 2 | 2008-12-31 23:59:59 | 1230768000 | +------+---------------------+--------------------+ 2 rows in set (0.00 sec) while I get: mysql> SELECT a, ts, UNIX_TIMESTAMP(ts) FROM t1; +------+---------------------+--------------------+ | a | ts | UNIX_TIMESTAMP(ts) | +------+---------------------+--------------------+ | 1 | 2008-12-31 23:59:59 | 1230767999 | | 2 | 2009-01-01 00:00:00 | 1230768000 | +------+---------------------+--------------------+ 2 rows in set (0.00 sec)

Sergei Petrunia added a comment - 2017-11-24 12:34

Also discussed with bar.

Takeaways:

one needs to have TZ data loaded into the server in order to observe the leap second.
- Check mysql_tzinfo_to_sql, and --leap parameter.
- Note that not all timezone files have leap second information (use "file" to see which do)
DATE[TIME] type has no leap seconds. only TIMESTAMP does
TIMESTAMP has open bugs like ~~MDEV-13995~~
- It's better to not have this patch support TIMESTAMP for now. Let's do it just for DATE TIMEs.
- This means we dont have to support leap seconds.

Sergei Petrunia added a comment - 2017-11-24 12:34 Also discussed with bar . Takeaways: one needs to have TZ data loaded into the server in order to observe the leap second. Check mysql_tzinfo_to_sql, and --leap parameter. Note that not all timezone files have leap second information (use "file" to see which do) DATE [TIME] type has no leap seconds. only TIMESTAMP does TIMESTAMP has open bugs like MDEV-13995 It's better to not have this patch support TIMESTAMP for now. Let's do it just for DATE TIMEs. This means we dont have to support leap seconds.

Sergei Petrunia added a comment - 2017-11-28 13:35

Adjusted the patch according to the above: http://lists.askmonty.org/pipermail/commits/2017-November/011680.html

A test patch to also make rewrite function a member of Item: http://lists.askmonty.org/pipermail/commits/2017-November/011681.html

Sergei Petrunia added a comment - 2017-11-28 13:35 Adjusted the patch according to the above: http://lists.askmonty.org/pipermail/commits/2017-November/011680.html A test patch to also make rewrite function a member of Item: http://lists.askmonty.org/pipermail/commits/2017-November/011681.html

Sergei Petrunia added a comment - 2017-11-28 13:36

bar, please review.

Sergei Petrunia added a comment - 2017-11-28 13:36 bar , please review.

Oleg Smirnov added a comment - 2023-03-17 07:42

There are test failures looking like

main.sargable_date_cond                  w3 [ fail ]

        Test ended at 2023-03-16 04:35:43

CURRENT_TEST: main.sargable_date_cond

--- /usr/share/mysql/mysql-test/main/sargable_date_cond.result	2023-03-16 01:36:33.000000000 -0400

+++ /dev/shm/var/3/log/sargable_date_cond.reject	2023-03-16 04:35:42.953159711 -0400

@@ -1919,6 +1919,8 @@

 timestampadd(hour, B.a, date_add('2016-01-01', interval A.a*8 day)),

 date_add('2016-01-01', interval A.a*7 day)

 from t1 A, t0 B;

+Warnings:

+Warning	1299	Invalid TIMESTAMP value in column 'a' at row 11

that occur only on some machines and not on others. Turns out it depends on whether the system time zone employs daylight saving time.

If it does then calling timestampadd() can lead to execution of the following code

my_time.cc
/*
Fix that if we are in the non existing daylight saving time hour
we move the start of the next real hour.

This code doesn't handle such exotical thing as time-gaps whose length
is more than one hour or non-integer (latter can theoretically happen
if one of seconds will be removed due leap correction, or because of
general time correction like it happened for Africa/Monrovia time zone
in year 1972).
*/
if (loop == 2 && t->hour != (uint) l_time->tm_hour)
{
int days= t->day - l_time->tm_mday;
if (days < -1)
days=1; /* Month has wrapped */
else if (days > 1)
days= -1;
diff=(3600L(long) (days24+((int) t->hour - (int) l_time->tm_hour))+
(long) (60*((int) t->minute - (int) l_time->tm_min)) +
(long) ((int) t->second - (int) l_time->tm_sec));
if (diff == 3600)
tmp+=3600 - t->minute60 - t->second; / Move to next hour */
else if (diff == -3600)
tmp-=t->minute60 + t->second; / Move to previous hour */

*error_code= ER_WARN_INVALID_TIMESTAMP;
}

Oleg Smirnov added a comment - 2023-03-17 07:42 There are test failures looking like main.sargable_date_cond w3 [ fail ] Test ended at 2023-03-16 04:35:43 CURRENT_TEST: main.sargable_date_cond --- /usr/share/mysql/mysql-test/main/sargable_date_cond.result 2023-03-16 01:36:33.000000000 -0400 +++ /dev/shm/var/3/log/sargable_date_cond.reject 2023-03-16 04:35:42.953159711 -0400 @@ -1919,6 +1919,8 @@ timestampadd(hour, B.a, date_add('2016-01-01', interval A.a*8 day)), date_add('2016-01-01', interval A.a*7 day) from t1 A, t0 B; +Warnings: +Warning 1299 Invalid TIMESTAMP value in column 'a' at row 11 that occur only on some machines and not on others. Turns out it depends on whether the system time zone employs daylight saving time. If it does then calling timestampadd() can lead to execution of the following code my_time.cc /* Fix that if we are in the non existing daylight saving time hour we move the start of the next real hour. This code doesn't handle such exotical thing as time-gaps whose length is more than one hour or non-integer (latter can theoretically happen if one of seconds will be removed due leap correction, or because of general time correction like it happened for Africa/Monrovia time zone in year 1972). */ if (loop == 2 && t->hour != (uint) l_time->tm_hour) { int days= t->day - l_time->tm_mday; if (days < -1) days=1; /* Month has wrapped */ else if (days > 1) days= -1; diff=(3600L*(long) (days*24+((int) t->hour - (int) l_time->tm_hour))+ (long) (60*((int) t->minute - (int) l_time->tm_min)) + (long) ((int) t->second - (int) l_time->tm_sec)); if (diff == 3600) tmp+=3600 - t->minute*60 - t->second; /* Move to next hour */ else if (diff == -3600) tmp-=t->minute*60 + t->second; /* Move to previous hour */ *error_code= ER_WARN_INVALID_TIMESTAMP; }

Sergei Petrunia added a comment - 2023-03-20 08:59

https://github.com/MariaDB/server/tree/preview-11.1-mdev-8320

Sergei Petrunia added a comment - 2023-03-20 08:59 https://github.com/MariaDB/server/tree/preview-11.1-mdev-8320

Lena Startseva added a comment - 2023-04-04 09:38

At the moment, the task has been tested with cases for the SELECT, a fix of MDEV-30946 is needed to continue testing

Lena Startseva added a comment - 2023-04-04 09:38 At the moment, the task has been tested with cases for the SELECT, a fix of MDEV-30946 is needed to continue testing

Lena Startseva added a comment - 2023-04-25 05:06

Testing done. Ok to push

Lena Startseva added a comment - 2023-04-25 05:06 Testing done. Ok to push

Oleg Smirnov added a comment - 2023-04-25 16:08

Pushed to 11.1

Oleg Smirnov added a comment - 2023-04-25 16:08 Pushed to 11.1

MariaDB Server

Allow index usage for DATE(datetime_column) = const

Details

Description

Attachments

Issue Links

Activity

People

Dates

Git Integration