Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.10(EOL)
-
None
Description
Review comments for the 10.7-selectivity tree.
Request to fix the trace
commit 7b93762bb8a7ab9fbc89158db4025060531cd66e
|
Author: Monty <monty@mariadb.org>
|
Date: Sat Oct 9 16:16:10 2021 +0300
|
|
Update cost for hash and cached joins
|
This piece:
--- a/sql/opt_trace.cc
|
+++ b/sql/opt_trace.cc
|
@@ -677,6 +677,11 @@ void print_final_join_order(JOIN *join)
|
for (j= join->join_tab,i=0 ; i < join->top_join_tab_count;
|
i++, j++)
|
best_order.add_table_name(j);
|
+ best_order.end();
|
+
|
+ /* Write information about the resulting join */
|
+ Json_writer_object obj(join->thd, "best_access_method");
|
+ obj.add("cost", join->best_read);
|
}
|
makes the trace have this:
{
|
"best_join_order": ["<derived2>"],
|
"best_access_method": {
|
"cost": 2.799
|
}
|
},
|
Why do we need "best_access_method" ? I think just "cost" would be ok.
Also, rename "found_matching_rows_cost" to cond_check_cost?
Wrong code in distinct keys patch:
commit 5997f57a9d2d3a3bc30a38ccb815aceff1ce94dc
|
Author: Monty <monty@mariadb.org>
|
Date: Wed May 4 17:26:43 2022 +0300
|
|
Derived tables and union can now create distinct keys
|
|
The idea is that instead of marking all select_lex's with DISTINCT, we
|
only mark those that really need distinct result.
|
has
@@ -1677,6 +1678,13 @@ Item_in_subselect::Item_in_subselect(THD *thd, Item * left_exp,
|
Item_row(thd, static_cast<Item_row*>(left_exp));
|
func= &eq_creator;
|
init(select_lex, new (thd->mem_root) select_exists_subselect(thd, this));
|
+ select_lex->distinct= 1;
|
+ /*
|
+ If this is is 'xxx IN (SELECT ...) mark that the we are only interested in
|
+ unique values for the select
|
+ */
|
+ if (select_lex->first_inner_unit())
|
+ select_lex->first_inner_unit()->distinct= 1;
|
max_columns= UINT_MAX;
|
set_maybe_null();
|
reset(); |
touching select_lex->first_inner_unit() is incorrect!
Consider an example:
|
create table t10 (a int, b int, c int); |
insert into t10 select seq, seq, seq from seq_1_to_10; |
|
create table t11 as select * from t10; |
create table t12 as select * from t11; |
|
explain
|
select |
a in (select count(*) |
from |
t10,
|
(select a,b from t11) T |
where |
T.b=t10.b
|
group by |
t10.c
|
)
|
from
|
t10;
|
When we get to the select_lex->first_inner_unit()->distinct= 1; line, we have:
(gdb) p select_lex->master_unit()
|
$50 = (st_select_lex_unit *) 0x7fffa0019df0
|
(gdb) p select_lex->first_inner_unit()
|
$51 = (st_select_lex_unit *) 0x7fffa00188c0
|
|
(gdb) p select_lex->first_inner_unit()->first_select()->select_number
|
$53 = 3
|
|
(gdb) p select_lex->master_unit()->first_select()->select_number
|
$54 = 2
|
|
(gdb) p select_lex->first_inner_unit()->first_select()->join_list->elem(0)->alias
|
$62 = {str = 0x7fffa00181b0 "t11", length = 3}
|
That is, we're setting distinct=1 for the "(select a,b from t11)" !
maybe instead of
+ select_lex->first_inner_unit()->distinct= 1;
|
we should do
+ select_lex->master_unit()->distinct= 1;
|
If I remove the above line (in Item_in_subselect and Item_allany_subselect), I get test result changes in:
main.derived_cond_pushdown main.opt_tvc main.derived_opt main.myisam_explain_non_select_all main.table_value_constr main.cte_nonrecursive.
... Actually Item_exists_subselect::Item_exists subselect has the variant of the code I'm suggesting:
select_lex->distinct= 1;
|
select_lex->master_unit()->distinct= 1;
|
It was added by Monty'e earlier patch:
commit 5997f57a9d2d3a3bc30a38ccb815aceff1ce94dc
|
Author: Monty <monty@mariadb.org> Wed May 4 17:26:43 2022
|
Committer: Monty <monty@mariadb.org> Sun Jul 10 15:09:22 2022
|
|
Derived tables and union can now create distinct keys
|
A testcase showing new code causes wrong results: See attached _derived_unique.test.
A fix in convert_subq_to_sj won't work
Looking at the changed testcases, one could imagine that it would be reasonable to try setting select_lex->distinct=1 for the derived tables inside semi-join subqueries... alas, convert_subq_to_sj is called after the derived table was created.
(Why convert_subq_to_sj? because for mergeable semi-joins, we do know that one duplicate row of each table will be sufficient...)
Comment#2: what about UNIONs?
following the above, should we set distinct=1 for other UNION members?
Odd code in ha_partition::rndpos_time
commit 07ffb3abc1004a102c2c605c7c280913741d5d87
|
Author: Monty <monty@mariadb.org>
|
Date: Thu Aug 11 13:05:23 2022 +0300
|
|
TEMPORARY PUSH: Changing all cost calculation to be given in ms
|
+IO_AND_CPU_COST ha_partition::rndpos_time(ha_rows rows)
|
+ IO_AND_CPU_COST cost= m_file[i]->keyread_time(inx, ranges, rows_per_part,
|
+ blocks);
|
+ read_time.io+= cost.io;
|
+ read_time.cpu= cost.cpu;
|
+ } |
why is io added while cpu is not?
Attachments
Issue Links
- relates to
-
MDEV-23707 Fix condition selectivity computation for join prefixes
- Stalled
-
MDEV-29577 Rebase 10.7-selectivity tree
- Closed