|
In the function remove_duplicates st_join_table::remove_duplicates
List<Item> *fields= (this)->fields;
|
(gdb) p *this->fields
|
$29 = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x7fffd4013ed8, last = 0x7fffd4013fe0, elements = 2}, <No data fields>}
|
(gdb) p *(this-1)->fields
|
$27 = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x555556dbd9d8 <vtable for Window_spec+16>, last = 0xa5a5a5a5a5a5a501, elements = 0}, <No data fields>}
|
(gdb) p this->fields
|
$28 = (List<Item> *) 0x7fffd4013648
|
List_iterator<Item> it(*fields);
|
(gdb) p it
|
$20 = {<base_list_iterator> = {list = 0x7fffd4012660, el = 0x7fffd4012660, prev = 0x0, current = 0x0}, <No data fields>}
|
So here I see that we are referring to the incorrect fields by using (this-1)->fields
It looks that this->fields has the fields
|
|
Varun please review
|
|
http://lists.askmonty.org/pipermail/commits/2017-July/011312.html
|
|
Still getting a crash with the same stack trace on bb-10.2-mdev13355:
|
4bdf9318b6bab1bf8d3718c50f87871a4ae5ec9b
|
#3 <signal handler called>
|
#4 0x00007fce9ade0390 in st_join_table::remove_duplicates (this=0x7fce80013ac0) at /data/src/bb-10.2-mdev13355/sql/sql_select.cc:21792
|
#7 0x00007fce9add81f4 in sub_select_postjoin_aggr (join=0x7fce80013198, join_tab=0x7fce80013ac0, end_of_records=true) at /data/src/bb-10.2-mdev13355/sql/sql_select.cc:18270
|
#8 0x00007fce9add7d30 in do_select (join=0x7fce80013198, procedure=0x0) at /data/src/bb-10.2-mdev13355/sql/sql_select.cc:18101
|
#9 0x00007fce9adb2589 in JOIN::exec_inner (this=0x7fce80013198) at /data/src/bb-10.2-mdev13355/sql/sql_select.cc:3484
|
#10 0x00007fce9adb1a38 in JOIN::exec (this=0x7fce80013198) at /data/src/bb-10.2-mdev13355/sql/sql_select.cc:3279
|
#11 0x00007fce9adb2c01 in mysql_select (thd=0x7fce80000b00, tables=0x7fce800129f0, wild_num=0, fields=..., conds=0x7fce80013020, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748609, result=0x7fce80013178, unit=0x7fce800045e8, select_lex=0x7fce80004d20) at /data/src/bb-10.2-mdev13355/sql/sql_select.cc:3679
|
#12 0x00007fce9ada750c in handle_select (thd=0x7fce80000b00, lex=0x7fce80004520, result=0x7fce80013178, setup_tables_done_option=0) at /data/src/bb-10.2-mdev13355/sql/sql_select.cc:373
|
#13 0x00007fce9ad733ea in execute_sqlcom_select (thd=0x7fce80000b00, all_tables=0x7fce800129f0) at /data/src/bb-10.2-mdev13355/sql/sql_parse.cc:6443
|
#14 0x00007fce9ad69429 in mysql_execute_command (thd=0x7fce80000b00) at /data/src/bb-10.2-mdev13355/sql/sql_parse.cc:3458
|
#15 0x00007fce9ad76daa in mysql_parse (thd=0x7fce80000b00, rawbuf=0x7fce80012378 "SELECT DISTINCT MAX(i), ROW_NUMBER() OVER () FROM t1 WHERE 0", length=60, parser_state=0x7fce92960200, is_com_multi=false, is_next_command=false) at /data/src/bb-10.2-mdev13355/sql/sql_parse.cc:7879
|
#16 0x00007fce9ad64e56 in dispatch_command (command=COM_QUERY, thd=0x7fce80000b00, packet=0x7fce80163bd1 "", packet_length=60, is_com_multi=false, is_next_command=false) at /data/src/bb-10.2-mdev13355/sql/sql_parse.cc:1817
|
#17 0x00007fce9ad63797 in do_command (thd=0x7fce80000b00) at /data/src/bb-10.2-mdev13355/sql/sql_parse.cc:1362
|
#18 0x00007fce9aeaf549 in do_handle_one_connection (connect=0x7fce9e6e4a50) at /data/src/bb-10.2-mdev13355/sql/sql_connect.cc:1354
|
#19 0x00007fce9aeaf2d6 in handle_one_connection (arg=0x7fce9e6e4a50) at /data/src/bb-10.2-mdev13355/sql/sql_connect.cc:1260
|
#20 0x00007fce9b1f6984 in pfs_spawn_thread (arg=0x7fce9e6c7620) at /data/src/bb-10.2-mdev13355/storage/perfschema/pfs.cc:1862
|
#21 0x00007fce9a386064 in start_thread (arg=0x7fce92961700) at pthread_create.c:309
|
#22 0x00007fce9856b62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
|
CREATE TABLE t1 (i INT);
|
INSERT INTO t1 VALUES (1),(2);
|
SELECT DISTINCT MAX(i), ROW_NUMBER() OVER () FROM t1 WHERE 0;
|
Here is also a variation of the test case with a less obvious degenerative condition, might be worth checking too; stack trace is the same.
CREATE TABLE t1 (i INT NOT NULL);
|
INSERT INTO t1 VALUES (1),(2);
|
SELECT DISTINCT MAX(i), ROW_NUMBER() OVER () FROM t1 WHERE i IS NULL;
|
Both test cases crash with InnoDB and MyISAM.
|
|
Link to the patch:https://github.com/MariaDB/server/commit/bd45c4377329bd0873ee64a5e3f8beca7d8e7c49?diff=unified
obvious flaws in the patch:
- It mentions BUG#13353 while this bug is 13352.
- The MTR testcase doesn't print "MDEV-nnnn" like other testscases do
- Running mysql-test-run t/win.test after the patch fails like so
--- /home/psergey/dev-git/10.2-r5/mysql-test/r/win.result 2017-12-26 13:32:38.182427886 +0300
|
+++ /home/psergey/dev-git/10.2-r5/mysql-test/r/win.reject 2017-12-26 14:12:48.455733684 +0300
|
@@ -2782,7 +2782,7 @@
|
NULL NULL
|
select max(a), row_number() over () from t1 where 1 = 2;
|
max(a) row_number() over ()
|
-NULL 1
|
+NULL NULL
|
select max(a), sum(max(a)) over () from t1 where 1 = 2;
|
max(a) sum(max(a)) over ()
|
NULL NULL
|
This is easily reproducible
|
|
(this is not immediately clear from comments so I'll mention it explicitly: my patch has been pushed, but apparently, there were cases that it did not fix.
Next time, please open another issue and make it related).
Vesa's patch builds on top of mine. It fixes Elena's crash but causes regression in another testcase in win.test.
|
|
Thoughts:
(Elena's crash from July,21)
- remove_duplicates() assumes that there is a computation step done before the duplicate removal.
- If the query has "Impossible where", there is no previous step, that's why we get a crash.
- Q: why do we have duplicate removal step if we know there will be just one row? We could avoid it.
Wrong result after Vesa's patch
- when the query has zero_result_cause="impossible WHERE", we call return_zero_rows()
- which may still check HAVING and produce a result row, when appropriate.
- It doesn't compute window function values though. This is wrong. When "impossible where" produced one "NULL-row" after grouping, then we need Window Functions computation step to process this row and compute window function value.
- (Perhaps we need some special way to compute window functions that we will invoke in return_zero_rows? I am not sure if this makes sense.)
|
|
Proposed alternative fix:
diff --git a/sql/sql_select.cc b/sql/sql_select.cc
|
index 676b26e..e82e130 100644
|
--- a/sql/sql_select.cc
|
+++ b/sql/sql_select.cc
|
@@ -2654,7 +2654,15 @@ bool JOIN::make_aggr_tables_info()
|
curr_tab->having= having;
|
having->update_used_tables();
|
}
|
- curr_tab->distinct= true;
|
+ /*
|
+ We only need DISTINCT operation if the join is not degenerate.
|
+ If it is, we must not request DISTINCT processing, because
|
+ remove_duplicates() assumes there is a preceding computation step (and
|
+ in the degenerate join, there's none)
|
+ */
|
+ if (top_join_tab_count)
|
+ curr_tab->distinct= true;
|
+
|
having= NULL;
|
select_distinct= false;
|
}
|
|
|
So the problem with using distinct is that it assumes that we have a previous join_tab but with the case of having an IMPOSSIBLE WHERE. In normal cases we don't use distinct if we have an IMPOSSIBLE where clause. With window functions we do the execution so we have to check for distinct too and then we start referring to a previous tab which is not valid.
/* Number of tables actually joined at the top level */
|
uint exec_join_tab_cnt() { return tables_list ? top_join_tab_count : 0; }
|
curr_tab= join_tab + exec_join_tab_cnt();
|
So I agree with the patch psergey posted above.
|
|
Studying the question of what kinds of JOIN_TABs exist when remove_duplicates is used:
Example #1:
select distinct max(a) from t10 group by b;
|
join_tab[0]
|
table = {t10}
|
aggr= NULL
|
distinct= false
|
|
join_tab[1] (call it $join_tab_1)
|
table= <ha_heap object, $work_table_1>
|
aggr= pointer
|
distinct= true
|
remove_duplicates() will be called from join_init_read_record($join_tab_1) call.
That is, remove_duplicates is a "preparation" operation for a join_tab.
The join_tab's "primary" read loop will read the de-duplicated rows (and pass them to query output in the above example).
|
|
Example #2
select distinct row_number() over (order by a ) from t1;
|
join_tab[0]
|
table=t1
|
distinct= false
|
window_funcs_step= NULL
|
|
join_tab[1]
|
table= <temp_table>
|
window_funcs_step=<pointer>
|
distinct= true
|
Window function computation step is also a "pre-read" operation.
When a join tab has both distinct=true and window_funcs_step, window_funcs_step is performed first.
|
|
Example #3:
select max(b), row_number() over (order by a ) from t1 where 1 = 2 ;
|
join_tab[0]
|
// join->first_select = sub_select_postjoin_aggr
|
table= <temp. table>
|
window_funcs_step = <pointer>
|
The join_tab does not have a sub_select call. Execution starts directly with sub_select_postjoin_aggr call. Inside that call, a record is written into the temp. table. Then, window function computation step is performed.
select distinct max(b), row_number() over (order by a ) from t1 where 1 = 2 ;
|
has the same data structure as above but also join_tab[0].distinct=true.
|
|
http://lists.askmonty.org/pipermail/commits/2018-January/011841.html
|
|
As discussed on the previous optimizer call: Igor, please review.
|
|
Sergey,
It's OK to push your patch.
|
|
Fix pushed.
|