Details
-
Task
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
None
-
10.2.0-4, 10.2.0-5
Description
<SanjaByelkin> instead of creating new JOIN_TABs array (i.e. new plan) to process results in temporary table (GROUPING ORDERING)
<SanjaByelkin> just additional JOIN_TABs added at the end which represent such operation
<SanjaByelkin> so you never rewrite it
<SanjaByelkin> it is backport from MySQL
<SanjaByelkin> so you can add any number of postprocessing operations
<SanjaByelkin> and he need it for window functions
Attachments
Issue Links
Activity
-2 DERIVED t1 ALL NULL NULL NULL NULL 2 100.00 Using temporary; Using filesort
|
+2 DERIVED t1 ALL NULL NULL NULL NULL 2 100.00 Using temporary
|
All of the test failures that miss "Using filesort" that I saw are for derived tables.
No idea about the cause, so far.
I fixed the above problem removing the following code from make_aggr_tables_info():
if (!only_const_tables() &&
|
!join_tab[const_tables].table->sort.io_cache)
|
{
|
/*
|
If no IO cache exists for the first table then we are using an
|
INDEX SCAN and no filesort. Thus we should not remove the sorted
|
attribute on the INDEX SCAN.
|
*/
|
skip_sort_order= true;
|
}
|
Note: the 10.1 tree that the patch was based on already had failures in these tests:
Failing test(s): main.system_mysql_db_fix40123 main.system_mysql_db_fix50030 main.system_mysql_db_fix50117 main.openssl_6975 main.events_1
|
Spent more time reading the code..
Some observations:
- The patch is based on the revision of 10.1 that was before that I got ANALYZE statement feature to sort-of-work for GROUP/ORDER BY constructs.
- Because of the above, it is tempting to do the merge with current 10.1, but it is difficult to do that when so many test failures are present. I guess we will need to fix the failures first and then do the merge.
- MariaDB already has a subset of what this MDEV does in JOIN::pre_sort_join_tab. This will need to be removed.
igor, sorry I've missed your above comment re
I fixed the above problem removing the following code from make_aggr_tables_info()
I've just pushed a fix which I believe is a superset of your fix.
Pushed a fix for group_by.test failure. EXPLAIN in the old code showed wrong query plan.
Whether the optimizer should run filesort in such cases is another question. It does run it in current 10.1, pre-MDEV-8646 10.1, and 10.1-MDEV-8646. I am not sure if MySQL has fixed this problem. If they did, I assume we don't want to port that fix within the scope of this MDEV.
Pushed a fix for derived_opt.test There is another problem with Distinct; not being shown.
Sergey,
I pushed several fixes on January 4 2016: pull them please into your local tree.
Ok, currrent tree fails the following tests:
main.innodb_ext_key
|
main.show_explain
|
main.information_schema
|
main.show_explain_non_select
|
main.system_mysql_db_fix40123
|
main.system_mysql_db_fix50030
|
main.system_mysql_db_fix50117
|
main.union
|
main.analyze_stmt_slow_query_log
|
main.openssl_6975
|
main.myisam
|
main.limit
|
main.limit_rows_examined
|
main.func_group
|
main.myisam_explain_non_select_all
|
main.select_found
|
main.select_jcl6
|
main.subselect_cache
|
main.subselect_mat
|
main.subselect_mat_cost_bugs
|
main.subselect_sj_mat
|
main.subselect_no_exists_to_in
|
main.subselect_sj_nonmerged
|
main.user_var
|
main.analyze_format_json
|
main.analyze_stmt
|
main.delete_returning
|
main.derived_view
|
main.distinct
|
main.events_1
|
Some of these fail in the base tree also.
I've investigated the crash in func_group.test. This patch fixes it:
diff --git a/sql/sql_select.cc b/sql/sql_select.cc
|
index 1ee63cf..a16f719 100644
|
--- a/sql/sql_select.cc
|
+++ b/sql/sql_select.cc
|
@@ -19298,7 +19298,8 @@ enum_nested_loop_state
|
{
|
if (join->do_send_rows)
|
{
|
- error=join->result->send_data(*fields) ? 1 : 0;
|
+ //psergey-fix: error=join->result->send_data(*fields) ? 1 : 0;
|
+ error=join->result->send_data(*fields);
|
if (error < 0)
|
{
|
/* Duplicate row, don't count */
|
git history shows that the line
error=join->result->send_data(*fields) ? 1 : 0;
|
was introduced in
commit 33dc737f07d4a5c4d97a9f05953dc8f7c94f0f94
|
Author: Igor Babaev <igor@askmonty.org>
|
Date: Mon Dec 21 10:27:34 2015 -0800
|
|
MDEV-8646: Refactor the code for working tables
|
Both the original 10.1 tree and mysql-5.7 have just {{error = }}
The code below, if (error < 0) apparently does not make sense with the new code. igor, what do you think?
I'm also looking why union.test fails an assertion thd->status_var.memory_used != 0.
It fails in this line
EXPLAIN EXTENDED
|
SELECT * FROM t1 UNION SELECT * FROM t1
|
ORDER BY MATCH(a) AGAINST ('+abc' IN BOOLEAN MODE);
|
And can be fixed as follows:
diff --git a/sql/sql_union.cc b/sql/sql_union.cc
|
index 65b63c4..f6e4d48 100644
|
--- a/sql/sql_union.cc
|
+++ b/sql/sql_union.cc
|
@@ -1024,9 +1024,9 @@ bool st_select_lex_unit::cleanup()
|
JOIN *join;
|
if ((join= fake_select_lex->join))
|
{
|
- join->tables_list= 0;
|
- join->table_count= 0;
|
- join->top_join_tab_count= 0;
|
+ //join->tables_list= 0;
|
+ //join->table_count= 0; //psergey: this is where the problem is!
|
+ //join->top_join_tab_count= 0;
|
}
|
error|= fake_select_lex->cleanup();
|
/*
|
I don't have a lot of confidence in this change, but it seems not to introduce any new test failures. I'll continue looking at other failures.
Looking at the crash in delete_returning.test
It happens in this query:
DELETE FROM t1 ORDER BY i1 RETURNING ( SELECT i2 FROM t2 );
|
We crash when executing the subquery in the RETURNING part. We crash,
because DELETE execution first destroys the join from here:
Breakpoint 2, JOIN::cleanup (this=0x7fff980091d0, full=true) at /home/psergey/dev-git/10.1-mdev8646.fly/sql/sql_select.cc:11725
|
#1 0x0000555555a2f26b in JOIN::destroy (this=0x7fff980091d0) at /home/psergey/dev-git/10.1-mdev8646.fly/sql/sql_select.cc:3216
|
#2 0x0000555555abfd04 in st_select_lex::cleanup (this=0x7fff98007c70) at /home/psergey/dev-git/10.1-mdev8646.fly/sql/sql_union.cc:1165
|
#3 0x0000555555abf88c in st_select_lex_unit::cleanup (this=0x7fff98007ff8) at /home/psergey/dev-git/10.1-mdev8646.fly/sql/sql_union.cc:1020
|
#4 0x0000555555a5f9f9 in free_underlaid_joins (thd=0x555557fd7950, select=0x555557fdbdc0) at /home/psergey/dev-git/10.1-mdev8646.fly/sql/sql_select.cc:22851
|
#5 0x0000555555d7b133 in mysql_delete (thd=0x555557fd7950, table_list=0x7fff98007528, conds=0x0, order_list=0x555557fdbff8, limit=18446744073709551615, options=0, result=0x7fff98008fd8) at /home/psergey/dev-git/10.1-mdev8646.fly/sql/sql_delete.cc:494
|
but after that tries to execute it from here:
#0 JOIN::exec (this=0x7fff980091d0) at /home/psergey/dev-git/10.1-mdev8646.fly/sql/sql_select.cc:2987
|
#1 0x0000555555cc0341 in subselect_single_select_engine::exec (this=0x7fff98008f70) at /home/psergey/dev-git/10.1-mdev8646.fly/sql/item_subselect.cc:3695
|
#2 0x0000555555cb7097 in Item_subselect::exec (this=0x7fff98008e30) at /home/psergey/dev-git/10.1-mdev8646.fly/sql/item_subselect.cc:668
|
#3 0x0000555555cb894d in Item_singlerow_subselect::val_int (this=0x7fff98008e30) at /home/psergey/dev-git/10.1-mdev8646.fly/sql/item_subselect.cc:1263
|
#4 0x0000555555c38903 in Item::send (this=0x7fff98008e30, protocol=0x555557fd7ee0, buffer=0x7ffff4640a20) at /home/psergey/dev-git/10.1-mdev8646.fly/sql/item.cc:6404
|
#5 0x0000555555939362 in Protocol::send_result_set_row (this=0x555557fd7ee0, row_items=0x555557fdbed8) at /home/psergey/dev-git/10.1-mdev8646.fly/sql/protocol.cc:906
|
#6 0x00005555559aec51 in select_send::send_data (this=0x7fff98008fd8, items=...) at /home/psergey/dev-git/10.1-mdev8646.fly/sql/sql_class.cc:2771
|
#7 0x0000555555d7b57e in mysql_delete (thd=0x555557fd7950, table_list=0x7fff98007528, conds=0x0, order_list=0x555557fdbff8, limit=18446744073709551615, options=0, result=0x7fff98008fd8) at /home/psergey/dev-git/10.1-mdev8646.fly/sql/sql_delete.cc:562
|
I suppose, before MDEV-8646 code, the subquery was able to survive cleanups?
Pushed a (very simple) fix for analyze*.test and json.test. List of failing tests now:
There are now 25 tests failing:
main.innodb_ext_key
|
main.information_schema
|
main.union
|
main.show_explain
|
main.openssl_6975
|
main.myisam
|
main.show_explain_non_select
|
main.system_mysql_db_fix40123
|
main.system_mysql_db_fix50030
|
main.system_mysql_db_fix50117
|
main.limit
|
main.limit_rows_examined
|
main.func_group
|
main.select_found
|
main.user_var
|
main.delete_returning
|
main.distinct
|
main.events_1
|
main.derived_view
|
Test failures yesterday:
<igor3> main.stat_tables_par_innodb
|
main.show_explain
|
main.show_explain_non_select
|
main.system_mysql_db_fix40123
|
main.system_mysql_db_fix50030
|
main.system_mysql_db_fix50117
|
main.union
|
main.delete_returning
|
main.events_1
|
Test failures right now: "Failed 12/882 tests",
main.show_explain
|
main.show_explain_non_select
|
main.system_mysql_db_fix40123
|
main.system_mysql_db_fix50030
|
main.system_mysql_db_fix50117
|
main.openssl_6975
|
main.events_1
|
Pushed a fix for SHOW EXPLAIN tests. Now:
Completed: Failed 8/881 tests, 99.09% were successful.
|
Failing test(s):
|
main.system_mysql_db_fix40123
|
main.system_mysql_db_fix50030
|
main.system_mysql_db_fix50117
|
main.openssl_6975
|
main.events_1
|
That is, all remaining test failures are test failures that were present in the 10.1 tree that this tree was based on.
Today I pushed consolidated patch for this task into the 10.2-mdev8646 tree.
Observations from looking at the buildbot
http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.2-mdev8646
- show_explain failed on windows. the failure is not always repoducible
- maria.maria fails due to missing "Distinct" in EXPLAIN output (1), (2), other
builds
- Some mroonga tests fail on on sol10-64, not sure what the cause is. (1)
- sequence.group_by Fails because we didn't merge "Query pushdown" code
correctly. EXPLAIN changes from "Storage engine handles GROUP BY" to a
regular plan. (Q: what does EXPLAIN FORMAT=JSON show for such queries?) (1),
(2), other builds
(1) http://buildbot.askmonty.org/buildbot/builders/sol10-64/builds/4747/steps/test/logs/stdio
(2) http://buildbot.askmonty.org/buildbot/builders/labrador/builds/6605/steps/test/logs/stdio
Grepping in the current code for 'original_join_tab', I get:
sql_select.h|1179| JOIN_TAB *original_join_tab;
|
sql_select.h|1475| original_join_tab= 0;
|
sql_select.cc|11719| if (original_join_tab)
|
sql_select.cc|11722| join_tab= original_join_tab;
|
sql_select.cc|11723| original_join_tab= 0;
|
This doesn't make any sense (original_join_tab==NULL always, what's the point?)
It did make sense before the merge, because there was also this piece of code:
/* Remember information about the original join */
|
original_join_tab= join_tab;
|
original_table_count= table_count;
|
|
/* Set up one join tab to get sorting to work */
|
const_tables= 0;
|
table_count= 1;
|
join_tab= (JOIN_TAB*) thd->calloc(sizeof(JOIN_TAB));
|
join_tab[0].table= exec_tmp_table1;
|
We probably don't need this "JOIN smashing" in the post-MDEV-8626 codebase, but then we don't need JOIN::original_join_tab, either.
I get this warning:
|| /home/psergey/dev-git/10.2-mdev8646-igor/sql/sql_select.cc: In member function ‘bool JOIN::make_aggr_tables_info()’:
|
sql_select.cc|2085 col 8| warning: variable ‘materialize_join’ set but not used [-Wunused-but-set-variable]
|
can materialize_join be removed?
Need to also figure out JOIN::do_select_call_count.
In 10.2-main, JOIN::exec_inner() can make multiple do_select calls for one query execution, and it also sets curr_join->do_select_call_count= 0; at start.
After the merge
- JOIN::exec_inner has only one invocation of do_select().
- We don't reset do_select_call_count when starting join execution.
This is wrong.
/*
|
For "Using temporary+Using filesort" queries, JOIN::join_tab can point to
|
either:
|
1. array of join tabs describing how to run the select, or
|
2. array of single join tab describing read from the temporary table.
|
|
SHOW EXPLAIN code needs to read/show #1. This is why two next members are
|
there for saving it.
|
*/
|
JOIN_TAB *table_access_tabs;
|
uint top_table_access_tabs_count;
|
These members in class JOIN are probably not needed anymore
GROUP BY pushdown code requires that a temporary table is created (although with do_not_open=1) even if the query itself doesn't need it.
Example:
SELECT SUM(1) FROM seq_1_to_10;
|
This crashes if group_by_handler::table is not set.
Pushed a fix for maria.maria.
Buildbot shows that valgrind tests crash on start:
http://buildbot.askmonty.org/buildbot/builders/work-amd64-valgrind/builds/8348
sql/sql_base.cc:7907 is
DBUG_PRINT("enter", ("ref_pointer_array: %p", ref_pointer_array));
|
should be easy to fix
A remainder of the old code:
static bool
|
make_group_fields(JOIN *main_join, JOIN *curr_join)
|
The code was pushed into bb-10.1-mdev8646 branch. The branch is based on 10.1, although the MDEV is targeted at 10.2