[MDEV-5293] outer join, join buffering, and order by - invalid query plan Created: 2013-11-15  Updated: 2014-01-29  Resolved: 2013-11-21

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.5, 5.3.12, 5.5.33a
Fix Version/s: 5.5.35, 10.0.7, 5.3.13

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocks
blocks MDEV-5161 Wrong result (missing rows) with semi... Closed
Relates
relates to MDEV-5059 Wrong result (missing row) wih semijo... Closed

 Description   

A combination of outer join, join buffering, and order by causes invalid query plans to be generated.

Test dataset:

create table t0 (a int primary key) engine=myisam;
insert into t0 values (1);
 
create table t1(a int) engine=myisam;
insert into t1 values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
alter table t1 add b int;
 
create table t2 like t1;
insert into t2 select * from t1;

Query and EXPLAIN:

explain select * from t0,t1 left join t2 on t1.b=t2.b order by t0.a, t1.a;
 
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	t0	system	NULL	NULL	NULL	NULL	1	Using filesort
1	SIMPLE	t1	ALL	NULL	NULL	NULL	NULL	10	
1	SIMPLE	t2	ALL	NULL	NULL	NULL	NULL	10	Using where; Using join buffer (flat, BNL join)

The query plan uses both "Using filesort" and "Using join buffer". It will produce the results in the wrong order.

I'll elaborate why:

  • The query has ORDER BY and so must produce rows in order.
    "Using filesort" means that the first table (t1 in the example) is read by filesort(), which produces the sorted sequence, which is then joined to table t2.
  • However, table t2 uses join buffer, which will break the ordering created by filesort.


 Comments   
Comment by Sergei Petrunia [ 2013-11-15 ]

Analysis:

The optimizer protects itself from getting "Using join buffer" together with
"Using filesort" by a convoluted mess of logic in various locations.

General approach is:

  • The choice whether to use join buffering does not depend on ORDER/GROUP BY.
  • On the other hand, ORDER/GROUP BY code checks whether join buffering is
    needed. If yes, it will not use "Using filesort".

== Piece #1 ==

At the end of make_join_readinfo() there is a piece of code that starts
with this comment:

If a join buffer is used to join a table the ordering by an index
for the first non-constant table cannot be employed anymore.

The code there changes "filesort on the first table" plan into a
"Using filesort; Using temporary" plan.

For our example, the change will not be done, because sort_by_tab==NULL,
which is caused by join->get_sort_by_join_tab()==NULL (which I think is
incorrect).

join->get_sort_by_join_tab()==NULL, because JOIN::sort_by_table==NULL. It is
set by this call in JOIN::optimize:

sort_by_table= get_sort_by_table(order, group_list, select_lex->leaf_tables);

Locally, it makes sense - when get_sort_by_table() is called, it sees
"ORDER BY t0.a, t1.a", the optimizer doesn't yet know that t0.a is a constant,
hence sort_by_table=NULL.

== Piece #2 ==
JOIN::optimize also has this piece of code:

// Can't use sort on head table if using join buffering
if (full_join || hash_join)

{ ... }

It switches to using "Using temporary; Using filesort" when
JOIN::full_join=TRUE.

JOIN::full_join is set by get_best_combination(), it will receive a value of
true if the join plan has a table that has:

join->best_positions[tablenr].use_join_buffer==true

(the actual condition is more complex, but that's the idea).

In turn, the value of POSITION::use_join_buffer comes from best_access_path().
The code in best_access_path has:

best_uses_jbuf= test(!disable_jbuf && !((s->table->map &
join->outer_join)));

...
pos->use_join_buffer= best_uses_jbuf;

Note that best_uses_jbuf==FALSE when the table is an inner table of outer join.
(I guess, this is old code. Before BKA was introduced, join buffering couldn't
be used with outer joins. This is no longer the case, but the old logic is
still there)

Anyhow, if this example didn't have LEFT JOIN, Piece#2 would have taken care of
switching to "Using temporary; Using filesort". Bot with left join, it didn't.

Comment by Sergei Petrunia [ 2013-11-18 ]

Committed a patch.

Comment by Daniel Bartholomew [ 2014-01-29 ]

http://bazaar.launchpad.net/~maria-captains/maria/5.5/revision/3963.1.1

Generated at Thu Feb 08 07:03:09 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.