[MDEV-5161] Wrong result (missing rows) with semijoin, LEFT JOIN, ORDER BY, constant table - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 10.0.4, 5.3.12, 5.5.33a
Fix Version/s: 5.5.35, 10.0.7
Component/s: None
Labels:
None

Description

The test case is nearly identical to the one provided in the comment to ~~MDEV-5059~~ (https://mariadb.atlassian.net/browse/MDEV-5059?focusedCommentId=35418&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-35418); the only difference is that table t3 lost the index, and the wrong result now affects the default join_cache_level=2.

Test case:

SET optimizer_switch = 'semijoin=on';

CREATE TABLE t1 (pk INT PRIMARY KEY, c1 VARCHAR(1)) ENGINE=MyISAM;

INSERT INTO t1 VALUES (1,'v'),(2,'v'),(3,'c'),(4,NULL),(5,'x');

CREATE TABLE t2 (c2 VARCHAR(1)) ENGINE=MyISAM;

INSERT INTO t2 VALUES ('x');

CREATE TABLE t3 (c3 VARCHAR(1)) ENGINE=MyISAM;

INSERT INTO t3 VALUES ('x'),('d');

SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1;

Actual result:

pk	c1	c2

4	NULL	x

Expected result:

pk	c1	c2

4	NULL	x

3	c	x

1	v	x

2	v	x

5	x	x

revision-id: bar@mnogosearch.org-20131016141313-63jt6geakz1e6dg2

revno: 3708

branch-nick: 5.3

EXPLAIN EXTENDED

SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1;

id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra

1	PRIMARY	t2	system	NULL	NULL	NULL	NULL	1	100.00	Using filesort

1	PRIMARY	t1	ALL	PRIMARY	NULL	NULL	NULL	5	100.00

1	PRIMARY	t1	eq_ref	PRIMARY	PRIMARY	4	test.t1.pk	1	100.00

1	PRIMARY	t3	ALL	NULL	NULL	NULL	NULL	2	100.00	Using where; Start temporary; End temporary

Warnings:

Note	1003	select `test`.`t1`.`pk` AS `pk`,`test`.`t1`.`c1` AS `c1`,'x' AS `c2` from `test`.`t1` semi join (`test`.`t3`) join `test`.`t1` join `test`.`t2` where (`test`.`t1`.`pk` = `test`.`t1`.`pk`) order by 'x',`test`.`t1`.`c1`

Attachments

Issue Links

is blocked by

MDEV-5293 outer join, join buffering, and order by - invalid query plan

Closed

relates to

MDEV-5059 Wrong result (missing row) wih semijoin, join_cache_level > 2, LEFT JOIN, ORDER BY

Closed

Activity

Ascending order - Click to sort in descending order

Elena Stepanova created issue - 2013-10-20 01:01

Elena Stepanova made changes - 2013-10-20 01:01

Field	Original Value	New Value
Link		This issue relates to TODO-60 [ TODO-60 ]

Sergei Petrunia made changes - 2013-11-13 17:33

Status

Open [ 1 ]

In Progress [ 3 ]

Sergei Petrunia added a comment - 2013-11-13 17:37

FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) )

EXPLAIN EXTENDED shows:

... from `j2`.`t1` semi join (`j2`.`t3`) join `j2`.`t1` where (`j2`.`t1`.`pk` = `j2`.`t1`.`pk`)

note that

t1 is pulled out of the semi-join,
t3 is not pulled out of the semi-join.
the LEFT JOIN from "t1 LEFT JOIN t3" is gone.

Sergei Petrunia added a comment - 2013-11-13 17:37 FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) EXPLAIN EXTENDED shows: ... from `j2`.`t1` semi join (`j2`.`t3`) join `j2`.`t1` where (`j2`.`t1`.`pk` = `j2`.`t1`.`pk`) note that t1 is pulled out of the semi-join, t3 is not pulled out of the semi-join. the LEFT JOIN from "t1 LEFT JOIN t3" is gone.

Sergei Petrunia added a comment - 2013-11-13 17:47

Another interesting observation is that table elimination could theoretically remove the LEFT JOIN from the subquery, but that didn't happen.

Sergei Petrunia added a comment - 2013-11-13 17:47 Another interesting observation is that table elimination could theoretically remove the LEFT JOIN from the subquery, but that didn't happen.

Sergei Petrunia added a comment - 2013-11-13 19:37

If I disallow pullout of parts of outer joins out of semi-joins, I get EXPLAIN like this:

+EXPLAIN EXTENDED
+SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1;
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 PRIMARY t2 system NULL NULL NULL NULL 1 100.00 Using filesort
+1 PRIMARY t1 ALL PRIMARY NULL NULL NULL 5 100.00
+1 PRIMARY t1 eq_ref PRIMARY PRIMARY 4 test.t1.pk 1 100.00 Start temporary
+1 PRIMARY t3 ALL NULL NULL NULL NULL 2 100.00 Using where; End temporary
+Warnings:
+Note 1003 select `test`.`t1`.`pk` AS `pk`,`test`.`t1`.`c1` AS `c1`,'x' AS `c2` from `test`.`t1` semi join (`test`.`t1` left join `test`.`t3` on((`test`.`t1`.`c1` = `test`.`t3`.`c3`))) where (`test`.`t1`.`pk` = `test`.`t1`.`pk`) order by 'x',`test`.`t1`.`c1`

The warning text shows that the LEFT JOIN is here. However, the query result is still incorrect.

Sergei Petrunia added a comment - 2013-11-13 19:37 If I disallow pullout of parts of outer joins out of semi-joins, I get EXPLAIN like this: +EXPLAIN EXTENDED +SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1; +id select_type table type possible_keys key key_len ref rows filtered Extra +1 PRIMARY t2 system NULL NULL NULL NULL 1 100.00 Using filesort +1 PRIMARY t1 ALL PRIMARY NULL NULL NULL 5 100.00 +1 PRIMARY t1 eq_ref PRIMARY PRIMARY 4 test.t1.pk 1 100.00 Start temporary +1 PRIMARY t3 ALL NULL NULL NULL NULL 2 100.00 Using where; End temporary +Warnings: +Note 1003 select `test`.`t1`.`pk` AS `pk`,`test`.`t1`.`c1` AS `c1`,'x' AS `c2` from `test`.`t1` semi join (`test`.`t1` left join `test`.`t3` on((`test`.`t1`.`c1` = `test`.`t3`.`c3`))) where (`test`.`t1`.`pk` = `test`.`t1`.`pk`) order by 'x',`test`.`t1`.`c1` The warning text shows that the LEFT JOIN is here. However, the query result is still incorrect.

Sergei Petrunia added a comment - 2013-11-13 21:14

Tracing this EXPLAIN

MariaDB [j2]> explain SELECT * FROM t1a, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1;
----------------------------------------------------------------------------------------------------------------------------------+

select_type

table

type

possible_keys

key

key_len

ref

rows

Extra

----------------------------------------------------------------------------------------------------------------------------------+

1	PRIMARY	t2	system	NULL	NULL	NULL	NULL	1	Using filesort
1	PRIMARY	t1a	ALL	PRIMARY	NULL	NULL	NULL	5
1	PRIMARY	t1	eq_ref	PRIMARY	PRIMARY	4	j2.t1a.pk	1	Start temporary
1	PRIMARY	t3	ALL	NULL	NULL	NULL	NULL	2	Using where; End temporary; Using join buffer (flat, BNL join)

----------------------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.00 sec)

it seems that SJ_TMP_TABLE::sj_weedout_check_row() is invoked correclty for
table t3. However, the check doesn't work: it checks rowid of table t1a, which
always has the same value.

Sergei Petrunia added a comment - 2013-11-13 21:14 Tracing this EXPLAIN MariaDB [j2] > explain SELECT * FROM t1a, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1; ----- ----------- ----- ------ ------------- ------- ------- --------- ---- ---------------------------------------------------------------+ id select_type table type possible_keys key key_len ref rows Extra ----- ----------- ----- ------ ------------- ------- ------- --------- ---- ---------------------------------------------------------------+ 1 PRIMARY t2 system NULL NULL NULL NULL 1 Using filesort 1 PRIMARY t1a ALL PRIMARY NULL NULL NULL 5 1 PRIMARY t1 eq_ref PRIMARY PRIMARY 4 j2.t1a.pk 1 Start temporary 1 PRIMARY t3 ALL NULL NULL NULL NULL 2 Using where; End temporary; Using join buffer (flat, BNL join) ----- ----------- ----- ------ ------------- ------- ------- --------- ---- ---------------------------------------------------------------+ 4 rows in set (0.00 sec) it seems that SJ_TMP_TABLE::sj_weedout_check_row() is invoked correclty for table t3. However, the check doesn't work: it checks rowid of table t1a, which always has the same value.

Sergei Petrunia added a comment - 2013-11-13 21:31

The issue doesn't depend on join buffering. I can get a wrong result when it is not used, too:
set optimizer_switch='materialization=off,mrr=on,mrr_sort_keys=on,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=off';
set join_cache_level=1;

MariaDB [j4]> explain SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1;
---------------------------------------------------------------------------------------------+

select_type

table

type

possible_keys

key

key_len

ref

rows

Extra

---------------------------------------------------------------------------------------------+

1	PRIMARY	t2	system	NULL	NULL	NULL	NULL	1	Using filesort
1	PRIMARY	t1	ALL	PRIMARY	NULL	NULL	NULL	5
1	PRIMARY	t1	eq_ref	PRIMARY	PRIMARY	4	j4.t1.pk	1	Start temporary
1	PRIMARY	t3	ALL	NULL	NULL	NULL	NULL	2	Using where; End temporary

---------------------------------------------------------------------------------------------+
4 rows in set (0.00 sec)

MariaDB [j4]> SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1;
------------

------------

NULL

------------
1 row in set (0.00 sec)

Sergei Petrunia added a comment - 2013-11-13 21:31 The issue doesn't depend on join buffering. I can get a wrong result when it is not used, too: set optimizer_switch='materialization=off,mrr=on,mrr_sort_keys=on,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=off'; set join_cache_level=1; MariaDB [j4] > explain SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1; ----- ----------- ----- ------ ------------- ------- ------- -------- ---- ---------------------------+ id select_type table type possible_keys key key_len ref rows Extra ----- ----------- ----- ------ ------------- ------- ------- -------- ---- ---------------------------+ 1 PRIMARY t2 system NULL NULL NULL NULL 1 Using filesort 1 PRIMARY t1 ALL PRIMARY NULL NULL NULL 5 1 PRIMARY t1 eq_ref PRIMARY PRIMARY 4 j4.t1.pk 1 Start temporary 1 PRIMARY t3 ALL NULL NULL NULL NULL 2 Using where; End temporary ----- ----------- ----- ------ ------------- ------- ------- -------- ---- ---------------------------+ 4 rows in set (0.00 sec) MariaDB [j4] > SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1; --- ---- ----- pk c1 c2 --- ---- ----- 4 NULL x --- ---- ----- 1 row in set (0.00 sec)

Sergei Petrunia added a comment - 2013-11-14 01:14

Without join buffering, the problem is still the same: SJ_TMP_TABLE::sj_weedout_check_row() checks rowid of table t1a (the outer t1), however, that table always has:

(gdb) p h->ref_length
$101 = 6
(gdb) x/6x h->ref
0x7fffb4c584b0: 0xff 0xff 0xff 0xff 0xff 0xff

and that's why the query produces only one row instead of five.
Note that the SJ-Weedout table is not degenerate:
(gdb) p is_degenerate
$102 = false

Sergei Petrunia added a comment - 2013-11-14 01:14 Without join buffering, the problem is still the same: SJ_TMP_TABLE::sj_weedout_check_row() checks rowid of table t1a (the outer t1), however, that table always has: (gdb) p h->ref_length $101 = 6 (gdb) x/6x h->ref 0x7fffb4c584b0: 0xff 0xff 0xff 0xff 0xff 0xff and that's why the query produces only one row instead of five. Note that the SJ-Weedout table is not degenerate: (gdb) p is_degenerate $102 = false

Sergei Petrunia added a comment - 2013-11-14 01:30

Ok, I think I get it. The problem seems to be caused by doing filesort on the first table. After that, join execution reads table rows with rr_unpack_from_buffer(). However, that function doesn't update the rowid at join->join_tab[1].table->ref.

Duplicate Weedout code sees the same rowid all the time, and concludes that all row combinations except for the first one are duplicates.

Sergei Petrunia added a comment - 2013-11-14 01:30 Ok, I think I get it. The problem seems to be caused by doing filesort on the first table. After that, join execution reads table rows with rr_unpack_from_buffer(). However, that function doesn't update the rowid at join->join_tab [1] .table->ref. Duplicate Weedout code sees the same rowid all the time, and concludes that all row combinations except for the first one are duplicates.

Sergei Petrunia added a comment - 2013-11-14 20:26

Two thoughts:

1. If the first table is doing a post-create_sort_index() read, we need to
provide either rowids of rows that are read, or some substitute (e.g. a counter)

2. Duplicate Elimination and create_sort_index.

2.1 Suppose there is no join buffering.
Then, there should be no reason to include first table's rowid in Duplicate
Elimination's weedout table. The first table is part of the prefix of outer
tables and should not be included.

2.2 Suppose, join buffering is used somewhere within the duplicate elimination
range. Then, all tables, including the very first one, should be included in
the Duplicate Elimination Weedout table.
However, join buffering destroys ordering. Why would we need to run
create_sort_index to produce ordered resultset but after that use join
buffering to destroy it?

Sergei Petrunia added a comment - 2013-11-14 20:26 Two thoughts: 1. If the first table is doing a post-create_sort_index() read, we need to provide either rowids of rows that are read, or some substitute (e.g. a counter) 2. Duplicate Elimination and create_sort_index. 2.1 Suppose there is no join buffering. Then, there should be no reason to include first table's rowid in Duplicate Elimination's weedout table. The first table is part of the prefix of outer tables and should not be included. 2.2 Suppose, join buffering is used somewhere within the duplicate elimination range. Then, all tables, including the very first one, should be included in the Duplicate Elimination Weedout table. However, join buffering destroys ordering. Why would we need to run create_sort_index to produce ordered resultset but after that use join buffering to destroy it?

Sergei Petrunia added a comment - 2013-11-14 20:59

Exploring thought#2.

When I try to debug the testcase that is not using the join buffering, I see
that setup_semijoin_dups_elimination()

concludes that join buffering will be used (and so assigns
first_table= join->const_tables)
calls join->get_sort_by_join_tab() which returns false, from where it
concludes that no sorting of first table will be done.

Sergei Petrunia added a comment - 2013-11-14 20:59 Exploring thought#2. When I try to debug the testcase that is not using the join buffering, I see that setup_semijoin_dups_elimination() concludes that join buffering will be used (and so assigns first_table= join->const_tables) calls join->get_sort_by_join_tab() which returns false, from where it concludes that no sorting of first table will be done.

Sergei Petrunia added a comment - 2013-11-14 22:25

The case where no join buffering is used could be solved with this approach:
After the optimizer has made final decisions about whether to use join buffering,
walk through the plans to do Duplicate Elimination and move the JOIN_TAB that flushes the temporary to be as late as possible.

Sergei Petrunia added a comment - 2013-11-14 22:25 The case where no join buffering is used could be solved with this approach: After the optimizer has made final decisions about whether to use join buffering, walk through the plans to do Duplicate Elimination and move the JOIN_TAB that flushes the temporary to be as late as possible.

Sergei Petrunia added a comment - 2013-11-14 22:28

What is not clear is what should we do when join buffering is used.

Look at the EXPLAIN in the comment for this bug here:
https://mariadb.atlassian.net/browse/MDEV-5161?focusedCommentId=35977&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-35977

It has both join buffering and "Using filesort" (which means: run filesort() for the first table, then do join with other tables). "Using join buffer" will break the ordering that was produced by filesort. Apparently, the bug is not [only] in Duplicate Elimination. join buffer and order by optimizations also work incorrectly.

Sergei Petrunia added a comment - 2013-11-14 22:28 What is not clear is what should we do when join buffering is used. Look at the EXPLAIN in the comment for this bug here: https://mariadb.atlassian.net/browse/MDEV-5161?focusedCommentId=35977&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-35977 It has both join buffering and "Using filesort" (which means: run filesort() for the first table, then do join with other tables). "Using join buffer" will break the ordering that was produced by filesort. Apparently, the bug is not [only] in Duplicate Elimination. join buffer and order by optimizations also work incorrectly.

Sergei Petrunia added a comment - 2013-11-15 00:50

Created a testcase for the problem mentioned in the previous comment. Filed it as ~~MDEV-5293~~.

Sergei Petrunia added a comment - 2013-11-15 00:50 Created a testcase for the problem mentioned in the previous comment. Filed it as MDEV-5293 .

Sergei Petrunia made changes - 2013-11-15 00:51

Link

This issue relates to ~~MDEV-5293~~ [ ~~MDEV-5293~~ ]

Sergei Golubchik made changes - 2013-11-15 18:14

Link

This issue is blocked by ~~MDEV-5293~~ [ ~~MDEV-5293~~ ]

Sergei Golubchik made changes - 2013-11-15 18:14

Link

This issue relates to ~~MDEV-5293~~ [ ~~MDEV-5293~~ ]

Sergei Golubchik made changes - 2013-11-15 18:16

Fix Version/s		5.5.35 [ 14000 ]
Fix Version/s	5.3.13 [ 12602 ]
Fix Version/s	5.5.34 [ 13700 ]

Sergei Petrunia added a comment - 2013-11-18 16:28

Now I have a patch for ~~MDEV-5293~~, continuing to work on this bug.

Sergei Petrunia added a comment - 2013-11-18 16:28 Now I have a patch for MDEV-5293 , continuing to work on this bug.

Sergei Petrunia added a comment - 2013-11-18 16:30

Interesting.. there is code in setup_semijoin_dups_elimination() which disables "Using filesort":

Fixed LP bug #901478.
If the duplicate elimination strategy is used for a semi-join and potentially
one of the block-based join algorithms can be employed to join the inner
tables of the semi-join then sorting of the head (first non-constant) table
for a query with ORDER BY / GROUP BY cannot be used.

It seems, a problem similar to this one was encountered before?

Sergei Petrunia added a comment - 2013-11-18 16:30 Interesting.. there is code in setup_semijoin_dups_elimination() which disables "Using filesort": Fixed LP bug #901478. If the duplicate elimination strategy is used for a semi-join and potentially one of the block-based join algorithms can be employed to join the inner tables of the semi-join then sorting of the head (first non-constant) table for a query with ORDER BY / GROUP BY cannot be used. It seems, a problem similar to this one was encountered before?

Sergei Petrunia added a comment - 2013-11-18 16:37

The fix looks like this:

Make sure that possible sorting of rows from the head table
is not to be employed.
*/
if (join->get_sort_by_join_tab())

{ join->simple_order= 0; join->simple_group= 0; join->need_tmp= join->test_if_need_tmp_table(); }

and it doesn't work for our testcase, because join->get_sort_by_join_tab()=NULL, which is caused by JOIN::sort_by_table being NULL.

Sergei Petrunia added a comment - 2013-11-18 16:37 The fix looks like this: Make sure that possible sorting of rows from the head table is not to be employed. */ if (join->get_sort_by_join_tab()) { join->simple_order= 0; join->simple_group= 0; join->need_tmp= join->test_if_need_tmp_table(); } and it doesn't work for our testcase, because join->get_sort_by_join_tab()=NULL, which is caused by JOIN::sort_by_table being NULL.

Sergei Golubchik made changes - 2013-11-18 21:52

Fix Version/s		10.0.7 [ 14100 ]
Fix Version/s	10.0.6 [ 13202 ]

Sergei Petrunia added a comment - 2013-11-19 14:34

I've made a fix that runs get_sort_by_table() after const. tables have been detected and read. This caused get_sort_by_join_tab() to return the first JOIN_TAB in this example.

Tests pass, including valgrind.

Sergei Petrunia added a comment - 2013-11-19 14:34 I've made a fix that runs get_sort_by_table() after const. tables have been detected and read. This caused get_sort_by_join_tab() to return the first JOIN_TAB in this example. Tests pass, including valgrind.

Sergei Petrunia made changes - 2013-11-21 13:32

Resolution		Fixed [ 1 ]
Status	In Progress [ 3 ]	Closed [ 6 ]

Sergei Petrunia made changes - 2013-11-21 15:44

Link

This issue relates to ~~MDEV-5059~~ [ ~~MDEV-5059~~ ]

Daniel Bartholomew added a comment - 2014-01-29 19:38

http://bazaar.launchpad.net/~maria-captains/maria/5.5/revision/3963.1.2

Daniel Bartholomew added a comment - 2014-01-29 19:38 http://bazaar.launchpad.net/~maria-captains/maria/5.5/revision/3963.1.2

Sergei Golubchik made changes - 2014-06-13 15:06

Workflow

defaullt [ 29331 ]

MariaDB v2 [ 43556 ]

Rasmus Johansson (Inactive) made changes - 2015-05-18 17:51

Workflow

MariaDB v2 [ 43556 ]

MariaDB v3 [ 62690 ]

Sergei Golubchik made changes - 2021-12-06 21:39

Workflow

MariaDB v3 [ 62690 ]

MariaDB v4 [ 147145 ]

People

Assignee:: Sergei Petrunia

Reporter:: Elena Stepanova

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2013-10-20 01:01

Updated:: 2014-01-29 19:38

Resolved:: 2013-11-21 13:32

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server

Details

Description

Attachments

Issue Links

Activity

People

Dates

Git Integration