[MDEV-6706] Wrong result (missing rows) with joins, SQ, ORDER BY, semijoin=on - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 5.3.12, 5.5.39, 10.0.13, 10.1.30
Fix Version/s: 5.5.59
Component/s: Optimizer
Labels:
None

Sprint:
5.5.59

Description

Test case:

CREATE TABLE t1 (f1 VARCHAR(3)) ENGINE=MyISAM;

INSERT INTO t1 VALUES ('foo');

CREATE TABLE t2 (f2 VARCHAR(3)) ENGINE=MyISAM;

INSERT INTO t2 VALUES ('bar'),('baz');

CREATE TABLE t3 (i3_key INT, f3_key VARCHAR(3), f3 VARCHAR(3), KEY(f3_key,i3_key)) ENGINE=MyISAM;

INSERT INTO t3 VALUES (0,'qux','qux'),(8,'bar','bar');

SELECT CONCAT( f1, f2 ) AS field FROM t1, t2 WHERE f1 = ANY (

  SELECT f1

  FROM t1 LEFT JOIN ( t3 AS t3a, t3 AS t3b ) ON ( t3b.f3_key = t3a.f3 )

  WHERE t3a.f3 < f1 OR t3b.f3 != f1

ORDER BY field;

Actual result:

field

foobar

Result without ORDER BY or with semijoin=off:

field

foobar

foobaz

The problem appeared in 5.3 tree with the following revision:

revno: 3604 [merge]

revision-id: igor@askmonty.org-20121122183039-dm575rqkw40558qp

parent: sergii@pisem.net-20121117155015-4ab41ncach4iavao

parent: igor@askmonty.org-20121122055504-vdyz74a9d7n48x0o

committer: Igor Babaev <igor@askmonty.org>

branch nick: maria-5.3

timestamp: Thu 2012-11-22 10:30:39 -0800

message:

  Merge

    ------------------------------------------------------------

    revno: 3602.1.1

    revision-id: igor@askmonty.org-20121122055504-vdyz74a9d7n48x0o

    parent: knielsen@knielsen-hq.org-20121120125749-l55gvliervi5nnaf

    committer: Igor Babaev <igor@askmonty.org>

    branch nick: maria-5.3-mdev645

    timestamp: Wed 2012-11-21 21:55:04 -0800

    message:

      Fixed LP bug #1002146 (bug mdev-645).

      If the setting of system variables does not allow to use join buffer

      for a join query with GROUP BY <f1,...> / ORDER BY <f1,...> then

      filesort is not needed if the first joined table is scanned in

      the order compatible with order specified by the list <f1,...>.

Attachments

Activity

Ascending order - Click to sort in descending order

Elena Stepanova created issue - 2014-09-06 12:10

Elena Stepanova made changes - 2014-09-06 12:11

Field	Original Value	New Value
Description	Test case: {code:sql} CREATE TABLE t1 (f1 VARCHAR(3)) ENGINE=MyISAM; INSERT INTO t1 VALUES ('foo'); CREATE TABLE t2 (f2 VARCHAR(3)) ENGINE=MyISAM; INSERT INTO t2 VALUES ('bar'),('baz'); CREATE TABLE t3 (i3_key INT, f3_key VARCHAR(3), f3 VARCHAR(3), KEY(f3_key,i3_key)) ENGINE=MyISAM; INSERT INTO t3 VALUES (0,'qux','qux'),(8,'bar','bar'); SELECT CONCAT( f1, f2 ) AS field FROM t1, t2 WHERE f1 = ANY ( SELECT f1 FROM t1 LEFT JOIN ( t3 AS t3a, t3 AS t3b ) ON ( t3b.f3_key = t3a.f3 ) WHERE t3a.f3 < f1 OR t3b.f3 != f1 ) ORDER BY field; {code} Actual result: {code:sql} SELECT CONCAT( f1, f2 ) AS field FROM t1, t2 WHERE f1 = ANY ( SELECT f1 FROM t1 LEFT JOIN ( t3 AS t3a, t3 AS t3b ) ON ( t3b.f3_key = t3a.f3 ) WHERE t3a.f3 < f1 OR t3b.f3 != f1 ) ORDER BY field; field foobar {code} Result without ORDER BY: {code:sql} SELECT CONCAT( f1, f2 ) AS field FROM t1, t2 WHERE f1 = ANY ( SELECT f1 FROM t1 LEFT JOIN ( t3 AS t3a, t3 AS t3b ) ON ( t3b.f3_key = t3a.f3 ) WHERE t3a.f3 < f1 OR t3b.f3 != f1 ); field foobar foobaz {code} The problem appeared in 5.3 tree with the following revision: {noformat} revno: 3604 [merge] revision-id: igor@askmonty.org-20121122183039-dm575rqkw40558qp parent: sergii@pisem.net-20121117155015-4ab41ncach4iavao parent: igor@askmonty.org-20121122055504-vdyz74a9d7n48x0o committer: Igor Babaev <igor@askmonty.org> branch nick: maria-5.3 timestamp: Thu 2012-11-22 10:30:39 -0800 message: Merge ------------------------------------------------------------ revno: 3602.1.1 revision-id: igor@askmonty.org-20121122055504-vdyz74a9d7n48x0o parent: knielsen@knielsen-hq.org-20121120125749-l55gvliervi5nnaf committer: Igor Babaev <igor@askmonty.org> branch nick: maria-5.3-mdev645 timestamp: Wed 2012-11-21 21:55:04 -0800 message: Fixed LP bug #1002146 (bug mdev-645). If the setting of system variables does not allow to use join buffer for a join query with GROUP BY <f1,...> / ORDER BY <f1,...> then filesort is not needed if the first joined table is scanned in the order compatible with order specified by the list <f1,...>. {noformat}	Test case: {code:sql} CREATE TABLE t1 (f1 VARCHAR(3)) ENGINE=MyISAM; INSERT INTO t1 VALUES ('foo'); CREATE TABLE t2 (f2 VARCHAR(3)) ENGINE=MyISAM; INSERT INTO t2 VALUES ('bar'),('baz'); CREATE TABLE t3 (i3_key INT, f3_key VARCHAR(3), f3 VARCHAR(3), KEY(f3_key,i3_key)) ENGINE=MyISAM; INSERT INTO t3 VALUES (0,'qux','qux'),(8,'bar','bar'); SELECT CONCAT( f1, f2 ) AS field FROM t1, t2 WHERE f1 = ANY ( SELECT f1 FROM t1 LEFT JOIN ( t3 AS t3a, t3 AS t3b ) ON ( t3b.f3_key = t3a.f3 ) WHERE t3a.f3 < f1 OR t3b.f3 != f1 ) ORDER BY field; {code} Actual result: {code:sql} field foobar {code} Result without {{ORDER BY}} or with {{semijoin=off}}: {code:sql} field foobar foobaz {code} The problem appeared in 5.3 tree with the following revision: {noformat} revno: 3604 [merge] revision-id: igor@askmonty.org-20121122183039-dm575rqkw40558qp parent: sergii@pisem.net-20121117155015-4ab41ncach4iavao parent: igor@askmonty.org-20121122055504-vdyz74a9d7n48x0o committer: Igor Babaev <igor@askmonty.org> branch nick: maria-5.3 timestamp: Thu 2012-11-22 10:30:39 -0800 message: Merge ------------------------------------------------------------ revno: 3602.1.1 revision-id: igor@askmonty.org-20121122055504-vdyz74a9d7n48x0o parent: knielsen@knielsen-hq.org-20121120125749-l55gvliervi5nnaf committer: Igor Babaev <igor@askmonty.org> branch nick: maria-5.3-mdev645 timestamp: Wed 2012-11-21 21:55:04 -0800 message: Fixed LP bug #1002146 (bug mdev-645). If the setting of system variables does not allow to use join buffer for a join query with GROUP BY <f1,...> / ORDER BY <f1,...> then filesort is not needed if the first joined table is scanned in the order compatible with order specified by the list <f1,...>. {noformat}

Rasmus Johansson (Inactive) made changes - 2015-05-18 17:51

Workflow

MariaDB v2 [ 53738 ]

MariaDB v3 [ 64799 ]

Igor Babaev (Inactive) made changes - 2017-05-14 17:36

Assignee

Sergei Petrunia [ psergey ]

Igor Babaev [ igor ]

Igor Babaev (Inactive) made changes - 2018-01-09 17:45

Status

Open [ 1 ]

In Progress [ 3 ]

Sergei Golubchik made changes - 2018-01-10 10:15

Sprint

5.5.59 [ 221 ]

Igor Babaev (Inactive) added a comment - 2018-01-12 06:26 - edited

The code of get_sort_by_table() that checks that ordering can be done using fields of only one table

  while ((table= ti++) && !(map & table->table->map)) ;

  if (map != table->table->map)

    DBUG_RETURN(0);                             // More than one table

was incorrect. The code uses the previously calculated value of 'map' containing table_map bits for all tables the order by items depend on. It includes bits for all const (single_row) tables that should be excluded before the the loop in the above code starts.

  map&= ~const_tables;

  while ((table= ti++) && !(map & table->table->map)) ;

  if (map != table->table->map)

    DBUG_RETURN(0);                             // More than one table

The function get_sort_by_table() is called in make_join_statistics() to set the value of JOIN::sort_by_table.
Later this value can be used by the function setup_semijoin_dups_elimination() in the following code. (The function JOIN::JOIN_TAB *get_sort_by_join_tab() returns 0 if sort_by_table set to 0.)

/*

              Make sure that possible sorting of rows from the head table

              is not to be employed.

*/

            if (join->get_sort_by_join_tab())

              join->simple_order= 0;

              join->simple_group= 0;

              join->need_tmp= join->test_if_need_tmp_table();

This code is executed for the reported query. The code prohibits using sorting by the first non constant joined table and, as a result, it requires creation of a temporary table containing all rows from the result set. Sorting the records from the temporary table produces the necessary ordering.

Let's see what's going on when executing the reported query with the current code. The call of the function get_sort_by_table() set JOIN::sort_by_table to 0. As a result the above call join->get_sort_by_join_tab() returns 0 and the code join->simple_order= 0; is skipped. So the value of join->simple_order is not changed and it remains the same as it was set by the previously called function remove_const that does not use the value of join->sort_by_table. After the above 'if' statement the function setup_semijoin_dups_elimination() calls init_dups_weedout(). The latter sets JOIN_TAB::keep_current_rowid to TRUE for the third joined table, namely for the table t2. It also sets the field JOIN::flush_weedout_table for this table.
The execution plan chosen for the query requires a sequential scan of the table t2. At the beginning of this scan the function SJ_TMP_TABLE::sj_weedout_delete_rows() is called. This function deletes all rows from the temporary table used for weedout. The function is called only once.
Then the execution process tries to build rows of the result set. It starts from reading the first record from t2. Yet as the join was marked with the flag simple_order the needed fields of the records of t2 first are sorted and placed into a buffer. So actually the records of t2 are read from this buffer by the call-back function rr_unpack_from_buffer(). After a record of t2 has been read into the record buffer for t2 its rowid is saved into the temporary table used for weedout due to setting of keep_current_rowid for t2 to true. However rr_unpack_from_buffer() cannot read this rowid from the buffer because it has never been written there. So it reads 'something' that is interpreted as a rowid and this 'something' is always the same. That's why the result set contains only one row no matter how many rows contains t2.

Igor Babaev (Inactive) added a comment - 2018-01-12 06:26 - edited The code of get_sort_by_table() that checks that ordering can be done using fields of only one table while ((table= ti++) && !(map & table->table->map)) ; if (map != table->table->map) DBUG_RETURN(0); // More than one table was incorrect. The code uses the previously calculated value of 'map' containing table_map bits for all tables the order by items depend on. It includes bits for all const (single_row) tables that should be excluded before the the loop in the above code starts. map&= ~const_tables; while ((table= ti++) && !(map & table->table->map)) ; if (map != table->table->map) DBUG_RETURN(0); // More than one table The function get_sort_by_table() is called in make_join_statistics() to set the value of JOIN::sort_by_table. Later this value can be used by the function setup_semijoin_dups_elimination() in the following code. (The function JOIN::JOIN_TAB *get_sort_by_join_tab() returns 0 if sort_by_table set to 0.) /* Make sure that possible sorting of rows from the head table is not to be employed. */ if (join->get_sort_by_join_tab()) { join->simple_order= 0; join->simple_group= 0; join->need_tmp= join->test_if_need_tmp_table(); } This code is executed for the reported query. The code prohibits using sorting by the first non constant joined table and, as a result, it requires creation of a temporary table containing all rows from the result set. Sorting the records from the temporary table produces the necessary ordering. Let's see what's going on when executing the reported query with the current code. The call of the function get_sort_by_table() set JOIN::sort_by_table to 0. As a result the above call join->get_sort_by_join_tab() returns 0 and the code join->simple_order= 0; is skipped. So the value of join->simple_order is not changed and it remains the same as it was set by the previously called function remove_const that does not use the value of join->sort_by_table. After the above 'if' statement the function setup_semijoin_dups_elimination() calls init_dups_weedout(). The latter sets JOIN_TAB::keep_current_rowid to TRUE for the third joined table, namely for the table t2. It also sets the field JOIN::flush_weedout_table for this table. The execution plan chosen for the query requires a sequential scan of the table t2. At the beginning of this scan the function SJ_TMP_TABLE::sj_weedout_delete_rows() is called. This function deletes all rows from the temporary table used for weedout. The function is called only once. Then the execution process tries to build rows of the result set. It starts from reading the first record from t2. Yet as the join was marked with the flag simple_order the needed fields of the records of t2 first are sorted and placed into a buffer. So actually the records of t2 are read from this buffer by the call-back function rr_unpack_from_buffer(). After a record of t2 has been read into the record buffer for t2 its rowid is saved into the temporary table used for weedout due to setting of keep_current_rowid for t2 to true. However rr_unpack_from_buffer() cannot read this rowid from the buffer because it has never been written there. So it reads 'something' that is interpreted as a rowid and this 'something' is always the same. That's why the result set contains only one row no matter how many rows contains t2.

Igor Babaev (Inactive) made changes - 2018-01-12 06:42

Fix Version/s

10.0 [ 16000 ]

Igor Babaev (Inactive) made changes - 2018-01-12 06:43

Affects Version/s

10.1.30 [ 22637 ]

Igor Babaev (Inactive) made changes - 2018-01-12 08:23

Assignee	Igor Babaev [ igor ]	Sergei Petrunia [ psergey ]
Status	In Progress [ 3 ]	In Review [ 10002 ]

Igor Babaev (Inactive) made changes - 2018-01-12 20:00

Assignee

Sergei Petrunia [ psergey ]

Igor Babaev [ igor ]

Igor Babaev (Inactive) made changes - 2018-01-12 20:01

Assignee

Igor Babaev [ igor ]

Oleksandr Byelkin [ sanja ]

Oleksandr Byelkin added a comment - 2018-01-12 21:08

OK to push.

Oleksandr Byelkin added a comment - 2018-01-12 21:08 OK to push.

Oleksandr Byelkin made changes - 2018-01-12 21:08

Assignee	Oleksandr Byelkin [ sanja ]	Igor Babaev [ igor ]
Status	In Review [ 10002 ]	Stalled [ 10000 ]

Sergei Golubchik made changes - 2018-01-15 11:50

Component/s		Optimizer [ 10200 ]
Fix Version/s		5.5.59 [ 22612 ]
Fix Version/s	5.5 [ 15800 ]
Resolution		Fixed [ 1 ]
Status	Stalled [ 10000 ]	Closed [ 6 ]

Igor Babaev (Inactive) made changes - 2018-01-18 18:14

Resolution	Fixed [ 1 ]
Status	Closed [ 6 ]	Stalled [ 10000 ]

Igor Babaev (Inactive) added a comment - 2018-01-18 18:18

A fix for this bug was pushed into the 5.5 tree.

Igor Babaev (Inactive) added a comment - 2018-01-18 18:18 A fix for this bug was pushed into the 5.5 tree.

Igor Babaev (Inactive) made changes - 2018-01-18 18:18

Resolution		Fixed [ 1 ]
Status	Stalled [ 10000 ]	Closed [ 6 ]

Sergei Golubchik made changes - 2021-12-06 21:40

Workflow

MariaDB v3 [ 64799 ]

MariaDB v4 [ 148205 ]

People

Assignee:: Igor Babaev (Inactive)

Reporter:: Elena Stepanova

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2014-09-06 12:10

Updated:: 2018-01-18 18:18

Resolved:: 2018-01-18 18:18

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server

Details

Description

Attachments

Activity

People

Dates

Git Integration