|
Let's set the system join_cache_level to 4 and create and populate tables t1,t2,t3,t4 with the following commands:
set join_cache_level=4;
|
CREATE TABLE t1 (a int) ENGINE=MyISAM;
|
INSERT INTO t1 VALUES (1),(2);
|
CREATE TABLE t2 (b int, c int) ENGINE=MyISAM;
|
INSERT INTO t2 VALUES (1,2),(2,4);
|
CREATE TABLE t3 (d int, KEY(d)) ENGINE=MyISAM;
|
INSERT INTO t3 VALUES (1),(2);
|
CREATE TABLE t4 (e int primary key) ENGINE=MyISAM;
|
INSERT INTO t4 VALUES (1),(2);
|
ANALYZE TABLE t1,t2,t3,t4;
|
After this we have:
MariaDB [test]> SELECT * FROM t1 LEFT JOIN ( ( t2 LEFT JOIN t3 ON c = d ) JOIN t4 ) ON b = e;
|
+------+------+------+------+------+
|
| a | b | c | d | e |
|
+------+------+------+------+------+
|
| 1 | 1 | 2 | 2 | 1 |
|
| 2 | 1 | 2 | 2 | 1 |
|
| 1 | 2 | 4 | NULL | 2 |
|
| 2 | 2 | 4 | NULL | 2 |
|
+------+------+------+------+------+
|
and the result set is correct as we also have:
MariaDB [test]> SELECT * FROM t2 LEFT JOIN t3 ON c = d;
|
+------+------+------+
|
| b | c | d |
|
+------+------+------+
|
| 1 | 2 | 2 |
|
| 2 | 4 | NULL |
|
+------+------+------+
|
|
MariaDB [test]> SELECT * FROM (t2 LEFT JOIN t3 ON c = d) JOIN t4;
|
+------+------+------+---+
|
| b | c | d | e |
|
+------+------+------+---+
|
| 1 | 2 | 2 | 1 |
|
| 2 | 4 | NULL | 1 |
|
| 1 | 2 | 2 | 2 |
|
| 2 | 4 | NULL | 2 |
|
+------+------+------+---+
|
Yet here we have a wrong result:
MariaDB [test]> SELECT * FROM t1 LEFT JOIN ( ( t2 LEFT JOIN t3 ON c = d ) JOIN t4 ) ON b = e
|
-> WHERE e IS NULL;
|
+------+------+------+------+------+
|
| a | b | c | d | e |
|
+------+------+------+------+------+
|
| 1 | NULL | NULL | NULL | NULL |
|
| 2 | NULL | NULL | NULL | NULL |
|
+------+------+------+------+------+
|
This result is wrong as the query without the condition (e IS NULL) returns rows without NULLs in column e and the column is not nullable.
Let's look at the EXPLAIN for the query returning a wrong result set:
MariaDB [test]> EXPLAIN
|
-> SELECT * FROM t1 LEFT JOIN ( ( t2 LEFT JOIN t3 ON c = d ) JOIN t4 ) ON b = e
|
-> WHERE e IS NULL;
|
+------+-------------+-------+------------+---------------+-----------------------+---------+-----------+------+----------------------------------------------------------------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+-------+------------+---------------+-----------------------+---------+-----------+------+----------------------------------------------------------------------------------+
|
| 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 2 | |
|
| 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 2 | Using where; Using join buffer (flat, BNL join) |
|
| 1 | SIMPLE | t3 | hash_index | d | #hash#d:d | 5:5 | test.t2.c | 2 | Using where; Using index; Using join buffer (incremental, BNLH join) |
|
| 1 | SIMPLE | t4 | hash_index | PRIMARY | #hash#PRIMARY:PRIMARY | 4:4 | test.t2.b | 2 | Using where; Using index; Not exists; Using join buffer (incremental, BNLH join) |
|
+------+-------------+-------+------------+---------------+-----------------------+---------+-----------+------+----------------------------------------------------------------------------------+
|
For the query without where condition the EXPLAIN output is:
MariaDB [test]> EXPLAIN
|
-> SELECT * FROM t1 LEFT JOIN ( ( t2 LEFT JOIN t3 ON c = d ) JOIN t4 ) ON b = e
|
-> ;
|
+------+-------------+-------+------------+---------------+-----------------------+---------+-----------+------+----------------------------------------------------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+-------+------------+---------------+-----------------------+---------+-----------+------+----------------------------------------------------------------------+
|
| 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 2 | |
|
| 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 2 | Using where; Using join buffer (flat, BNL join) |
|
| 1 | SIMPLE | t3 | hash_index | d | #hash#d:d | 5:5 | test.t2.c | 2 | Using where; Using index; Using join buffer (incremental, BNLH join) |
|
| 1 | SIMPLE | t4 | hash_index | PRIMARY | #hash#PRIMARY:PRIMARY | 4:4 | test.t2.b | 2 | Using index; Using join buffer (incremental, BNLH join) |
|
+------+-------------+-------+------------+---------------+-----------------------+---------+-----------+------+----------------------------------------------------------------------+
|
The execution plans for both queries are similar. The only differences are in the forth lines.
|
|
Let's see in debugger how we get a wrong result.
1. After having processed the first table t1 we have in the join buffer attached to the second table t2 the rows:
Ecah row is supplied with a match flag because t1 is the left operand of a left join.
The we try to find matches from t2:
JOIN_CACHE::join_records() is called
|
JOIN_CACHE::join_matching_records() is called
|
JOIN_TAB_SCAN::open() is called to prepare reading from t2
|
The first record from t2 (1,2) is read
|
A scan of the rows in the the join buffer C1 starts
|
JOIN_CACHE::generate_full_extensions() is called to produce partial join (1), (1,2)
|
sub_select() is called that puts (1,2) linked to (1) inntto the second join buffer C2 attached to table t3
|
JOIN_CACHE::generate_full_extensions() is called to produce partial join (2), (1,2)
|
sub_select() is called that puts (1,2) linked to 2 to the second join buffer C2 attached to table t3
|
The second record from t2 (2,4) is read
|
A new scan of the rows in the the join buffer C1 starts
|
JOIN_CACHE::generate_full_extensions() is called to produce partial join (2), (2,4)
|
sub_select() is called that puts (2,4) linked to (2) to the second join buffer C2 attached to table t3.
|
JOIN_CACHE::generate_full_extensions() is called to produce partial join (2), (2,4)
|
sub_select() is called that puts (2,4) linked to (2) to the second join buffer C2 attached to table t3.
|
The join cache attached to table 3 is BNLH join cache so each row put in the buffer is supplied with hash key.
Now we have in join buffers C1 and C2
C2
|
/------ F (1,2)
|
C1 F (1) /---------F (1,2)
|
F (2) \---------F (2,4)
|
\-------F (2,4)
|
Here rows in C2 are also supplied with match flags as t2 is the left operand of an embedded left join.
2. Now we try to find matches from t3 for the rows in C2. (The beginning of each partial join whose end is in C2 is placed in C1).
JOIN_CACHE::join_records() is called
|
JOIN_CACHE::join_matching_records() is called
|
JOIN_TAB_SCAN::open() is called to prepare reading from t3
|
The first record from t3 (1) is read
|
JOIN_CACHE_BNLH::prepare_look_for_matches is called : no keys corresponding d=1 found in column c
|
The second record from t3 (2) is read
|
JOIN_CACHE_BNLH::prepare_look_for_matches is called: a key corresponding d=2 found in column c
|
Read the first row with such key (d=2) from C2
|
JOIN_CACHE::generate_full_extensions() is called to extent partial join (1),(1,2) by (2)
|
Set the match flag for the first record (1,2) in C2
|
sub_select_cache is called() to put (2) into the join buffer C3 attached to table t4
|
(2) is included in the hash chain created for key b=1
|
Read the next row with key d=2 from C2
|
JOIN_CACHE::generate_full_extensions() is called to extent partial join (1),(1,2) by (2)
|
Set the match flag for the second record (1,2) in C2
|
sub_select_cache is called() to put (2) into the join buffer C3 attached to table t4
|
(2) is included in the hash chain created for key b=1
|
JOIN_CACHE::join_null_complements() is called
|
A scan of the rows in the the join buffer C2 starts
|
Two first rows are skipped as there are matches for them
|
JOIN_CACHE::generate_full_extensions() for the third to extent partial join by NULL
|
sub_select_cache() is called to put the extent into join cache C3 with a hash key b=2
|
JOIN_CACHE::generate_full_extensions() for the fourth to extent partial join by NULL
|
sub_select_cache() is called to put the extent into join cache C3 with a hash key b=2
|
Now we have in join buffers C1, C2, C3
C2 C3
|
/------ T (1,2) ----- (2)
|
C1 F (1) /---------T (1,2) ------ (2)
|
F (2) \---------T (2,4) ----- (NULL)
|
\-------T (2,4) ----- (NULL)
|
3. Now we try to find matches from t4 for the rows in C3
JOIN_CACHE::join_records() is called
|
JOIN_CACHE::join_matching_records() is called
|
JOIN_TAB_SCAN::open is called is called to prepare reading from t4
|
The first record from t4 (1) is read
|
a key for b=1 is found C3
|
The first candidate for match from C3 is read F (1) -----T (1,2) ----- (2)
|
JOIN_CACHE_BNLH::skip_next_candidate_for_match() is called
|
JOIN_TAB::check_only_first_match() is called for t4 and returns true
|
JOIN_CACHE::get_match_flag_by_pos() is called for C3
|
JOIN_CACHE::get_match_flag_by_pos() is called for C2 and returns true
|
---- ^ here we have a problem because the match flag for the outer left join is in C1 and it's still FALSE
|
So we skipped the candidate from C3
|
The next candidate for match from C3 is read F (1) -----T (1,2) ----- (2)
|
It's skipped for the same reason.
|
No more candidates for the extension (1)
|
The second record from t4 (2) is read
|
a key for b=2 is found C3
|
Both candidates for a match are skipped for the same reason as the previous two
|
We return to the cache C1 and generated NULL complemented rows for outer left join
|