I have two tables Person and Message and the latter has a foreign key to the former. Each table has id as the primary key column, and the Person table also has a column personId which is (uniquely) indexed.
The query below should take advantage of the personId key index, but instead MySQL requires scanning the entire Message table for some reason:
(This is because LEFT JOIN needs to find all records in `m`, regardless
of whether they have a match in `p`).
When we use that join order, the ON expression allows to access table `p`
using eq_ref access method (that is, a lookup on a primary key).
However, the condition `p`.`personId` = 'M002649397' is not useful.
If we take a careful look at the WHERE clause, we can see that is only
satisfied when
`p`.`personId` = 'M002649397'
This means that we're not interested in results of LEFT JOIN that have a
NULL-complemented row for table `p`. In other words, we can replace the
LEFT JOIN with INNER JOIN.
Once we have INNER JOIN, we can use both join orders:
m, p
p, m
When we use the join order "p, m" (like the second EXPLAIN does), we can
use condition to limit the number of rows we get from table `p`.
The problem here seems to be that the optimizer is unable to convert left
join into inner join.
When the WHERE clause is just
`p`.`personId` = 'M002649397'
then conversion works (check out EXPLAIN EXTENDED - it shows "JOIN")
When the WHERE clause is
'M002649397' IS NULL OR `p`.`personId` = 'M002649397';
then conversion doesn't work.
Sergei Petrunia
added a comment - The query has form
m LEFT JOIN p ON ...
LEFT JOIN allows only one join order:
m, p.
(This is because LEFT JOIN needs to find all records in `m`, regardless
of whether they have a match in `p`).
When we use that join order, the ON expression allows to access table `p`
using eq_ref access method (that is, a lookup on a primary key).
However, the condition `p`.`personId` = 'M002649397' is not useful.
If we take a careful look at the WHERE clause, we can see that is only
satisfied when
`p`.`personId` = 'M002649397'
This means that we're not interested in results of LEFT JOIN that have a
NULL-complemented row for table `p`. In other words, we can replace the
LEFT JOIN with INNER JOIN.
Once we have INNER JOIN, we can use both join orders:
m, p
p, m
When we use the join order "p, m" (like the second EXPLAIN does), we can
use condition to limit the number of rows we get from table `p`.
The problem here seems to be that the optimizer is unable to convert left
join into inner join.
When the WHERE clause is just
`p`.`personId` = 'M002649397'
then conversion works (check out EXPLAIN EXTENDED - it shows "JOIN")
When the WHERE clause is
'M002649397' IS NULL OR `p`.`personId` = 'M002649397';
then conversion doesn't work.
MariaDB [j9]> EXPLAIN SELECT `m`.*
-> FROM
-> `Message` AS `m`
-> LEFT JOIN
-> `Person` AS `p` ON (`m`.`person` = `p`.`id`)
-> WHERE
-> 'M002649397' IS NULL OR
-> `p`.`personId` = 'M002649397';
---------------------------------------------------------------------------------+
How to fill the test dataset:
insert into Person (personID, lastName) select concat('id-', a), concat('id-', a) from test.one_k;
insert into Message (person) select a from test.one_k where a>0;
insert into Message (person) select a from test.one_k where a>0;
Sergei Petrunia
added a comment - How to fill the test dataset:
insert into Person (personID, lastName) select concat('id-', a), concat('id-', a) from test.one_k;
insert into Message (person) select a from test.one_k where a>0;
insert into Message (person) select a from test.one_k where a>0;
One may ask, doesn't MySQL (or MariaDB) optimizer has module that removes parts of WHERE condition that are known to be false. Yes, it does. The problem is that outer-to-inner join conversion step is run before the constant-condition-removal module is run (there are reasons for this).
Sergei Petrunia
added a comment - One may ask, doesn't MySQL (or MariaDB) optimizer has module that removes parts of WHERE condition that are known to be false. Yes, it does. The problem is that outer-to-inner join conversion step is run before the constant-condition-removal module is run (there are reasons for this).
(gdb) set $first= ((Item_cond_or*)conds)>argument_list()>head()
(gdb) p $first->basic_const_item()
$19 = false
(gdb) p $first->used_tables()
$20 = 0
(gdb) p $first->is_expensive()
$21 = false
Sergei Petrunia
added a comment - (gdb) set $first= ((Item_cond_or*)conds) >argument_list() >head()
(gdb) p $first->basic_const_item()
$19 = false
(gdb) p $first->used_tables()
$20 = 0
(gdb) p $first->is_expensive()
$21 = false
from having item->not_null_tables() put into not_null_tables_cache.
Sergei Petrunia
added a comment - Maybe, we could remove items that have
item->const_item() && !item->is_expensive() && item->val_int() == 0
from having item->not_null_tables() put into not_null_tables_cache.
The query has form
m LEFT JOIN p ON ...
LEFT JOIN allows only one join order:
m, p.
(This is because LEFT JOIN needs to find all records in `m`, regardless
of whether they have a match in `p`).
When we use that join order, the ON expression allows to access table `p`
using eq_ref access method (that is, a lookup on a primary key).
However, the condition `p`.`personId` = 'M002649397' is not useful.
If we take a careful look at the WHERE clause, we can see that is only
satisfied when
`p`.`personId` = 'M002649397'
This means that we're not interested in results of LEFT JOIN that have a
NULL-complemented row for table `p`. In other words, we can replace the
LEFT JOIN with INNER JOIN.
Once we have INNER JOIN, we can use both join orders:
When we use the join order "p, m" (like the second EXPLAIN does), we can
to limit the number of rows we get from table `p`.
use condition
The problem here seems to be that the optimizer is unable to convert left
join into inner join.
When the WHERE clause is just
`p`.`personId` = 'M002649397'
then conversion works (check out EXPLAIN EXTENDED - it shows "JOIN")
When the WHERE clause is
'M002649397' IS NULL OR `p`.`personId` = 'M002649397';
then conversion doesn't work.