So after all the investigation we find the bug that we convert a subquery into a non-mergeable semi-join even when we should not.
Conversion of an IN subquery predicate into semi-join
is blocked now if the predicate occurs:
- in the ON expression of an outer join
- in the ON expression of an inner join embedded directly
or indirectly in the inner nest of an outer join
Currently in the function find_and_block_conversion_to_sj(), we block the conversion of a subquery to a semi-join when we have an IN subquery predicated in the on_expr of an outer join
But this does not handle the case when the IN subquery predicate is wrapped inside an Item_in_optimizer object.
So we need to disable the conversion even if we have an Item_in_optimizer object inside the on_expr of an outer join.
Varun Gupta (Inactive)
added a comment - So after all the investigation we find the bug that we convert a subquery into a non-mergeable semi-join even when we should not.
Conversion of an IN subquery predicate into semi-join
is blocked now if the predicate occurs:
- in the ON expression of an outer join
- in the ON expression of an inner join embedded directly
or indirectly in the inner nest of an outer join
Currently in the function find_and_block_conversion_to_sj(), we block the conversion of a subquery to a semi-join when we have an IN subquery predicated in the on_expr of an outer join
if (to_find->type() != Item::SUBSELECT_ITEM ||
((Item_subselect *) to_find)->substype() != Item_subselect::IN_SUBS)
But this does not handle the case when the IN subquery predicate is wrapped inside an Item_in_optimizer object.
So we need to disable the conversion even if we have an Item_in_optimizer object inside the on_expr of an outer join.
I created MDEV-15523, with that case I have the same (wrong) result when I do `set optimizer_switch='materialization=off'`
Was it expected that doing this fixed the issue? If that is the case then my bug is something different.
Sander Hoentjen
added a comment - - edited I created MDEV-15523 , with that case I have the same (wrong) result when I do `set optimizer_switch='materialization=off'`
Was it expected that doing this fixed the issue? If that is the case then my bug is something different.
The result depends on the value of optimizer_switch, but it should not, it is a bug and setting value to 'materalization=off' is only a temporary solution, only until the bug will be fixed. I checked testcase from MDEV-15523 with the default values of variables, it is the same bug as MDEV-14779, but maybe you have other switches, that also affected the result.
MariaDB [test]> CREATE TABLE `t1` (
-> `id` int(11) NOT NULL AUTO_INCREMENT,
-> `works` VARCHAR(1) DEFAULT NULL,
-> PRIMARY KEY (`id`)
-> );
Query OK, 0 rows affected (0.08 sec)
MariaDB [test]> INSERT INTO `t1` VALUES (1,'y'),(2,'n');
Query OK, 2 rows affected (0.01 sec)
Records: 2 Duplicates: 0 Warnings: 0
MariaDB [test]> DROP TABLE IF EXISTS `t2`;
Query OK, 0 rows affected (0.06 sec)
MariaDB [test]> CREATE TABLE `t2` (
-> `id` int(11) NOT NULL AUTO_INCREMENT,
-> `t1_id` int(11) DEFAULT NULL,
-> PRIMARY KEY (`id`)
-> );
Query OK, 0 rows affected (0.07 sec)
MariaDB [test]> INSERT INTO `t2` VALUES (1,1),(2,1),(3,3),(4,3);
Query OK, 4 rows affected (0.01 sec)
Records: 4 Duplicates: 0 Warnings: 0
MariaDB [test]> SELECT * FROM t1 LEFT OUTER JOIN (SELECT t1_id FROM t2 WHERE id IN (SELECT MAX(id) FROM t2 GROUP BY t1_id)) tmpname on tmpname.t1_id = t1.id;
+----+-------+-------+
| id | works | t1_id |
+----+-------+-------+
| 1 | y | 1 |
+----+-------+-------+
1 row in set (0.02 sec)
MariaDB [test]> set optimizer_switch='materialization=off';
Query OK, 0 rows affected (0.00 sec)
MariaDB [test]> SELECT * FROM t1 LEFT OUTER JOIN (SELECT t1_id FROM t2 WHERE id IN (SELECT MAX(id) FROM t2 GROUP BY t1_id)) tmpname on tmpname.t1_id = t1.id;
Alice Sherepa
added a comment - The result depends on the value of optimizer_switch, but it should not, it is a bug and setting value to 'materalization=off' is only a temporary solution, only until the bug will be fixed. I checked testcase from MDEV-15523 with the default values of variables, it is the same bug as MDEV-14779 , but maybe you have other switches, that also affected the result.
MariaDB [test]> CREATE TABLE `t1` (
-> `id` int(11) NOT NULL AUTO_INCREMENT,
-> `works` VARCHAR(1) DEFAULT NULL,
-> PRIMARY KEY (`id`)
-> );
Query OK, 0 rows affected (0.08 sec)
MariaDB [test]> INSERT INTO `t1` VALUES (1,'y'),(2,'n');
Query OK, 2 rows affected (0.01 sec)
Records: 2 Duplicates: 0 Warnings: 0
MariaDB [test]> DROP TABLE IF EXISTS `t2`;
Query OK, 0 rows affected (0.06 sec)
MariaDB [test]> CREATE TABLE `t2` (
-> `id` int(11) NOT NULL AUTO_INCREMENT,
-> `t1_id` int(11) DEFAULT NULL,
-> PRIMARY KEY (`id`)
-> );
Query OK, 0 rows affected (0.07 sec)
MariaDB [test]> INSERT INTO `t2` VALUES (1,1),(2,1),(3,3),(4,3);
Query OK, 4 rows affected (0.01 sec)
Records: 4 Duplicates: 0 Warnings: 0
MariaDB [test]> SELECT * FROM t1 LEFT OUTER JOIN (SELECT t1_id FROM t2 WHERE id IN (SELECT MAX(id) FROM t2 GROUP BY t1_id)) tmpname on tmpname.t1_id = t1.id;
+----+-------+-------+
| id | works | t1_id |
+----+-------+-------+
| 1 | y | 1 |
+----+-------+-------+
1 row in set (0.02 sec)
MariaDB [test]> set optimizer_switch='materialization=off';
Query OK, 0 rows affected (0.00 sec)
MariaDB [test]> SELECT * FROM t1 LEFT OUTER JOIN (SELECT t1_id FROM t2 WHERE id IN (SELECT MAX(id) FROM t2 GROUP BY t1_id)) tmpname on tmpname.t1_id = t1.id;
+----+-------+-------+
| id | works | t1_id |
+----+-------+-------+
| 1 | y | 1 |
| 2 | n | NULL |
+----+-------+-------+
2 rows in set (0.00 sec)
MariaDB [test]> show variables like '%switch%'\G
*************************** 1. row ***************************
Variable_name: optimizer_switch
Value: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=off,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on
1 row in set (0.00 sec)
So after all the investigation we find the bug that we convert a subquery into a non-mergeable semi-join even when we should not.
Conversion of an IN subquery predicate into semi-join
is blocked now if the predicate occurs:
- in the ON expression of an outer join
- in the ON expression of an inner join embedded directly
or indirectly in the inner nest of an outer join
Currently in the function find_and_block_conversion_to_sj(), we block the conversion of a subquery to a semi-join when we have an IN subquery predicated in the on_expr of an outer join
((Item_subselect *) to_find)->substype() != Item_subselect::IN_SUBS)
But this does not handle the case when the IN subquery predicate is wrapped inside an Item_in_optimizer object.
So we need to disable the conversion even if we have an Item_in_optimizer object inside the on_expr of an outer join.