Details
-
Task
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
None
-
None
Description
Here is a test case demonstrating the issue
h3 Query 1:
CREATE TABLE t1(a VARCHAR(50) collate utf8_general_ci, b INT); |
INSERT INTO t1 VALUES ('abc',1), ('def', 2), ('ghi', 3), ('jkl', 4), ('mno', 5); |
|
CREATE TABLE t2(a VARCHAR(50) collate utf8mb4_general_ci, b INT); |
INSERT INTO t2 VALUES ('abc',1), ('def', 2), ('ghi', 3), ('jkl', 4), ('mno', 5); |
MariaDB [test]> EXPLAIN SELECT * FROM t1 WHERE t1.a IN (SELECT t2.a FROM t2);
|
+------+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------------------+
|
| 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 5 | |
|
| 1 | PRIMARY | t2 | ALL | NULL | NULL | NULL | NULL | 5 | Using where; FirstMatch(t1); Using join buffer (flat, BNL join) |
|
+------+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------------------+
|
2 rows in set (0.004 sec)
|
|
So here we don't use materialization strategy, lets look in the optimizer trace why this happens:
"transformation": {
|
"select_id": 2,
|
"from": "IN (SELECT)",
|
"to": "materialization",
|
"possible": false,
|
"cause": "types mismatch"
|
}
|
So the reason why materialization does not happen is because there is a type mistmatch for the IN comparison. The column
t2.a is utf8mb4_general_ci and t1.a us utf8_general_ci. So we could allow materilazation when their is a mistmatch of datatypes.
IN predicate to IN subquery optimization would benefit from this too.
IN-predicate to IN subquery conversion
MariaDB [test]> create table t1(a int); |
Query OK, 0 rows affected (0.047 sec) |
|
MariaDB [test]> insert into t1 select seq from seq_1_to_100; |
Query OK, 100 rows affected (0.025 sec) |
|
MariaDB [test]> set in_predicate_conversion_threshold=9; |
Query OK, 0 rows affected (0.001 sec) |
|
Query 2:
select * from t1 where a in ('0','1','2','3','4','5','6','7','8','9')
the column a has different datatype as the elements in the IN list.
MariaDB [test]> explain select * from t1 where a in ('0','1','2','3','4','5','6','7','8','9');
|
+------+-------------+-------+------+---------------+------+---------+------+------+-------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+-------+------+---------------+------+---------+------+------+-------------+
|
| 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 100 | Using where |
|
+------+-------------+-------+------+---------------+------+---------+------+------+-------------+
|
1 row in set (0.002 sec)
|
For query 2 it would be good if we allow materialization strategy and create the materialized table fields with the collations
used to the comparison of the IN predicate or IN subquery.This way, current collation-based limitations will be lifted.
With this task we could allow materialization strategy for further cases.
Attachments
Issue Links
- relates to
-
MDEV-21265 IN predicate conversion to IN subquery should be allowed for a broader set of datatype comparison
- Closed