Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.4.14, 10.4(EOL), 10.5
Description
When the number of conditions inside the "IN" clause matches the value of in_predicate_conversion_threshold , query behavior changes and returned result is incorrect for query.
Attaching file for case reproduction:
CREATE DATABASE subscriber_data;
mysql subscriber_data < subscriber_data.sql
Execute:
pager;
|
set in_predicate_conversion_threshold = 51; |
pager md5sum;
|
|
SELECT object, ind, data FROM `subscriber_data`.`dev_config` AS live JOIN (SELECT object, ind, MAX(collection_time) AS collection_time FROM `subscriber_data`.`dev_config` WHERE collection_time <= '2020-09-24' AND ind IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51) GROUP BY object, ind) AS latest USING (object, ind, collection_time) WHERE ind IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51); |
Result is
13 rows in set (0.039 sec)
Otherwise setting set in_predicate_conversion_threshold = 1000; (value bigger than number of clauses inside "IN" ):
pager;
|
set in_predicate_conversion_threshold = 1000; |
pager md5sum;
|
|
SELECT object, ind, data FROM `subscriber_data`.`dev_config` AS live JOIN (SELECT object, ind, MAX(collection_time) AS collection_time FROM `subscriber_data`.`dev_config` WHERE collection_time <= '2020-09-24' AND ind IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51) GROUP BY object, ind) AS latest USING (object, ind, collection_time) WHERE ind IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51); |
Returns
700 rows in set (0.038 sec)
The returned result when bug triggers (in_predicate_conversion_threshold = 51) is:
-------------------------------------------
object | ind | data |
-------------------------------------------
1 | 51 | 51 |
2 | 51 | 0 |
3 | 51 | us-east-1d |
4 | 51 | ebd81398742e75f2fafb5511e06720c3 |
5 | 51 | 0 |
6 | 51 | 0 |
7 | 51 | 0 |
8 | 51 | 9b5d0e93ea552fb8a86420e87410ae21 |
10 | 51 | NULL |
11 | 51 | 0 |
12 | 51 | 0 |
13 | 51 | 0 |
14 | 51 | 0 |
-------------------------------------------
13 rows in set (0.046 sec)
But if I reorder the values inside the query IN clause and set "1" at the end instead of 51, result is:
-------------------------------------------
object | ind | data |
-------------------------------------------
1 | 1 | 1 |
2 | 1 | 0 |
3 | 1 | test_by_vcug_device |
4 | 1 | 6dc8e300391a4730609c89d9bf5e7564 |
5 | 1 | 0 |
6 | 1 | NULL |
7 | 1 | 0 |
8 | 1 | 8d4ffb6e30134a25b5297975eb3c0155 |
10 | 1 | NULL |
11 | 1 | 0 |
12 | 1 | 0 |
13 | 1 | 0 |
14 | 1 | 0 |
-------------------------------------------
13 rows in set (0.040 sec)
It can be seen that on the group by "ind" , the result "ind" value displayed is only the one last used in the "IN" conditions, and there seems to is a "UNION ALL" missing to merge all results when in_predicate_conversion_threshold kicks in.
Starting from MariaDB version 11.0 the test case provided in my previous comment starts working correctly.
The plan for this test case differs from the plan in the earlier versions: the column of the transformed IN-predicate is not used in the lateral derived.
JOIN
(
USING (groupId, id)
| cnt |
| 6 |
EXPLAIN
JOIN
(
USING (groupId, id)