Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.4.14, 10.4(EOL), 10.5
Description
When the number of conditions inside the "IN" clause matches the value of in_predicate_conversion_threshold , query behavior changes and returned result is incorrect for query.
Attaching file for case reproduction:
CREATE DATABASE subscriber_data;
mysql subscriber_data < subscriber_data.sql
Execute:
pager;
|
set in_predicate_conversion_threshold = 51; |
pager md5sum;
|
 |
SELECT object, ind, data FROM `subscriber_data`.`dev_config` AS live JOIN (SELECT object, ind, MAX(collection_time) AS collection_time FROM `subscriber_data`.`dev_config` WHERE collection_time <= '2020-09-24' AND ind IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51) GROUP BY object, ind) AS latest USING (object, ind, collection_time) WHERE ind IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51); |
Result is
13 rows in set (0.039 sec)
Otherwise setting set in_predicate_conversion_threshold = 1000; (value bigger than number of clauses inside "IN" ):
pager;
|
set in_predicate_conversion_threshold = 1000; |
pager md5sum;
|
 |
SELECT object, ind, data FROM `subscriber_data`.`dev_config` AS live JOIN (SELECT object, ind, MAX(collection_time) AS collection_time FROM `subscriber_data`.`dev_config` WHERE collection_time <= '2020-09-24' AND ind IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51) GROUP BY object, ind) AS latest USING (object, ind, collection_time) WHERE ind IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51); |
Returns
700 rows in set (0.038 sec)
The returned result when bug triggers (in_predicate_conversion_threshold = 51) is:
-------------------------------------------
object | ind | data |
-------------------------------------------
1 | 51 | 51 |
2 | 51 | 0 |
3 | 51 | us-east-1d |
4 | 51 | ebd81398742e75f2fafb5511e06720c3 |
5 | 51 | 0 |
6 | 51 | 0 |
7 | 51 | 0 |
8 | 51 | 9b5d0e93ea552fb8a86420e87410ae21 |
10 | 51 | NULL |
11 | 51 | 0 |
12 | 51 | 0 |
13 | 51 | 0 |
14 | 51 | 0 |
-------------------------------------------
13 rows in set (0.046 sec)
But if I reorder the values inside the query IN clause and set "1" at the end instead of 51, result is:
-------------------------------------------
object | ind | data |
-------------------------------------------
1 | 1 | 1 |
2 | 1 | 0 |
3 | 1 | test_by_vcug_device |
4 | 1 | 6dc8e300391a4730609c89d9bf5e7564 |
5 | 1 | 0 |
6 | 1 | NULL |
7 | 1 | 0 |
8 | 1 | 8d4ffb6e30134a25b5297975eb3c0155 |
10 | 1 | NULL |
11 | 1 | 0 |
12 | 1 | 0 |
13 | 1 | 0 |
14 | 1 | 0 |
-------------------------------------------
13 rows in set (0.040 sec)
It can be seen that on the group by "ind" , the result "ind" value displayed is only the one last used in the "IN" conditions, and there seems to is a "UNION ALL" missing to merge all results when in_predicate_conversion_threshold kicks in.