[MDEV-31303] Key not used when IN clause has both signed and unsigned values Created: 2023-05-18  Updated: 2023-08-29  Resolved: 2023-08-29

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.3.38, 10.4.28, 10.6.13
Fix Version/s: 10.5.23, 10.6.16, 10.10.7, 10.11.6, 11.0.4, 11.1.3, 11.2.2

Type: Bug Priority: Major
Reporter: Iliya Polihronov Assignee: Vicențiu Ciorbaru
Resolution: Fixed Votes: 0
Labels: None
Environment:

Debian



 Description   

When using IN() operator, if the provided values are both within the 1st half and the 2nd half of the column size ( < 9223372036854775808 and >= 9223372036854775808 for unsigned bigint), the index is not used.

This behavior has changed from the 10.1.* versions where the index was correctly used.

SQL to reproduce:

CREATE TABLE `a` (
  `id` bigint AUTO_INCREMENT PRIMARY KEY,
  `c1` bigint unsigned,
  KEY (`c1`)
) ENGINE=InnoDB;
 
INSERT INTO `a` VALUES (1,9223382399205928659),(2,9223384207280813348),(3,9223385953115437234),(4,9223387250780556749),(5,9223387354282558788),(6,9223387603870501596),(7,9223389270813433667),(8,9223389903231468827),(9,9223390280789586779),(10,9223391591398222899),(11,9223391875473564350),(12,9223393152250049433),(13,9223393939696790223),(14,9223394417225350415),(15,9223397646397141015),(16,9223398025879291243),(17,9223399038671098072),(18,9223399534968874556),(19,9223400449518009285),(20,9223400860292643549),(21,9223400940692256924),(22,9223401073791948119),(23,9223402820804649616),(24,9223403470951992681),(25,9223405581879567267),(26,9223405754978563829),(27,9223405972966828221);
 
MariaDB [test]> explain SELECT c1 FROM a WHERE c1 IN ( 1, 9223372036854775807 );
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
|    1 | SIMPLE      | a     | range | c1            | c1   | 9       | NULL |    2 | Using where; Using index |
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
 
 
MariaDB [test]> explain SELECT c1 FROM a WHERE c1 IN ( 1, 9223372036854775808 );
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
|    1 | SIMPLE      | a     | index | c1            | c1   | 9       | NULL |   27 | Using where; Using index |
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+



 Comments   
Comment by Vicențiu Ciorbaru [ 2023-06-06 ]

Preliminary investigation in this:

This is related to the type system. The fix will require a bit more analysis, however I have a workaround:

Cast all values in the IN clause as UNSIGNED.

explain SELECT c1 FROM a WHERE c1 IN ( 1, 9223372036854775808 );
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	a	index	c1	c1	9	NULL	27	Using where; Using index
explain SELECT c1 FROM a WHERE c1 IN ( CAST(1 AS UNSIGNED), CAST(9223372036854775808 AS UNSIGNED));
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	a	range	c1	c1	9	NULL	2	Using where; Using index

This will make the optimizer know that the types are compatible. I'll work on a proper fix for this.

Comment by Vicențiu Ciorbaru [ 2023-08-28 ]

Hi bar!

Please review this patch that fixes a "regression" introduced by MDEV-11497.
There isn't necessarily a problem with MDEV-11497, however the changes there introduced some limitations to the optimizer, that were previously not present.

Ideally, I'd like to push this to 10.4, so it fixes the user's problem, but please review this in the context of 11.3 for now and if you think that it is safe enough, we can push this to older versions.

I have tried various approaches to make this work and this is the cleanest I could come up with, without taking apart the multiple-inheritance that Item_func_in has.

I've explained the bugfix in the commit message, as well as code comments. Let me know if it is clear enough.

https://github.com/MariaDB/server/pull/2740

Comment by Alexander Barkov [ 2023-08-29 ]

Hi cvicentiu,

Please review a modified version of the patch implementing (as discussed on slack) it in a slightly different way:

  • There is no a need to aggregate for comparison all (predicant,valueN) pairs again to detect a mix of integers with different signess. We can just analyze the state of Predicant_to_list_comparator after} the {{prepare_predicant_and_values() call. Then we can iterate through all individual arguments to check if they are all integers.
  • The class in_longlong supports a mixture of integers of different signess. There is no a need to use in_decimal.

https://github.com/MariaDB/server/commit/3096cdb57b3c5b3c0c38e4538186f5c6f5fc4d03

The patch is good starting from 10.5. For 10.4 it would need some additional changes, let's skip 10.4.

Thanks.

Comment by Vicențiu Ciorbaru [ 2023-08-29 ]

Hi bar.

Patch looks good to me. Buildbot should clear it first of course, but if no other tests fail, this fixes the issue.

Generated at Thu Feb 08 10:22:49 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.