Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-18898

SELECT using wrong index when using operator IN with mixed types

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Duplicate
    • 10.1.37, 10.1.38, 5.5(EOL), 10.0(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL)
    • N/A
    • Experienced on CentOS 7 (centos-release-7-5.1804.5.el7.centos.x86_64) and able to reproduce also on WSL Ubuntu 18.04 under Win10 Insiders build 18351

    Description

      Hello good people,

      I accidentaly stumbled upon a weird behavior of SELECT query optimizer, which doesn't seem to handle values of mixed types very well when using operator IN in WHERE condition. This ultimately results in suboptimal usage of indexes, even though sufficient information is IMHO available for handling the query better (i.e. using correct index).

      I was able to come up with MVCE of this sort:

      CREATE TEMPORARY TABLE __t1 (
        id int(10) unsigned NOT NULL AUTO_INCREMENT,
        name varchar(100) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
        PRIMARY KEY (`id`),
        UNIQUE KEY `name` (`name`)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
      INSERT INTO __t1 SELECT seq, MD5(seq) FROM seq_1_to_500000;
       
      ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, 2);
      ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', 2); -- Uses wrong index
      ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, '2'); -- Uses wrong index
      ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', '2');
      ANALYZE SELECT id, name FROM __t1 WHERE id = 1;
      ANALYZE SELECT id, name FROM __t1 WHERE id = '2';
      ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = 2;
      ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = '2';
      ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = '2';
      ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = 2;
      

      Resulting in:

      > ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, 2);
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      | id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | r_rows | filtered | r_filtered | Extra       |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      |    1 | SIMPLE      | __t1  | range | PRIMARY       | PRIMARY | 4       | NULL |    2 |   2.00 |   100.00 |     100.00 | Using where |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      1 row in set (0.00 sec)
       
      > ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', 2); -- Wrong index
      +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows   | r_rows     | filtered | r_filtered | Extra                    |
      +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
      |    1 | SIMPLE      | __t1  | index | PRIMARY       | name | 302     | NULL | 996723 | 1000000.00 |   100.00 |       0.00 | Using where; Using index |
      +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
      1 row in set (0.21 sec)
       
      > ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, '2'); -- Wrong index
      +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows   | r_rows     | filtered | r_filtered | Extra                    |
      +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
      |    1 | SIMPLE      | __t1  | index | PRIMARY       | name | 302     | NULL | 996723 | 1000000.00 |   100.00 |       0.00 | Using where; Using index |
      +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
      1 row in set (0.22 sec)
       
      > ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', '2');
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      | id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | r_rows | filtered | r_filtered | Extra       |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      |    1 | SIMPLE      | __t1  | range | PRIMARY       | PRIMARY | 4       | NULL |    2 |   2.00 |   100.00 |     100.00 | Using where |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      1 row in set (0.00 sec)
       
      > ANALYZE SELECT id, name FROM __t1 WHERE id = 1;
      +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
      | id   | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | r_rows | filtered | r_filtered | Extra |
      +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
      |    1 | SIMPLE      | __t1  | const | PRIMARY       | PRIMARY | 4       | const |    1 |   NULL |   100.00 |       NULL |       |
      +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
      1 row in set (0.00 sec)
       
      > ANALYZE SELECT id, name FROM __t1 WHERE id = '2';
      +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
      | id   | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | r_rows | filtered | r_filtered | Extra |
      +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
      |    1 | SIMPLE      | __t1  | const | PRIMARY       | PRIMARY | 4       | const |    1 |   NULL |   100.00 |       NULL |       |
      +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
      1 row in set (0.00 sec)
       
      > ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = 2;
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      | id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | r_rows | filtered | r_filtered | Extra       |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      |    1 | SIMPLE      | __t1  | range | PRIMARY       | PRIMARY | 4       | NULL |    2 |   2.00 |   100.00 |     100.00 | Using where |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      1 row in set (0.00 sec)
       
      > ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = '2';
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      | id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | r_rows | filtered | r_filtered | Extra       |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      |    1 | SIMPLE      | __t1  | range | PRIMARY       | PRIMARY | 4       | NULL |    2 |   2.00 |   100.00 |     100.00 | Using where |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      1 row in set (0.00 sec)
       
      > ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = '2';
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      | id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | r_rows | filtered | r_filtered | Extra       |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      |    1 | SIMPLE      | __t1  | range | PRIMARY       | PRIMARY | 4       | NULL |    2 |   2.00 |   100.00 |     100.00 | Using where |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      1 row in set (0.00 sec)
       
      > ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = 2;
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      | id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | r_rows | filtered | r_filtered | Extra       |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      |    1 | SIMPLE      | __t1  | range | PRIMARY       | PRIMARY | 4       | NULL |    2 |   2.00 |   100.00 |     100.00 | Using where |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
      1 row in set (0.00 sec)
      

      As you can see the second and third SELECT queries employ wrong index and thus both result in (unnecessary?) full scan, even though it is clear that the condition is based solely on the id column, which acts as a primary key.

      SELECT queries with mixed types (integer and string) inside IN condition are the ones affected by this suboptimal index usage.

      SELECTs with values of a single common type use correct indexes - even if they're not matching the id column's type (integer), e.g. when values inside IN are numeric strings. These queries are handled well.

      SELECTs with values of mixed types used with the OR operator instead (the last four queries) also use correct the index. These queries are handled well, too.

      The performance difference there is not trivial - and it becomes worse with bigger tables.

      One might argue that one should just always use the correct types of values with these conditions, but there are environments in the wild that don't really care about differences between integers and numeric strings - one prime example being PHP. These can (and will) pass these values in unexpected and inconsistent fashion. In fact this is how I found out about this.

      (I was unable test this with different versions of MariaDB than those mentioned above. Also I apologize in advance if I made any wrong assumptions about the MariaDB's internals.)

      Attachments

        Issue Links

          Activity

            smuuf Přemysl Karbula created issue -
            smuuf Přemysl Karbula made changes -
            Field Original Value New Value
            Description Hello good people,

            I accidentaly stumbled upon a weird (?) behavior of SELECT query optimizer, which *doesn't seem to handle values of mixed types very well when using operator IN* in WHERE condition. This ultimately results in suboptimal usage of indexes, even though sufficient information is IMHO available for handling the query better (i.e. using correct index).

            I was able to come up with MVCE of this sort:

            {code:sql}
            -- Preparation
            DROP TABLE IF EXISTS __t1;
            CREATE TABLE __t1 (
              id int(10) unsigned NOT NULL AUTO_INCREMENT,
              name varchar(100) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
              PRIMARY KEY (`id`),
              UNIQUE KEY `name` (`name`)
            ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
            INSERT INTO __t1 SELECT seq, MD5(seq) FROM seq_1_to_500000;

            -- Preparation
            ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, 2);
            ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', 2); -- Uses wrong index
            ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, '2'); -- Uses wrong index
            ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', '2');
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1;
            ANALYZE SELECT id, name FROM __t1 WHERE id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = 2;
            ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = 2;
            {code}

            Resulting in:

            {noformat}
            > ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, 2);
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', 2); -- Wrong index
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | 1 | SIMPLE | __t1 | index | PRIMARY | name | 302 | NULL | 996723 | 1000000.00 | 100.00 | 0.00 | Using where; Using index |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            1 row in set (0.21 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, '2'); -- Wrong index
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | 1 | SIMPLE | __t1 | index | PRIMARY | name | 302 | NULL | 996723 | 1000000.00 | 100.00 | 0.00 | Using where; Using index |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            1 row in set (0.22 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', '2');
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1;
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | 1 | SIMPLE | __t1 | const | PRIMARY | PRIMARY | 4 | const | 1 | NULL | 100.00 | NULL | |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | 1 | SIMPLE | __t1 | const | PRIMARY | PRIMARY | 4 | const | 1 | NULL | 100.00 | NULL | |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = 2;
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = 2;
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)
            {noformat}

            As you can see *the second and third SELECT queries employ wrong index and thus both result in (unnecessary?) full scan*, even though it is clear that the condition is based solely on the {{id}} column, which acts as a primary key.

            SELECT queries with mixed types (integer and string) inside IN condition are the ones affected by this suboptimal index usage.

            SELECTs with values of a single common type use correct indexes - even if they're not matching the {{id}} column's type (integer), e.g. when values inside IN are numeric strings. These queries are handled well.

            SELECTs with values of mixed types used with the OR operator instead _(the last four queries)_also use correct the index. These queries are handled well, too.

            *The performance difference there is not trivial* - and it becomes worse with bigger tables.

            One might argue that one should just always use the correct types of values with these conditions, but there are environments in the wild that don't really care about differences between integers and numeric strings _- one prime example being PHP_. These can (and will) pass these values in unexpected and inconsistent fashion. In fact this is how I found out about this.

            _(I was unable test this with different versions of MariaDB than those mentioned above. Also I apologize in advance if I made any wrong assumptions about the MariaDB's internals.)_
            Hello good people,

            I accidentaly stumbled upon a weird (?) behavior of SELECT query optimizer, which *doesn't seem to handle values of mixed types very well when using operator IN* in WHERE condition. This ultimately results in suboptimal usage of indexes, even though sufficient information is IMHO available for handling the query better (i.e. using correct index).

            I was able to come up with MVCE of this sort:

            {code:sql}
            DROP TABLE IF EXISTS __t1;
            CREATE TABLE __t1 (
              id int(10) unsigned NOT NULL AUTO_INCREMENT,
              name varchar(100) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
              PRIMARY KEY (`id`),
              UNIQUE KEY `name` (`name`)
            ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
            INSERT INTO __t1 SELECT seq, MD5(seq) FROM seq_1_to_500000;

            ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, 2);
            ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', 2); -- Uses wrong index
            ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, '2'); -- Uses wrong index
            ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', '2');
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1;
            ANALYZE SELECT id, name FROM __t1 WHERE id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = 2;
            ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = 2;
            {code}

            Resulting in:

            {noformat}
            > ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, 2);
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', 2); -- Wrong index
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | 1 | SIMPLE | __t1 | index | PRIMARY | name | 302 | NULL | 996723 | 1000000.00 | 100.00 | 0.00 | Using where; Using index |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            1 row in set (0.21 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, '2'); -- Wrong index
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | 1 | SIMPLE | __t1 | index | PRIMARY | name | 302 | NULL | 996723 | 1000000.00 | 100.00 | 0.00 | Using where; Using index |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            1 row in set (0.22 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', '2');
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1;
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | 1 | SIMPLE | __t1 | const | PRIMARY | PRIMARY | 4 | const | 1 | NULL | 100.00 | NULL | |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | 1 | SIMPLE | __t1 | const | PRIMARY | PRIMARY | 4 | const | 1 | NULL | 100.00 | NULL | |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = 2;
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = 2;
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)
            {noformat}

            As you can see *the second and third SELECT queries employ wrong index and thus both result in (unnecessary?) full scan*, even though it is clear that the condition is based solely on the {{id}} column, which acts as a primary key.

            SELECT queries with mixed types (integer and string) inside IN condition are the ones affected by this suboptimal index usage.

            SELECTs with values of a single common type use correct indexes - even if they're not matching the {{id}} column's type (integer), e.g. when values inside IN are numeric strings. These queries are handled well.

            SELECTs with values of mixed types used with the OR operator instead _(the last four queries)_also use correct the index. These queries are handled well, too.

            *The performance difference there is not trivial* - and it becomes worse with bigger tables.

            One might argue that one should just always use the correct types of values with these conditions, but there are environments in the wild that don't really care about differences between integers and numeric strings _- one prime example being PHP_. These can (and will) pass these values in unexpected and inconsistent fashion. In fact this is how I found out about this.

            _(I was unable test this with different versions of MariaDB than those mentioned above. Also I apologize in advance if I made any wrong assumptions about the MariaDB's internals.)_
            smuuf Přemysl Karbula made changes -
            Description Hello good people,

            I accidentaly stumbled upon a weird (?) behavior of SELECT query optimizer, which *doesn't seem to handle values of mixed types very well when using operator IN* in WHERE condition. This ultimately results in suboptimal usage of indexes, even though sufficient information is IMHO available for handling the query better (i.e. using correct index).

            I was able to come up with MVCE of this sort:

            {code:sql}
            DROP TABLE IF EXISTS __t1;
            CREATE TABLE __t1 (
              id int(10) unsigned NOT NULL AUTO_INCREMENT,
              name varchar(100) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
              PRIMARY KEY (`id`),
              UNIQUE KEY `name` (`name`)
            ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
            INSERT INTO __t1 SELECT seq, MD5(seq) FROM seq_1_to_500000;

            ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, 2);
            ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', 2); -- Uses wrong index
            ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, '2'); -- Uses wrong index
            ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', '2');
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1;
            ANALYZE SELECT id, name FROM __t1 WHERE id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = 2;
            ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = 2;
            {code}

            Resulting in:

            {noformat}
            > ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, 2);
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', 2); -- Wrong index
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | 1 | SIMPLE | __t1 | index | PRIMARY | name | 302 | NULL | 996723 | 1000000.00 | 100.00 | 0.00 | Using where; Using index |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            1 row in set (0.21 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, '2'); -- Wrong index
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | 1 | SIMPLE | __t1 | index | PRIMARY | name | 302 | NULL | 996723 | 1000000.00 | 100.00 | 0.00 | Using where; Using index |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            1 row in set (0.22 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', '2');
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1;
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | 1 | SIMPLE | __t1 | const | PRIMARY | PRIMARY | 4 | const | 1 | NULL | 100.00 | NULL | |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | 1 | SIMPLE | __t1 | const | PRIMARY | PRIMARY | 4 | const | 1 | NULL | 100.00 | NULL | |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = 2;
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = 2;
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)
            {noformat}

            As you can see *the second and third SELECT queries employ wrong index and thus both result in (unnecessary?) full scan*, even though it is clear that the condition is based solely on the {{id}} column, which acts as a primary key.

            SELECT queries with mixed types (integer and string) inside IN condition are the ones affected by this suboptimal index usage.

            SELECTs with values of a single common type use correct indexes - even if they're not matching the {{id}} column's type (integer), e.g. when values inside IN are numeric strings. These queries are handled well.

            SELECTs with values of mixed types used with the OR operator instead _(the last four queries)_also use correct the index. These queries are handled well, too.

            *The performance difference there is not trivial* - and it becomes worse with bigger tables.

            One might argue that one should just always use the correct types of values with these conditions, but there are environments in the wild that don't really care about differences between integers and numeric strings _- one prime example being PHP_. These can (and will) pass these values in unexpected and inconsistent fashion. In fact this is how I found out about this.

            _(I was unable test this with different versions of MariaDB than those mentioned above. Also I apologize in advance if I made any wrong assumptions about the MariaDB's internals.)_
            Hello good people,

            I accidentaly stumbled upon a weird (?) behavior of SELECT query optimizer, which *doesn't seem to handle values of mixed types very well when using operator IN* in WHERE condition. This ultimately results in suboptimal usage of indexes, even though sufficient information is IMHO available for handling the query better (i.e. using correct index).

            I was able to come up with MVCE of this sort:

            {code:sql}
            DROP TABLE IF EXISTS __t1;
            CREATE TABLE __t1 (
              id int(10) unsigned NOT NULL AUTO_INCREMENT,
              name varchar(100) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
              PRIMARY KEY (`id`),
              UNIQUE KEY `name` (`name`)
            ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
            INSERT INTO __t1 SELECT seq, MD5(seq) FROM seq_1_to_500000;

            ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, 2);
            ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', 2); -- Uses wrong index
            ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, '2'); -- Uses wrong index
            ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', '2');
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1;
            ANALYZE SELECT id, name FROM __t1 WHERE id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = 2;
            ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = 2;
            {code}

            Resulting in:

            {noformat}
            > ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, 2);
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', 2); -- Wrong index
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | 1 | SIMPLE | __t1 | index | PRIMARY | name | 302 | NULL | 996723 | 1000000.00 | 100.00 | 0.00 | Using where; Using index |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            1 row in set (0.21 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, '2'); -- Wrong index
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | 1 | SIMPLE | __t1 | index | PRIMARY | name | 302 | NULL | 996723 | 1000000.00 | 100.00 | 0.00 | Using where; Using index |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            1 row in set (0.22 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', '2');
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1;
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | 1 | SIMPLE | __t1 | const | PRIMARY | PRIMARY | 4 | const | 1 | NULL | 100.00 | NULL | |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | 1 | SIMPLE | __t1 | const | PRIMARY | PRIMARY | 4 | const | 1 | NULL | 100.00 | NULL | |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = 2;
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = 2;
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)
            {noformat}

            As you can see *the second and third SELECT queries employ wrong index and thus both result in (unnecessary?) full scan*, even though it is clear that the condition is based solely on the {{id}} column, which acts as a primary key.

            SELECT queries with mixed types (integer and string) inside IN condition are the ones affected by this suboptimal index usage.

            SELECTs with values of a single common type use correct indexes - even if they're not matching the {{id}} column's type (integer), e.g. when values inside IN are numeric strings. These queries are handled well.

            SELECTs with values of mixed types used with the OR operator instead _(the last four queries)_ also use correct the index. These queries are handled well, too.

            *The performance difference there is not trivial* - and it becomes worse with bigger tables.

            One might argue that one should just always use the correct types of values with these conditions, but there are environments in the wild that don't really care about differences between integers and numeric strings _- one prime example being PHP_. These can (and will) pass these values in unexpected and inconsistent fashion. In fact this is how I found out about this.

            _(I was unable test this with different versions of MariaDB than those mentioned above. Also I apologize in advance if I made any wrong assumptions about the MariaDB's internals.)_
            smuuf Přemysl Karbula made changes -
            Description Hello good people,

            I accidentaly stumbled upon a weird (?) behavior of SELECT query optimizer, which *doesn't seem to handle values of mixed types very well when using operator IN* in WHERE condition. This ultimately results in suboptimal usage of indexes, even though sufficient information is IMHO available for handling the query better (i.e. using correct index).

            I was able to come up with MVCE of this sort:

            {code:sql}
            DROP TABLE IF EXISTS __t1;
            CREATE TABLE __t1 (
              id int(10) unsigned NOT NULL AUTO_INCREMENT,
              name varchar(100) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
              PRIMARY KEY (`id`),
              UNIQUE KEY `name` (`name`)
            ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
            INSERT INTO __t1 SELECT seq, MD5(seq) FROM seq_1_to_500000;

            ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, 2);
            ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', 2); -- Uses wrong index
            ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, '2'); -- Uses wrong index
            ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', '2');
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1;
            ANALYZE SELECT id, name FROM __t1 WHERE id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = 2;
            ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = 2;
            {code}

            Resulting in:

            {noformat}
            > ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, 2);
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', 2); -- Wrong index
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | 1 | SIMPLE | __t1 | index | PRIMARY | name | 302 | NULL | 996723 | 1000000.00 | 100.00 | 0.00 | Using where; Using index |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            1 row in set (0.21 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, '2'); -- Wrong index
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | 1 | SIMPLE | __t1 | index | PRIMARY | name | 302 | NULL | 996723 | 1000000.00 | 100.00 | 0.00 | Using where; Using index |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            1 row in set (0.22 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', '2');
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1;
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | 1 | SIMPLE | __t1 | const | PRIMARY | PRIMARY | 4 | const | 1 | NULL | 100.00 | NULL | |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | 1 | SIMPLE | __t1 | const | PRIMARY | PRIMARY | 4 | const | 1 | NULL | 100.00 | NULL | |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = 2;
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = 2;
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)
            {noformat}

            As you can see *the second and third SELECT queries employ wrong index and thus both result in (unnecessary?) full scan*, even though it is clear that the condition is based solely on the {{id}} column, which acts as a primary key.

            SELECT queries with mixed types (integer and string) inside IN condition are the ones affected by this suboptimal index usage.

            SELECTs with values of a single common type use correct indexes - even if they're not matching the {{id}} column's type (integer), e.g. when values inside IN are numeric strings. These queries are handled well.

            SELECTs with values of mixed types used with the OR operator instead _(the last four queries)_ also use correct the index. These queries are handled well, too.

            *The performance difference there is not trivial* - and it becomes worse with bigger tables.

            One might argue that one should just always use the correct types of values with these conditions, but there are environments in the wild that don't really care about differences between integers and numeric strings _- one prime example being PHP_. These can (and will) pass these values in unexpected and inconsistent fashion. In fact this is how I found out about this.

            _(I was unable test this with different versions of MariaDB than those mentioned above. Also I apologize in advance if I made any wrong assumptions about the MariaDB's internals.)_
            Hello good people,

            I accidentaly stumbled upon a weird (?) behavior of SELECT query optimizer, which *doesn't seem to handle values of mixed types very well when using operator IN* in WHERE condition. This ultimately results in suboptimal usage of indexes, even though sufficient information is IMHO available for handling the query better (i.e. using correct index).

            I was able to come up with MVCE of this sort:

            {code:sql}
            CREATE TEMPORARY TABLE __t1 (
              id int(10) unsigned NOT NULL AUTO_INCREMENT,
              name varchar(100) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
              PRIMARY KEY (`id`),
              UNIQUE KEY `name` (`name`)
            ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
            INSERT INTO __t1 SELECT seq, MD5(seq) FROM seq_1_to_500000;

            ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, 2);
            ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', 2); -- Uses wrong index
            ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, '2'); -- Uses wrong index
            ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', '2');
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1;
            ANALYZE SELECT id, name FROM __t1 WHERE id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = 2;
            ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = '2';
            ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = 2;
            {code}

            Resulting in:

            {noformat}
            > ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, 2);
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', 2); -- Wrong index
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | 1 | SIMPLE | __t1 | index | PRIMARY | name | 302 | NULL | 996723 | 1000000.00 | 100.00 | 0.00 | Using where; Using index |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            1 row in set (0.21 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN (1, '2'); -- Wrong index
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            | 1 | SIMPLE | __t1 | index | PRIMARY | name | 302 | NULL | 996723 | 1000000.00 | 100.00 | 0.00 | Using where; Using index |
            +------+-------------+-------+-------+---------------+------+---------+------+--------+------------+----------+------------+--------------------------+
            1 row in set (0.22 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id IN ('1', '2');
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1;
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | 1 | SIMPLE | __t1 | const | PRIMARY | PRIMARY | 4 | const | 1 | NULL | 100.00 | NULL | |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            | 1 | SIMPLE | __t1 | const | PRIMARY | PRIMARY | 4 | const | 1 | NULL | 100.00 | NULL | |
            +------+-------------+-------+-------+---------------+---------+---------+-------+------+--------+----------+------------+-------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = 2;
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = 1 OR id = '2';
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)

            > ANALYZE SELECT id, name FROM __t1 WHERE id = '1' OR id = 2;
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            | 1 | SIMPLE | __t1 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 2.00 | 100.00 | 100.00 | Using where |
            +------+-------------+-------+-------+---------------+---------+---------+------+------+--------+----------+------------+-------------+
            1 row in set (0.00 sec)
            {noformat}

            As you can see *the second and third SELECT queries employ wrong index and thus both result in (unnecessary?) full scan*, even though it is clear that the condition is based solely on the {{id}} column, which acts as a primary key.

            SELECT queries with mixed types (integer and string) inside IN condition are the ones affected by this suboptimal index usage.

            SELECTs with values of a single common type use correct indexes - even if they're not matching the {{id}} column's type (integer), e.g. when values inside IN are numeric strings. These queries are handled well.

            SELECTs with values of mixed types used with the OR operator instead _(the last four queries)_ also use correct the index. These queries are handled well, too.

            *The performance difference there is not trivial* - and it becomes worse with bigger tables.

            One might argue that one should just always use the correct types of values with these conditions, but there are environments in the wild that don't really care about differences between integers and numeric strings _- one prime example being PHP_. These can (and will) pass these values in unexpected and inconsistent fashion. In fact this is how I found out about this.

            _(I was unable test this with different versions of MariaDB than those mentioned above. Also I apologize in advance if I made any wrong assumptions about the MariaDB's internals.)_
            elenst Elena Stepanova made changes -
            Fix Version/s 10.1 [ 16100 ]
            Assignee Oleksandr Byelkin [ sanja ]
            serg Sergei Golubchik made changes -
            Assignee Oleksandr Byelkin [ sanja ] Alexander Barkov [ bar ]
            serg Sergei Golubchik made changes -
            Priority Minor [ 4 ] Major [ 3 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            bar Alexander Barkov made changes -
            Affects Version/s 5.5 [ 15800 ]
            Affects Version/s 10.0 [ 16000 ]
            bar Alexander Barkov made changes -
            Affects Version/s 10.2 [ 14601 ]
            Affects Version/s 10.3 [ 22126 ]
            Affects Version/s 10.4 [ 22408 ]
            bar Alexander Barkov made changes -
            bar Alexander Barkov made changes -
            bar Alexander Barkov made changes -
            bar Alexander Barkov made changes -
            bar Alexander Barkov made changes -
            bar Alexander Barkov made changes -
            bar Alexander Barkov made changes -
            bar Alexander Barkov made changes -
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 10.1 [ 16100 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 93208 ] MariaDB v4 [ 141196 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Fix Version/s 10.2 [ 14601 ]
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 10.3 [ 22126 ]
            bar Alexander Barkov made changes -
            bar Alexander Barkov made changes -
            issue.field.resolutiondate 2024-04-09 12:43:00.0 2024-04-09 12:42:59.633
            bar Alexander Barkov made changes -
            Fix Version/s N/A [ 14700 ]
            Fix Version/s 10.4 [ 22408 ]
            Resolution Duplicate [ 3 ]
            Status Open [ 1 ] Closed [ 6 ]

            People

              bar Alexander Barkov
              smuuf Přemysl Karbula
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.