Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-15310

Range optimizer does not work well for "WHERE temporal_column NOT IN (const_list)"

    XMLWordPrintable

Details

    Description

      I create a table with a TIME column and populate it with some data, and run an EXPLAIN for a SELECT with a ``WHERE a NOT IN()'' condition:

      CREATE OR REPLACE TABLE t1 (a TIME, filler CHAR(200), KEY(a));
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:01', 'no');
      INSERT INTO t1 VALUES ('23:00:02', 'no');
      INSERT INTO t1 VALUES ('23:00:03', 'yes');
      INSERT INTO t1 VALUES ('23:00:04', 'yes');
      EXPLAIN SELECT * FROM t1 WHERE a NOT IN ('23:00:01','23:00:02');
      DROP TABLE t1;
      

      +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                 |
      +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
      |    1 | SIMPLE      | t1    | range | a             | a    | 4       | NULL |    4 | Using index condition |
      +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
      

      Notice, the condition covers only two records. The optimizer correctly choses to use the range search.

      Now I do the same for negative TIME values:

      CREATE OR REPLACE TABLE t1 (a TIME, filler CHAR(200), KEY(a));
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:01', 'no');
      INSERT INTO t1 VALUES ('-23:00:02', 'no');
      INSERT INTO t1 VALUES ('-23:00:03', 'yes');
      INSERT INTO t1 VALUES ('-23:00:04', 'yes');
      EXPLAIN SELECT * FROM t1 WHERE a NOT IN ('-23:00:01','-23:00:02');
      DROP TABLE t1;
      

      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      |    1 | SIMPLE      | t1    | ALL  | a             | NULL | NULL    | NULL |   27 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      

      As in the first example, the condition matches only two records again, but the range search does not work any more and it goes through the full table scan.

      Now I do the same for positive time values, but outside of the time-of-the-day range, greater than 23:59:59.

      CREATE OR REPLACE TABLE t1 (a TIME, filler CHAR(200), KEY(a));
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:01', 'no');
      INSERT INTO t1 VALUES ('24:00:02', 'no');
      INSERT INTO t1 VALUES ('24:00:03', 'yes');
      INSERT INTO t1 VALUES ('24:00:04', 'yes');
      EXPLAIN SELECT * FROM t1 WHERE a NOT IN ('24:00:01','24:00:02');
      DROP TABLE t1;
      

      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      |    1 | SIMPLE      | t1    | ALL  | a             | NULL | NULL    | NULL |   27 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      

      It erroneously uses the full table scan again.

      Now I do the same for a TIME(1) column with positive fractional data, within the time-of-the-day range:

      CREATE OR REPLACE TABLE t1 (a TIME(1), filler CHAR(200), KEY(a));
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('23:00:02.1', 'no');
      INSERT INTO t1 VALUES ('23:00:03.1', 'yes');
      INSERT INTO t1 VALUES ('23:00:04.1', 'yes');
      EXPLAIN SELECT * FROM t1 WHERE a NOT IN ('23:00:01.1','23:00:02.1');
      DROP TABLE t1;
      

      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      |    1 | SIMPLE      | t1    | ALL  | a             | NULL | NULL    | NULL |   27 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      

      Again, although the condition matches only two records, it erroneously goes through the full table scan.

      Now I do the same for a DATETIME column:

      CREATE OR REPLACE TABLE t1 (a DATETIME, filler CHAR(200), KEY(a));
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:02', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:03', 'yes');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:04', 'yes');
      EXPLAIN SELECT * FROM t1 WHERE a NOT IN ('2001-01-01 23:00:01','2001-01-01 23:00:02');
      DROP TABLE t1;
      

      +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                 |
      +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
      |    1 | SIMPLE      | t1    | range | a             | a    | 6       | NULL |    4 | Using index condition |
      +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
      

      Looks good. It uses the range search.

      Now I do the same for a DATETIME(1) column, with fractional data:

      CREATE OR REPLACE TABLE t1 (a DATETIME(1), filler CHAR(200), KEY(a));
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:01.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:02.1', 'no');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:03.1', 'yes');
      INSERT INTO t1 VALUES ('2001-01-01 23:00:04.1', 'yes');
      EXPLAIN SELECT * FROM t1 WHERE a NOT IN ('2001-01-01 23:00:01.1','2001-01-01 23:00:02.1');
      DROP TABLE t1;
      

      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      |    1 | SIMPLE      | t1    | ALL  | a             | NULL | NULL    | NULL |   27 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      

      It erroneously goes through the full table scan.

      Attachments

        Activity

          People

            bar Alexander Barkov
            bar Alexander Barkov
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.