Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-8229

GROUP_MIN_MAX is erroneously applied for BETWEEN in some cases

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 5.5(EOL), 10.0(EOL), 10.1(EOL)
    • 10.1.6
    • Optimizer
    • None
    • 10.1.6-1

    Description

      This problem is similar to MDEV-6991, but now for BETWEEN.

      This script:

      DROP TABLE IF EXISTS t1;
      CREATE TABLE t1 (id INT NOT NULL, a VARCHAR(20)) ENGINE=MyISAM;
      INSERT INTO t1 VALUES (1,'2001-01-01');
      INSERT INTO t1 VALUES (1,'2001-01-02');
      INSERT INTO t1 VALUES (1,'2001-01-03');
      INSERT INTO t1 VALUES (1,' 2001-01-04');
      INSERT INTO t1 VALUES (2,'2001-01-01');
      INSERT INTO t1 VALUES (2,'2001-01-02');
      INSERT INTO t1 VALUES (2,'2001-01-03');
      INSERT INTO t1 VALUES (2,' 2001-01-04');
      INSERT INTO t1 VALUES (3,'2001-01-01');
      INSERT INTO t1 VALUES (3,'2001-01-02');
      INSERT INTO t1 VALUES (3,'2001-01-03');
      INSERT INTO t1 VALUES (3,' 2001-01-04');
      INSERT INTO t1 VALUES (4,'2001-01-01');
      INSERT INTO t1 VALUES (4,'2001-01-02');
      INSERT INTO t1 VALUES (4,'2001-01-03');
      INSERT INTO t1 VALUES (4,' 2001-01-04');
      SELECT id,MIN(a),MAX(a) FROM t1 WHERE a BETWEEN DATE'2001-01-04' AND DATE'2001-01-05' GROUP BY id;

      correctly returns:

      +----+-------------+-------------+
      | id | MIN(a)      | MAX(a)      |
      +----+-------------+-------------+
      |  1 |  2001-01-04 |  2001-01-04 |
      |  2 |  2001-01-04 |  2001-01-04 |
      |  3 |  2001-01-04 |  2001-01-04 |
      |  4 |  2001-01-04 |  2001-01-04 |
      +----+-------------+-------------+

      Notice, there are leading spaces in the records with '2001-01-04', .i.e. '<space>2001-01-04', and no leading spaces in the other records.

      Now if I add an index and run the query again:

      ALTER TABLE t1 ADD KEY(id,a);
      SELECT id,MIN(a),MAX(a) FROM t1 WHERE a BETWEEN DATE'2001-01-04' AND DATE'2001-01-05' GROUP BY id;

      it returns empty set.

      This EXPLAIN:

      EXPLAIN SELECT id,MIN(a),MAX(a) FROM t1 WHERE a BETWEEN DATE'2001-01-04' AND DATE'2001-01-05' GROUP BY id;

      tells that group-min-max optimization is used for the query

      +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                                 |
      +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
      |    1 | SIMPLE      | t1    | range | NULL          | id   | 27      | NULL |    9 | Using where; Using index for group-by |
      +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+

      This is wrong, because DATE comparison cannot correctly use a VARCHAR index.
      GROUP_MIN_MAX optimization should not be used for this query.

      Attachments

        Activity

          bar Alexander Barkov created issue -
          bar Alexander Barkov made changes -
          Field Original Value New Value
          Description This problems is similar to MDEV-6991, but now for BETWEEN.

          This script:
          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (id INT NOT NULL, a VARCHAR(20)) ENGINE=MyISAM;
          INSERT INTO t1 VALUES (1,'2001-01-01');
          INSERT INTO t1 VALUES (1,'2001-01-02');
          INSERT INTO t1 VALUES (1,'2001-01-03');
          INSERT INTO t1 VALUES (1,' 2001-01-04');
          INSERT INTO t1 VALUES (2,'2001-01-01');
          INSERT INTO t1 VALUES (2,'2001-01-02');
          INSERT INTO t1 VALUES (2,'2001-01-03');
          INSERT INTO t1 VALUES (2,' 2001-01-04');
          INSERT INTO t1 VALUES (3,'2001-01-01');
          INSERT INTO t1 VALUES (3,'2001-01-02');
          INSERT INTO t1 VALUES (3,'2001-01-03');
          INSERT INTO t1 VALUES (3,' 2001-01-04');
          INSERT INTO t1 VALUES (4,'2001-01-01');
          INSERT INTO t1 VALUES (4,'2001-01-02');
          INSERT INTO t1 VALUES (4,'2001-01-03');
          INSERT INTO t1 VALUES (4,' 2001-01-04');
          SELECT id,MIN(a),MAX(a) FROM t1 WHERE a BETWEEN DATE'2001-01-04' AND DATE'2001-01-05' GROUP BY id;
          {code}
          correctly returns:
          {noformat}
          +----+-------------+-------------+
          | id | MIN(a) | MAX(a) |
          +----+-------------+-------------+
          | 1 | 2001-01-04 | 2001-01-04 |
          | 2 | 2001-01-04 | 2001-01-04 |
          | 3 | 2001-01-04 | 2001-01-04 |
          | 4 | 2001-01-04 | 2001-01-04 |
          +----+-------------+-------------+
          {noformat}
          Notice, there are leading spaces in the records with '2001-01-04', .i.e. '<space>2001-01-04', and no leading spaces in the other records.

          Now if I add an index and run the query again:

          {code}
          ALTER TABLE t1 ADD KEY(id,a);
          SELECT id,MIN(a),MAX(a) FROM t1 WHERE a BETWEEN DATE'2001-01-04' AND DATE'2001-01-05' GROUP BY id;
          {code}
          it returns empty set.

          This EXPLAIN:
          {code}
          EXPLAIN SELECT id,MIN(a),MAX(a) FROM t1 WHERE a BETWEEN DATE'2001-01-04' AND DATE'2001-01-05' GROUP BY id;
          {code}
           tells that group-min-max optimization used for the query
          {noformat}
          +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
          | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
          +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
          | 1 | SIMPLE | t1 | range | NULL | id | 27 | NULL | 9 | Using where; Using index for group-by |
          +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
          {noformat}
          This is wrong, because DATE comparison cannot correctly use a VARCHAR index.
          GROUP_MIN_MAX optimization should not be used for this query.
          This problems is similar to MDEV-6991, but now for BETWEEN.

          This script:
          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (id INT NOT NULL, a VARCHAR(20)) ENGINE=MyISAM;
          INSERT INTO t1 VALUES (1,'2001-01-01');
          INSERT INTO t1 VALUES (1,'2001-01-02');
          INSERT INTO t1 VALUES (1,'2001-01-03');
          INSERT INTO t1 VALUES (1,' 2001-01-04');
          INSERT INTO t1 VALUES (2,'2001-01-01');
          INSERT INTO t1 VALUES (2,'2001-01-02');
          INSERT INTO t1 VALUES (2,'2001-01-03');
          INSERT INTO t1 VALUES (2,' 2001-01-04');
          INSERT INTO t1 VALUES (3,'2001-01-01');
          INSERT INTO t1 VALUES (3,'2001-01-02');
          INSERT INTO t1 VALUES (3,'2001-01-03');
          INSERT INTO t1 VALUES (3,' 2001-01-04');
          INSERT INTO t1 VALUES (4,'2001-01-01');
          INSERT INTO t1 VALUES (4,'2001-01-02');
          INSERT INTO t1 VALUES (4,'2001-01-03');
          INSERT INTO t1 VALUES (4,' 2001-01-04');
          SELECT id,MIN(a),MAX(a) FROM t1 WHERE a BETWEEN DATE'2001-01-04' AND DATE'2001-01-05' GROUP BY id;
          {code}
          correctly returns:
          {noformat}
          +----+-------------+-------------+
          | id | MIN(a) | MAX(a) |
          +----+-------------+-------------+
          | 1 | 2001-01-04 | 2001-01-04 |
          | 2 | 2001-01-04 | 2001-01-04 |
          | 3 | 2001-01-04 | 2001-01-04 |
          | 4 | 2001-01-04 | 2001-01-04 |
          +----+-------------+-------------+
          {noformat}
          Notice, there are leading spaces in the records with '2001-01-04', .i.e. '<space>2001-01-04', and no leading spaces in the other records.

          Now if I add an index and run the query again:

          {code}
          ALTER TABLE t1 ADD KEY(id,a);
          SELECT id,MIN(a),MAX(a) FROM t1 WHERE a BETWEEN DATE'2001-01-04' AND DATE'2001-01-05' GROUP BY id;
          {code}
          it returns empty set.

          This EXPLAIN:
          {code}
          EXPLAIN SELECT id,MIN(a),MAX(a) FROM t1 WHERE a BETWEEN DATE'2001-01-04' AND DATE'2001-01-05' GROUP BY id;
          {code}
           tells that group-min-max optimization is used for the query
          {noformat}
          +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
          | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
          +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
          | 1 | SIMPLE | t1 | range | NULL | id | 27 | NULL | 9 | Using where; Using index for group-by |
          +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
          {noformat}
          This is wrong, because DATE comparison cannot correctly use a VARCHAR index.
          GROUP_MIN_MAX optimization should not be used for this query.
          bar Alexander Barkov made changes -
          Description This problems is similar to MDEV-6991, but now for BETWEEN.

          This script:
          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (id INT NOT NULL, a VARCHAR(20)) ENGINE=MyISAM;
          INSERT INTO t1 VALUES (1,'2001-01-01');
          INSERT INTO t1 VALUES (1,'2001-01-02');
          INSERT INTO t1 VALUES (1,'2001-01-03');
          INSERT INTO t1 VALUES (1,' 2001-01-04');
          INSERT INTO t1 VALUES (2,'2001-01-01');
          INSERT INTO t1 VALUES (2,'2001-01-02');
          INSERT INTO t1 VALUES (2,'2001-01-03');
          INSERT INTO t1 VALUES (2,' 2001-01-04');
          INSERT INTO t1 VALUES (3,'2001-01-01');
          INSERT INTO t1 VALUES (3,'2001-01-02');
          INSERT INTO t1 VALUES (3,'2001-01-03');
          INSERT INTO t1 VALUES (3,' 2001-01-04');
          INSERT INTO t1 VALUES (4,'2001-01-01');
          INSERT INTO t1 VALUES (4,'2001-01-02');
          INSERT INTO t1 VALUES (4,'2001-01-03');
          INSERT INTO t1 VALUES (4,' 2001-01-04');
          SELECT id,MIN(a),MAX(a) FROM t1 WHERE a BETWEEN DATE'2001-01-04' AND DATE'2001-01-05' GROUP BY id;
          {code}
          correctly returns:
          {noformat}
          +----+-------------+-------------+
          | id | MIN(a) | MAX(a) |
          +----+-------------+-------------+
          | 1 | 2001-01-04 | 2001-01-04 |
          | 2 | 2001-01-04 | 2001-01-04 |
          | 3 | 2001-01-04 | 2001-01-04 |
          | 4 | 2001-01-04 | 2001-01-04 |
          +----+-------------+-------------+
          {noformat}
          Notice, there are leading spaces in the records with '2001-01-04', .i.e. '<space>2001-01-04', and no leading spaces in the other records.

          Now if I add an index and run the query again:

          {code}
          ALTER TABLE t1 ADD KEY(id,a);
          SELECT id,MIN(a),MAX(a) FROM t1 WHERE a BETWEEN DATE'2001-01-04' AND DATE'2001-01-05' GROUP BY id;
          {code}
          it returns empty set.

          This EXPLAIN:
          {code}
          EXPLAIN SELECT id,MIN(a),MAX(a) FROM t1 WHERE a BETWEEN DATE'2001-01-04' AND DATE'2001-01-05' GROUP BY id;
          {code}
           tells that group-min-max optimization is used for the query
          {noformat}
          +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
          | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
          +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
          | 1 | SIMPLE | t1 | range | NULL | id | 27 | NULL | 9 | Using where; Using index for group-by |
          +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
          {noformat}
          This is wrong, because DATE comparison cannot correctly use a VARCHAR index.
          GROUP_MIN_MAX optimization should not be used for this query.
          This problem is similar to MDEV-6991, but now for BETWEEN.

          This script:
          {code}
          DROP TABLE IF EXISTS t1;
          CREATE TABLE t1 (id INT NOT NULL, a VARCHAR(20)) ENGINE=MyISAM;
          INSERT INTO t1 VALUES (1,'2001-01-01');
          INSERT INTO t1 VALUES (1,'2001-01-02');
          INSERT INTO t1 VALUES (1,'2001-01-03');
          INSERT INTO t1 VALUES (1,' 2001-01-04');
          INSERT INTO t1 VALUES (2,'2001-01-01');
          INSERT INTO t1 VALUES (2,'2001-01-02');
          INSERT INTO t1 VALUES (2,'2001-01-03');
          INSERT INTO t1 VALUES (2,' 2001-01-04');
          INSERT INTO t1 VALUES (3,'2001-01-01');
          INSERT INTO t1 VALUES (3,'2001-01-02');
          INSERT INTO t1 VALUES (3,'2001-01-03');
          INSERT INTO t1 VALUES (3,' 2001-01-04');
          INSERT INTO t1 VALUES (4,'2001-01-01');
          INSERT INTO t1 VALUES (4,'2001-01-02');
          INSERT INTO t1 VALUES (4,'2001-01-03');
          INSERT INTO t1 VALUES (4,' 2001-01-04');
          SELECT id,MIN(a),MAX(a) FROM t1 WHERE a BETWEEN DATE'2001-01-04' AND DATE'2001-01-05' GROUP BY id;
          {code}
          correctly returns:
          {noformat}
          +----+-------------+-------------+
          | id | MIN(a) | MAX(a) |
          +----+-------------+-------------+
          | 1 | 2001-01-04 | 2001-01-04 |
          | 2 | 2001-01-04 | 2001-01-04 |
          | 3 | 2001-01-04 | 2001-01-04 |
          | 4 | 2001-01-04 | 2001-01-04 |
          +----+-------------+-------------+
          {noformat}
          Notice, there are leading spaces in the records with '2001-01-04', .i.e. '<space>2001-01-04', and no leading spaces in the other records.

          Now if I add an index and run the query again:

          {code}
          ALTER TABLE t1 ADD KEY(id,a);
          SELECT id,MIN(a),MAX(a) FROM t1 WHERE a BETWEEN DATE'2001-01-04' AND DATE'2001-01-05' GROUP BY id;
          {code}
          it returns empty set.

          This EXPLAIN:
          {code}
          EXPLAIN SELECT id,MIN(a),MAX(a) FROM t1 WHERE a BETWEEN DATE'2001-01-04' AND DATE'2001-01-05' GROUP BY id;
          {code}
           tells that group-min-max optimization is used for the query
          {noformat}
          +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
          | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
          +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
          | 1 | SIMPLE | t1 | range | NULL | id | 27 | NULL | 9 | Using where; Using index for group-by |
          +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
          {noformat}
          This is wrong, because DATE comparison cannot correctly use a VARCHAR index.
          GROUP_MIN_MAX optimization should not be used for this query.
          serg Sergei Golubchik made changes -
          Issue Type Task [ 3 ] Bug [ 1 ]
          bar Alexander Barkov added a comment - - edited

          In 5.5 one should use DATE('XXXX-XX-XX') syntax instead of DATE'XXXX-XX-XX' to reproduce the problem.

          bar Alexander Barkov added a comment - - edited In 5.5 one should use DATE('XXXX-XX-XX') syntax instead of DATE'XXXX-XX-XX' to reproduce the problem.
          bar Alexander Barkov made changes -
          Affects Version/s 10.1 [ 16100 ]
          Affects Version/s 10.0 [ 16000 ]
          Affects Version/s 5.5 [ 15800 ]
          serg Sergei Golubchik made changes -
          Fix Version/s 5.5 [ 15800 ]
          Fix Version/s 10.0 [ 16000 ]
          serg Sergei Golubchik made changes -
          Sprint 10.1.6-1 [ 6 ]
          bar Alexander Barkov made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          bar Alexander Barkov made changes -
          Fix Version/s 10.1.6 [ 19401 ]
          Fix Version/s 10.1 [ 16100 ]
          Resolution Fixed [ 1 ]
          Status In Progress [ 3 ] Closed [ 6 ]
          serg Sergei Golubchik made changes -
          Workflow MariaDB v3 [ 67715 ] MariaDB v4 [ 149213 ]

          People

            bar Alexander Barkov
            bar Alexander Barkov
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.