  1. MariaDB Server
  2. MDEV-8661

Wrong result for SELECT..WHERE a='a' AND a='a' COLLATE latin1_bin

Details

    Description

      This bug is similar to http://bugs.mysql.com/bug.php?id=5134
      Note that the patch for MySQL bug#5134 fixed only the particular case where the BINARY keyword is used. The problem is in fact more general.

      This script:

      SET NAMES latin1;
      DROP TABLE IF EXISTS t1;
      CREATE TABLE t1 (a CHAR(10));
      INSERT INTO t1 VALUES ('a'),('A');
      SELECT * FROM t1 WHERE a = 'a' COLLATE latin1_bin;

      correctly returns one row:

      +------+
      | a    |
      +------+
      | a    |
      +------+

      Now if I add an extra predicate to the condition:

      SELECT * FROM t1 WHERE a='a' AND a='a' COLLATE latin1_bin;

      it returns two rows:

      +------+
      | a    |
      +------+
      | a    |
      | A    |
      +------+

      The expected result is to return one row in both cases.

      The problem happens because "AND a='a' COLLATE latin1_bin" is erroneously replaced with "AND 'a'='a' COLLATE latin1_bin", which then evaluates to TRUE and is removed from the WHERE condition. So the query is effectively rewritten to:

      SELECT * FROM t1 WHERE a='a';

      The method that actually replaces the field with the constant is Item_field::equal_fields_propagator() in item.cc.

      This condition is not strict enough:

        if (!item || !has_compatible_context(item))
          item= this;

      It should also take into account the collations of the two operations that the field "a" appears in.

          Activity

            bar Alexander Barkov created issue -
            bar Alexander Barkov made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            bar Alexander Barkov made changes -
            Component/s Optimizer [ 10200 ]
            Fix Version/s 10.1.5 [ 18813 ]
            Fix Version/s 10.1 [ 16100 ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Closed [ 6 ]
            bar Alexander Barkov made changes -
            Labels propagation
            bar Alexander Barkov made changes -
            Labels propagation propagation upstream
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 71171 ] MariaDB v4 [ 149501 ]

            People

              bar Alexander Barkov
              bar Alexander Barkov