Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-6978

Bad results with join comparing case insensitive VARCHAR/ENUM/SET expression to a _bin ENUM column

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.2.29, 10.1.44, 10.3.22, 10.4.12, 10.5.1
    • 10.0.15
    • Optimizer
    • None

    Description

      DROP TABLE IF EXISTS t1,t2;
      CREATE TABLE t1 (a ENUM('a') CHARACTER SET latin1 PRIMARY KEY);
      INSERT INTO t1 VALUES ('a');
      DROP TABLE IF EXISTS t2;
      CREATE TABLE t2 (a ENUM('a','A') CHARACTER SET latin1 COLLATE latin1_bin);
      INSERT INTO t2 VALUES ('a'),('A');
      SELECT t1.* FROM t1 JOIN t2 ON t1.a COLLATE latin1_swedish_ci=t2.a;
      SELECT t1.* FROM t1 LEFT JOIN t2 ON t1.a COLLATE latin1_swedish_ci=t2.a;

      This script correctly returns 2 rows for the both SELECT queries.

      +---+
      | a |
      +---+
      | a |
      | a |
      +---+
      2 rows in set (0.00 sec)

      If I now add a primary key on t2 and rerun the queries again:

      ALTER TABLE t2 ADD PRIMARY KEY(a);
      SELECT t1.* FROM t1 JOIN t2 ON t1.a COLLATE latin1_swedish_ci=t2.a;
      SELECT t1.* FROM t1 LEFT JOIN t2 ON t1.a COLLATE latin1_swedish_ci=t2.a;

      both queries return only one rows:

      +---+
      | a |
      +---+
      | a |
      +---+
      1 row in set (0.00 sec)

      This is wrong. Two rows should always be returned.

      The same problem is repeatable if I change data type for t1.a from ENUM to VARCHAR(10) or SET('a').

      Attachments

        Issue Links

          Activity

            bar Alexander Barkov created issue -
            bar Alexander Barkov made changes -
            Field Original Value New Value
            Summary Bad results with joins comparing case insensitive and _bin ENUM columns Bad results with join comparing case insensitive VARCHAR/ENUM/SET expression to a _bin ENUM column
            bar Alexander Barkov made changes -
            Description {code}
            DROP TABLE IF EXISTS t1,t2;
            CREATE TABLE t1 (a ENUM('a') CHARACTER SET latin1 PRIMARY KEY);
            INSERT INTO t1 VALUES ('a');
            DROP TABLE IF EXISTS t2;
            CREATE TABLE t2 (a ENUM('a','A') CHARACTER SET latin1 COLLATE latin1_bin);
            INSERT INTO t2 VALUES ('a'),('A');
            SELECT t1.* FROM t1 JOIN t2 ON t1.a COLLATE latin1_swedish_ci=t2.a;
            SELECT t1.* FROM t1 LEFT JOIN t2 ON t1.a COLLATE latin1_swedish_ci=t2.a;
            {code}
            This script correctly returns 2 rows for the both SELECT queries.
            {noformat}
            +---+
            | a |
            +---+
            | a |
            | a |
            +---+
            2 rows in set (0.00 sec)
            {noformat}

            If I now add a primary key on t2 and rerun the queries again:
            {code:sql}
            ALTER TABLE t2 ADD PRIMARY KEY(a);
            SELECT t1.* FROM t1 JOIN t2 ON t1.a COLLATE latin1_swedish_ci=t2.a;
            SELECT t1.* FROM t1 LEFT JOIN t2 ON t1.a COLLATE latin1_swedish_ci=t2.a;
            {code}

            both queries return only one rows:
            {noformat}
            +---+
            | a |
            +---+
            | a |
            +---+
            1 row in set (0.00 sec)
            {noformat}

            This is wrong. Two rows should always be returned.
            {code}
            DROP TABLE IF EXISTS t1,t2;
            CREATE TABLE t1 (a ENUM('a') CHARACTER SET latin1 PRIMARY KEY);
            INSERT INTO t1 VALUES ('a');
            DROP TABLE IF EXISTS t2;
            CREATE TABLE t2 (a ENUM('a','A') CHARACTER SET latin1 COLLATE latin1_bin);
            INSERT INTO t2 VALUES ('a'),('A');
            SELECT t1.* FROM t1 JOIN t2 ON t1.a COLLATE latin1_swedish_ci=t2.a;
            SELECT t1.* FROM t1 LEFT JOIN t2 ON t1.a COLLATE latin1_swedish_ci=t2.a;
            {code}
            This script correctly returns 2 rows for the both SELECT queries.
            {noformat}
            +---+
            | a |
            +---+
            | a |
            | a |
            +---+
            2 rows in set (0.00 sec)
            {noformat}

            If I now add a primary key on t2 and rerun the queries again:
            {code:sql}
            ALTER TABLE t2 ADD PRIMARY KEY(a);
            SELECT t1.* FROM t1 JOIN t2 ON t1.a COLLATE latin1_swedish_ci=t2.a;
            SELECT t1.* FROM t1 LEFT JOIN t2 ON t1.a COLLATE latin1_swedish_ci=t2.a;
            {code}

            both queries return only one rows:
            {noformat}
            +---+
            | a |
            +---+
            | a |
            +---+
            1 row in set (0.00 sec)
            {noformat}

            This is wrong. Two rows should always be returned.


            The same problem is repeatable if I change data type for t1.a from ENUM to VARCHAR(10) or SET('a').
            serg Sergei Golubchik made changes -
            Fix Version/s 10.0 [ 16000 ]
            bar Alexander Barkov made changes -
            Fix Version/s 10.0.15 [ 17300 ]
            Fix Version/s 10.0 [ 16000 ]
            Resolution Fixed [ 1 ]
            Status Open [ 1 ] Closed [ 6 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Workflow MariaDB v2 [ 57105 ] MariaDB v3 [ 65322 ]

            I've reopened this bug because I found a variant of the test case for MDEV-6978 that does not work properly.
            Let's change the definition of table t2 for

            CREATE TABLE t2 (c1 ENUM('a','A','b','B','c','C','d','D','e','E') CHARACTER SET latin1 COLLATE latin1_bin); 
            

            Now let's add one row

            INSERT INTO t2 VALUES ('a'),('A');
            

            After this let's execute the query from your test case

            SELECT t1.* FROM t1 LEFT JOIN t2 ON t1.c1 COLLATE latin1_swedish_ci=t2.c1;
            

            We get the result

            MariaDB [test]> SELECT t1.* FROM t1 LEFT JOIN t2 ON t1.c1 COLLATE latin1_swedish_ci=t2.c1;
            +----+
            | c1 |
            +----+
            | a  |
            | a  |
            +----+
            

            Now let's add more rows:

            INSERT INTO t2 VALUES ('b'),('B'),('c'),('C'),('d'),('D'),('e'),('E');
            

            The query returns the same result set as expected.
            Now let's add the primary key as in your test case

            ALTER TABLE t2 ADD PRIMARY KEY(c1);
            

            Now the query returns a wrong result set

            MariaDB [test]> SELECT t1.* FROM t1 LEFT JOIN t2 ON t1.c1 COLLATE latin n1_swedish_ci=t2.c1;
            +----+
            | c1 |
            +----+
            | a  |
            +----+
            

            Note that now the query uses a plan different from that used in your test case after addition of the primary key. This plan employs a range index scan rather then a full index scan as in your test case.

            MariaDB [test]> EXPLAIN SELECT t1.* FROM t1 LEFT JOIN t2 ON t1.c1 COLLATE latin1_swedish_ci=t2.c1;
            +------+-------------+-------+--------+---------------+---------+---------+------+------+--------------------------+
            | id   | select_type | table | type   | possible_keys | key     | key_len | ref  | rows | Extra                    |
            +------+-------------+-------+--------+---------------+---------+---------+------+------+--------------------------+
            |    1 | SIMPLE      | t1    | system | NULL          | NULL    | NULL    | NULL |    1 |                          |
            |    1 | SIMPLE      | t2    | range  | PRIMARY       | PRIMARY | 1       | NULL |    1 | Using where; Using index |
            +------+-------------+-------+--------+---------------+---------+---------+------+------+--------------------------+
            

            I reproduced the problem in the current 10.4 and 10.1. All other versions most probably are also affected.

            igor Igor Babaev (Inactive) added a comment - I've reopened this bug because I found a variant of the test case for MDEV-6978 that does not work properly. Let's change the definition of table t2 for CREATE TABLE t2 (c1 ENUM( 'a' , 'A' , 'b' , 'B' , 'c' , 'C' , 'd' , 'D' , 'e' , 'E' ) CHARACTER SET latin1 COLLATE latin1_bin); Now let's add one row INSERT INTO t2 VALUES ( 'a' ),( 'A' ); After this let's execute the query from your test case SELECT t1.* FROM t1 LEFT JOIN t2 ON t1.c1 COLLATE latin1_swedish_ci=t2.c1; We get the result MariaDB [test]> SELECT t1.* FROM t1 LEFT JOIN t2 ON t1.c1 COLLATE latin1_swedish_ci=t2.c1; +----+ | c1 | +----+ | a | | a | +----+ Now let's add more rows: INSERT INTO t2 VALUES ( 'b' ),( 'B' ),( 'c' ),( 'C' ),( 'd' ),( 'D' ),( 'e' ),( 'E' ); The query returns the same result set as expected. Now let's add the primary key as in your test case ALTER TABLE t2 ADD PRIMARY KEY (c1); Now the query returns a wrong result set MariaDB [test]> SELECT t1.* FROM t1 LEFT JOIN t2 ON t1.c1 COLLATE latin n1_swedish_ci=t2.c1; +----+ | c1 | +----+ | a | +----+ Note that now the query uses a plan different from that used in your test case after addition of the primary key. This plan employs a range index scan rather then a full index scan as in your test case. MariaDB [test]> EXPLAIN SELECT t1.* FROM t1 LEFT JOIN t2 ON t1.c1 COLLATE latin1_swedish_ci=t2.c1; +------+-------------+-------+--------+---------------+---------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+--------+---------------+---------+---------+------+------+--------------------------+ | 1 | SIMPLE | t1 | system | NULL | NULL | NULL | NULL | 1 | | | 1 | SIMPLE | t2 | range | PRIMARY | PRIMARY | 1 | NULL | 1 | Using where; Using index | +------+-------------+-------+--------+---------------+---------+---------+------+------+--------------------------+ I reproduced the problem in the current 10.4 and 10.1. All other versions most probably are also affected.
            igor Igor Babaev (Inactive) made changes -
            Resolution Fixed [ 1 ]
            Status Closed [ 6 ] Stalled [ 10000 ]
            igor Igor Babaev (Inactive) made changes -
            Fix Version/s 10.1 [ 16100 ]
            Fix Version/s 10.0.15 [ 17300 ]
            igor Igor Babaev (Inactive) made changes -
            Affects Version/s 10.5.1 [ 24029 ]
            Affects Version/s 10.2.29 [ 23911 ]
            Affects Version/s 10.4.12 [ 24019 ]
            Affects Version/s 10.3.22 [ 24018 ]
            Affects Version/s 10.1.44 [ 23912 ]
            Affects Version/s 10.0.14 [ 17101 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            Fix Version/s 10.5 [ 23123 ]
            Fix Version/s 10.1 [ 16100 ]
            serg Sergei Golubchik made changes -
            Assignee Alexander Barkov [ bar ] Michael Widenius [ monty ]
            serg Sergei Golubchik made changes -
            Assignee Michael Widenius [ monty ] Alexander Barkov [ bar ]
            serg Sergei Golubchik made changes -
            Fix Version/s 10.0.15 [ 17300 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            Fix Version/s 10.5 [ 23123 ]
            serg Sergei Golubchik made changes -
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]

            closed again. new bug is moved to MDEV-26129

            serg Sergei Golubchik added a comment - closed again. new bug is moved to MDEV-26129
            serg Sergei Golubchik made changes -
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 65322 ] MariaDB v4 [ 148379 ]

            People

              bar Alexander Barkov
              bar Alexander Barkov
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.