Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
5.6.2, 6.2.1
-
None
-
mariadb/columnstore Docker container
-
2021-16
Description
I currently write some queries during migration to ColumnStore from another Database and stumbled upon a probable bug concerning collation.
We have columns where we use WHERE clauses where the matching has to be case sensitive. Therefore I created the database with a binary collation as default settings.
When filtering with a WHERE clause using the LIKE operator, the collation settings seem to be used. When using with IN this seem to be not the case.
Expected result: IN operator should also be case sensitive when set in the database settings.
MariaDB [dac]> CREATE DATABASE test CHARACTER SET utf8mb4 COLLATE utf8mb4_bin; |
Query OK, 1 row affected (0.028 sec)
|
|
MariaDB [dac]> USE test; |
Database changed |
MariaDB [test]> DROP TABLE IF EXISTS collation_test; |
Query OK, 0 rows affected, 1 warning (0.325 sec) |
|
MariaDB [test]>
|
MariaDB [test]> CREATE TABLE collation_test ( |
-> LETTERS TEXT NULL |
-> )
|
-> ENGINE=ColumnStore;
|
Query OK, 0 rows affected (0.878 sec) |
|
MariaDB [test]>
|
MariaDB [test]> INSERT INTO collation_test (LETTERS) VALUES ('a'), ('A'), ('b'), ('B'); |
Query OK, 4 rows affected (0.827 sec) |
Records: 4 Duplicates: 0 Warnings: 0
|
|
MariaDB [test]>
|
MariaDB [test]> SELECT * FROM collation_test; |
+---------+ |
| LETTERS |
|
+---------+ |
| a |
|
| A |
|
| b |
|
| B |
|
+---------+ |
4 rows in set (0.396 sec) |
|
MariaDB [test]>
|
MariaDB [test]> SELECT * FROM collation_test WHERE LETTERS LIKE 'a'; |
+---------+ |
| LETTERS |
|
+---------+ |
| a |
|
+---------+ |
1 row in set (0.029 sec) |
|
MariaDB [test]> SELECT * FROM collation_test WHERE LETTERS IN ('a'); |
+---------+ |
| LETTERS |
|
+---------+ |
| a |
|
| A |
|
+---------+ |
2 rows in set (0.047 sec) |
|
MariaDB [test]>
|
Setting the collation explicitly does not help either
MariaDB [test]> SELECT * FROM collation_test WHERE LETTERS IN ('a' COLLATE utf8mb4_bin); |
ERROR 1253 (42000): COLLATION 'utf8mb4_bin' is not valid for CHARACTER SET 'latin1' |
The setting is though set to utf8mb4_bin. (checked via SHOW TABLE STATUS LIKE 'collation_test'.
Attachments
Activity
Field | Original Value | New Value |
---|---|---|
Description |
I currently write some queries during migration to ColumnStore from another Database and stumbled upon a probable bug concerning collation.
We have columns where we use WHERE clauses where the matching has to be case sensitive. Therefore I created the database with a binary collation as default settings. When filtering with a WHERE clause using the LIKE operator, the collation settings seem to be used. When using with IN this seem to be not the case. Expected result: IN operator should also be case sensitive when set in the database settings. {{MariaDB [dac]> CREATE DATABASE test CHARACTER SET utf8mb4 COLLATE utf8mb4_bin; Query OK, 1 row affected (0.028 sec) MariaDB [dac]> USE test; Database changed MariaDB [test]> DROP TABLE IF EXISTS collation_test; Query OK, 0 rows affected, 1 warning (0.325 sec) MariaDB [test]> MariaDB [test]> CREATE TABLE collation_test ( -> LETTERS TEXT NULL -> ) -> ENGINE=ColumnStore; Query OK, 0 rows affected (0.878 sec) MariaDB [test]> MariaDB [test]> INSERT INTO collation_test (LETTERS) VALUES ('a'), ('A'), ('b'), ('B'); Query OK, 4 rows affected (0.827 sec) Records: 4 Duplicates: 0 Warnings: 0 MariaDB [test]> MariaDB [test]> SELECT * FROM collation_test; +---------+ | LETTERS | +---------+ | a | | A | | b | | B | +---------+ 4 rows in set (0.396 sec) MariaDB [test]> MariaDB [test]> SELECT * FROM collation_test WHERE LETTERS LIKE 'a'; +---------+ | LETTERS | +---------+ | a | +---------+ 1 row in set (0.029 sec) MariaDB [test]> SELECT * FROM collation_test WHERE LETTERS IN ('a'); +---------+ | LETTERS | +---------+ | a | | A | +---------+ 2 rows in set (0.047 sec) MariaDB [test]>}} Setting the collation explicitly does not help either {{ MariaDB [test]> SELECT * FROM collation_test WHERE LETTERS IN ('a' COLLATE utf8mb4_bin); ERROR 1253 (42000): COLLATION 'utf8mb4_bin' is not valid for CHARACTER SET 'latin1' }} The setting is though set to utf8mb4_bin. (checked via {{SHOW TABLE STATUS LIKE 'collation_test'}}. |
I currently write some queries during migration to ColumnStore from another Database and stumbled upon a probable bug concerning collation.
We have columns where we use WHERE clauses where the matching has to be case sensitive. Therefore I created the database with a binary collation as default settings. When filtering with a WHERE clause using the LIKE operator, the collation settings seem to be used. When using with IN this seem to be not the case. Expected result: IN operator should also be case sensitive when set in the database settings. {code:sql} MariaDB [dac]> CREATE DATABASE test CHARACTER SET utf8mb4 COLLATE utf8mb4_bin; Query OK, 1 row affected (0.028 sec) MariaDB [dac]> USE test; Database changed MariaDB [test]> DROP TABLE IF EXISTS collation_test; Query OK, 0 rows affected, 1 warning (0.325 sec) MariaDB [test]> MariaDB [test]> CREATE TABLE collation_test ( -> LETTERS TEXT NULL -> ) -> ENGINE=ColumnStore; Query OK, 0 rows affected (0.878 sec) MariaDB [test]> MariaDB [test]> INSERT INTO collation_test (LETTERS) VALUES ('a'), ('A'), ('b'), ('B'); Query OK, 4 rows affected (0.827 sec) Records: 4 Duplicates: 0 Warnings: 0 MariaDB [test]> MariaDB [test]> SELECT * FROM collation_test; +---------+ | LETTERS | +---------+ | a | | A | | b | | B | +---------+ 4 rows in set (0.396 sec) MariaDB [test]> MariaDB [test]> SELECT * FROM collation_test WHERE LETTERS LIKE 'a'; +---------+ | LETTERS | +---------+ | a | +---------+ 1 row in set (0.029 sec) MariaDB [test]> SELECT * FROM collation_test WHERE LETTERS IN ('a'); +---------+ | LETTERS | +---------+ | a | | A | +---------+ 2 rows in set (0.047 sec) MariaDB [test]> {code} Setting the collation explicitly does not help either {code:sql} MariaDB [test]> SELECT * FROM collation_test WHERE LETTERS IN ('a' COLLATE utf8mb4_bin); ERROR 1253 (42000): COLLATION 'utf8mb4_bin' is not valid for CHARACTER SET 'latin1' {code} The setting is though set to utf8mb4_bin. (checked via {{SHOW TABLE STATUS LIKE 'collation_test'}}. |
Rank | Ranked higher |
Rank | Ranked higher |
Fix Version/s | Icebox [ 22302 ] |
Assignee | Roman [ drrtuy ] |
Affects Version/s | 6.2.1 [ 26035 ] |
Fix Version/s | 6.2.3 [ 27102 ] | |
Fix Version/s | Icebox [ 22302 ] |
Sprint | 2021-16 [ 598 ] |
Rank | Ranked lower |
Status | Open [ 1 ] | In Progress [ 3 ] |
Summary | Collation setting of database is used for LIKE operation but not IN operation | IN operation doesn't follow COLLATE |
Assignee | Roman [ drrtuy ] | Denis Khalikov [ JIRAUSER48434 ] |
Status | In Progress [ 3 ] | In Review [ 10002 ] |
Assignee | Denis Khalikov [ JIRAUSER48434 ] | Gagan Goel [ tntnatbry ] |
Status | In Review [ 10002 ] | In Testing [ 10301 ] |
Assignee | Gagan Goel [ tntnatbry ] | Daniel Lee [ dleeyh ] |
Resolution | Fixed [ 1 ] | |
Status | In Testing [ 10301 ] | Closed [ 6 ] |
Zendesk Related Tickets | 123906 |
This is how it works for InnoDB:
MariaDB [test]> CREATE TABLE `collation_test` (
-> `LETTERS` text COLLATE utf8mb4_bin DEFAULT NULL
-> ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
Query OK, 0 rows affected (1.051 sec)
MariaDB [test]> insert into collation_test values ('a'), ('A');
Query OK, 2 rows affected (0.267 sec)
Records: 2 Duplicates: 0 Warnings: 0
MariaDB [test]> select * from collation_test where LETTERS like 'a';
+---------+
| LETTERS |
+---------+
| a |
+---------+
1 row in set (0.032 sec)
MariaDB [test]> select * from collation_test where LETTERS = 'a';
+---------+
| LETTERS |
+---------+
| a |
+---------+
1 row in set (0.051 sec)
MariaDB [test]> select * from collation_test where LETTERS in ('a');
+---------+
| LETTERS |
+---------+
| a |
+---------+
1 row in set (0.002 sec)
As we have a difference in behavior here for Columnstore and I can not find any obvious explanation or logic (LIKE vs IN behavior) I'd agree that this is a bug.