Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
5.6.2, 6.2.1
-
None
-
mariadb/columnstore Docker container
-
2021-16
Description
I currently write some queries during migration to ColumnStore from another Database and stumbled upon a probable bug concerning collation.
We have columns where we use WHERE clauses where the matching has to be case sensitive. Therefore I created the database with a binary collation as default settings.
When filtering with a WHERE clause using the LIKE operator, the collation settings seem to be used. When using with IN this seem to be not the case.
Expected result: IN operator should also be case sensitive when set in the database settings.
MariaDB [dac]> CREATE DATABASE test CHARACTER SET utf8mb4 COLLATE utf8mb4_bin; |
Query OK, 1 row affected (0.028 sec)
|
|
MariaDB [dac]> USE test; |
Database changed |
MariaDB [test]> DROP TABLE IF EXISTS collation_test; |
Query OK, 0 rows affected, 1 warning (0.325 sec) |
|
MariaDB [test]>
|
MariaDB [test]> CREATE TABLE collation_test ( |
-> LETTERS TEXT NULL |
-> )
|
-> ENGINE=ColumnStore;
|
Query OK, 0 rows affected (0.878 sec) |
|
MariaDB [test]>
|
MariaDB [test]> INSERT INTO collation_test (LETTERS) VALUES ('a'), ('A'), ('b'), ('B'); |
Query OK, 4 rows affected (0.827 sec) |
Records: 4 Duplicates: 0 Warnings: 0
|
|
MariaDB [test]>
|
MariaDB [test]> SELECT * FROM collation_test; |
+---------+ |
| LETTERS |
|
+---------+ |
| a |
|
| A |
|
| b |
|
| B |
|
+---------+ |
4 rows in set (0.396 sec) |
|
MariaDB [test]>
|
MariaDB [test]> SELECT * FROM collation_test WHERE LETTERS LIKE 'a'; |
+---------+ |
| LETTERS |
|
+---------+ |
| a |
|
+---------+ |
1 row in set (0.029 sec) |
|
MariaDB [test]> SELECT * FROM collation_test WHERE LETTERS IN ('a'); |
+---------+ |
| LETTERS |
|
+---------+ |
| a |
|
| A |
|
+---------+ |
2 rows in set (0.047 sec) |
|
MariaDB [test]>
|
Setting the collation explicitly does not help either
MariaDB [test]> SELECT * FROM collation_test WHERE LETTERS IN ('a' COLLATE utf8mb4_bin); |
ERROR 1253 (42000): COLLATION 'utf8mb4_bin' is not valid for CHARACTER SET 'latin1' |
The setting is though set to utf8mb4_bin. (checked via SHOW TABLE STATUS LIKE 'collation_test'.
This is how it works for InnoDB:
MariaDB [test]> CREATE TABLE `collation_test` (
-> `LETTERS` text COLLATE utf8mb4_bin DEFAULT NULL
-> ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
Query OK, 0 rows affected (1.051 sec)
MariaDB [test]> insert into collation_test values ('a'), ('A');
Query OK, 2 rows affected (0.267 sec)
Records: 2 Duplicates: 0 Warnings: 0
MariaDB [test]> select * from collation_test where LETTERS like 'a';
+---------+
| LETTERS |
+---------+
| a |
+---------+
1 row in set (0.032 sec)
MariaDB [test]> select * from collation_test where LETTERS = 'a';
+---------+
| LETTERS |
+---------+
| a |
+---------+
1 row in set (0.051 sec)
MariaDB [test]> select * from collation_test where LETTERS in ('a');
+---------+
| LETTERS |
+---------+
| a |
+---------+
1 row in set (0.002 sec)
As we have a difference in behavior here for Columnstore and I can not find any obvious explanation or logic (LIKE vs IN behavior) I'd agree that this is a bug.