[MDEV-20884] Different resultset when using prefix index and utf8mb4 Created: 2019-10-23  Updated: 2019-11-05

Status: Confirmed
Project: MariaDB Server
Component/s: Character Sets, Server
Affects Version/s: 5.5, 10.1, 10.4.4, 10.2, 10.3
Fix Version/s: 10.4

Type: Bug Priority: Minor
Reporter: Joffrey MICHAIE Assignee: Alexander Barkov
Resolution: Unresolved Votes: 3
Labels: upstream
Environment:

Linux



 Description   

Hola MariaDB Team!

I'm not sure if it's a bug or a feature.

With a column in utf8mb4, and a prefix on it, the result set is not the same depending on which index is used.
Once doing some tests, I realized that the collation may change depending if we use a prefix index or not.

Sample test:

 
# cat testcase_utf8mb4.sql
SET NAMES utf8mb4;
DROP DATABASE if EXISTS db_utf8mb4;
CREATE DATABASE db_utf8mb4 CHARSET utf8mb4;
USE db_utf8mb4;
CREATE TABLE tbl_utf8mb4 (id serial, path varchar(255), KEY(path), KEY path_prefix (path(200))) ENGINE=InnoDB ROW_FORMAT=DYNAMIC charset utf8mb4 collate utf8mb4_unicode_ci;
INSERT INTO tbl_utf8mb4 (path) VALUES (CONCAT('/',CHAR(4036994689)));
SELECT * FROM tbl_utf8mb4 WHERE path LIKE '/%' ;
SELECT * FROM tbl_utf8mb4 FORCE INDEX (path) WHERE path LIKE '/%' ;
SELECT * FROM tbl_utf8mb4 FORCE INDEX (path_prefix) WHERE path LIKE '/%' ;
SELECT * FROM tbl_utf8mb4 FORCE INDEX (path_prefix) WHERE path LIKE '/%' COLLATE utf8mb4_general_ci;

Output (emoji replaced with plaintext as Jira doesn't seem to support it)

# mysql  -v < testcase_utf8mb4.sql
--------------
SET NAMES utf8mb4
--------------
 
--------------
DROP DATABASE if EXISTS db_utf8mb4
--------------
 
--------------
CREATE DATABASE db_utf8mb4 CHARSET utf8mb4
--------------
 
--------------
CREATE TABLE tbl_utf8mb4 (id serial, path varchar(255), KEY(path), KEY path_prefix (path(200))) ENGINE=InnoDB ROW_FORMAT=DYNAMIC charset utf8mb4 collate utf8mb4_unicode_ci
--------------
 
--------------
INSERT INTO tbl_utf8mb4 (path) VALUES (CONCAT('/',CHAR(4036994689)))
--------------
 
--------------
SELECT * FROM tbl_utf8mb4 WHERE path LIKE '/%'
--------------
 
id	path
1	/(LION)
--------------
SELECT * FROM tbl_utf8mb4 FORCE INDEX (path) WHERE path LIKE '/%'
--------------
 
id	path
1	/(LION)
--------------
SELECT * FROM tbl_utf8mb4 FORCE INDEX (path_prefix) WHERE path LIKE '/%'
--------------
 
--------------
SELECT * FROM tbl_utf8mb4 FORCE INDEX (path_prefix) WHERE path LIKE '/%' COLLATE utf8mb4_general_ci
--------------
 
id	path
1	/(LION)

I could reproduce the same between MySQL 5.5 and MariaDB 10.4 (Also tested 10.3 and 10.2)

Thanks for checking!
Joffrey



 Comments   
Comment by Elena Stepanova [ 2019-11-05 ]

Thanks for the report. Reproducible as described on MariaDB 5.5-10.5, MySQL 5.7.25, 8.0.13 (didn't try other versions).

Generated at Thu Feb 08 09:02:55 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.