[MCOL-3400]  Substring text comparison is incorrect result Created: 2019-07-11  Updated: 2019-08-22  Resolved: 2019-08-22

Status: Closed
Project: MariaDB ColumnStore
Component/s: ?
Affects Version/s: 1.2.3
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Hyun Young Hun Assignee: David Hall (Inactive)
Resolution: Not a Bug Votes: 0
Labels: None
Environment:

Centos6


Issue Links:
Blocks
is blocked by MCOL-3419 Changing systemlang corrupts extentma... Closed
Sprint: 2019-06

 Description   

Hi. I am using MariaDB ColumnStore and I have a problem with text comparison.

The InnoDB tables use utf8mb4 as their character set because of emoticons,
and the ColumnStore tables are set to use utf8.

So the my.cnf file is set as below.

[client]
default-character-set = utf8mb4

[mysqld]
skip-character-set-client-handshake
collation-server = utf8mb4_unicode_ci
character-set-server = utf8mb4

I also changed the SystemLang setting in the Columnstore.xml file:
<SystemLang>en_US.utf8</SystemLang>

Then I created the tables and ran a test.

------ create table
create table test.test1 (
seq int,
name varchar(20)
) engine=innodb default charset=utf8;

create table test.test2 (
seq int
) engine=columnstore default charset=utf8;

insert into test.test1 values(1,'사과나무');
insert into test.test1 values(1,'포도나무');
insert into test.test2 values(1);

----- a single-table select returns the correct result

select seq,name
from test.test1
where 1=1
and substr(name,1,2) = '포도';

result)
seq name
1 포도나무

----- but a join between the InnoDB table and the ColumnStore table returns an incorrect result
select T1.seq,name
from test.test1 T1
, test.test2 T2
where 1=1
and T1.seq = T2.seq
and substr(name,1,2) = '포도';

result)
seq name
1 사과나무
1 포도나무

I want to search for '포도' (grapes), but the result also includes '사과' (apple).

Is there any way I can solve this?



 Comments   
Comment by David Hall (Inactive) [ 2019-08-13 ]

Try changing SystemLang in the Columnstore.xml:

<SystemConfig>
<SystemLang>ko_KR.UTF-8</SystemLang>

Please post back if this works or not. It is working in 1.2.4.

Comment by Hyun Young Hun [ 2019-08-20 ]

I changed SystemLang to "ko_KR.utf8" instead of "ko_KR.UTF-8", and the query result is now correct!

Comment by David Hall (Inactive) [ 2019-08-22 ]

It appears that not every operating system uses the same language strings. If the correct string for a given operating system isn't used, then collation is compromised.
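
One way to find the spelling a given host expects is to compare the desired locale against the names the operating system actually reports. The following is a minimal sketch, not part of ColumnStore: match_locale is a hypothetical helper name, and it assumes the only differences are letter case and the hyphen in the codeset ("UTF-8" versus "utf8"), as in this report. It reads a locale list, one name per line, in the form printed by the standard `locale -a` command.

```shell
# Hypothetical helper: print every locale name from stdin that matches the
# requested one after lowercasing and removing hyphens.
match_locale() {
    # Normalize the requested name, e.g. "ko_KR.UTF-8" -> "ko_kr.utf8".
    want=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '-')
    while IFS= read -r name; do
        # Normalize each candidate the same way before comparing.
        norm=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]' | tr -d '-')
        if [ "$norm" = "$want" ]; then
            # Print the name exactly as the operating system spells it.
            printf '%s\n' "$name"
        fi
    done
}
```

On the ColumnStore host this could be run as `locale -a | match_locale ko_KR.UTF-8`. Whatever name it prints is the spelling the operating system uses and is a reasonable candidate for the SystemLang value. If nothing is printed, the locale is probably not installed on that host.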
