[MCOL-2221] reverse function not work properly with non-latin chars Created: 2019-03-08  Updated: 2020-06-23  Resolved: 2020-06-23

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: 1.2.2
Fix Version/s: 1.5.2

Type: Bug Priority: Major
Reporter: Richard Stracke Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 0
Labels: beginner-friendly

Issue Links:
Relates
relates to MCOL-2000 varchar specified sizing is not in ch... Closed
relates to MCOL-3536 Order by with UTF Closed
Epic Link: ColumnStore Compatibility Improvements
Sprint: 2020-5, 2020-6, 2020-7

 Description   

The ColumnStore server is configured for UTF-8 as described here:
https://mariadb.com/kb/en/library/mariadb-columnstore-system-usage/#configuring-to-use-utf-8-character-sets

To reproduce:

 drop table  IF EXISTS rev;
 create table IF NOT EXISTS  rev (c1 varchar(8), c2 char(8), c3 varchar(32)) ENGINE=ColumnStore  DEFAULT CHARSET=UTF8 ;
INSERT INTO rev (c1,c2,c3) VALUES('カキクケコ','カキクケコ','カキクケコ');
INSERT INTO rev (c1,c2,c3) VALUES('ABCD','ABCD','ABCD');
INSERT INTO rev (c1,c2,c3) VALUES('45678','45678','45678');
INSERT INTO rev (c1,c2,c3) VALUES('45678','45678','45678');
select c1,reverse(c1),reverse(c2),reverse(c3) from rev;

Result is:

MariaDB [rtest]> select c1,reverse(c1),reverse(c2),reverse(c3) from rev;
+----------+-------------+-------------+-----------------+
| c1       | reverse(c1) | reverse(c2) | reverse(c3)     |
+----------+-------------+-------------+-----------------+
| カキ??   | ?㭂㫂?      | ?㭂㫂?      | ??㱂㯂㭂㫂?     |
| ABCD     | DCBA        | DCBA        | DCBA            |
| 45??   | ??        | ??        | ???         |
| 45678    | 87654       | 87654       | 87654           |
+----------+-------------+-------------+-----------------+

The expected behaviour matches the result in InnoDB:

MariaDB [rtest]> select c1,reverse(c1),reverse(c2),reverse(c3) from rev;
+-----------------+-----------------+-----------------+-----------------+
| c1              | reverse(c1)     | reverse(c2)     | reverse(c3)     |
+-----------------+-----------------+-----------------+-----------------+
| カキクケコ      | コケクキカ      | コケクキカ      | コケクキカ      |
| ABCD            | DCBA            | DCBA            | DCBA            |
| 45678      | 87654      | 87654      | 87654      |
| 45678           | 87654           | 87654           | 87654           |
+-----------------+-----------------+-----------------+-----------------+
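
The ticket does not state the root cause, but the garbled reverse(c3) value in the ColumnStore result is consistent with the UTF-8 bytes being reversed instead of the characters. The following is a minimal Python sketch of that hypothesis, not ColumnStore code; the helper names reverse_bytes and reverse_chars are made up for illustration.

```python
def reverse_bytes(text: str) -> str:
    """Reverse the raw UTF-8 bytes, then decode.

    Invalid sequences become U+FFFD, which many clients display as '?'.
    This models the suspected buggy behaviour (an assumption).
    """
    return text.encode("utf-8")[::-1].decode("utf-8", errors="replace")


def reverse_chars(text: str) -> str:
    """Reverse by code point, which is what the InnoDB output shows."""
    return text[::-1]


if __name__ == "__main__":
    for value in ("カキクケコ", "ABCD", "45678"):
        print(repr(value), repr(reverse_bytes(value)), repr(reverse_chars(value)))
```

For pure ASCII input both helpers agree, which would explain why the ABCD and 45678 rows look correct. For カキクケコ the byte-wise variant yields two replacement characters, then U+3C42, U+3BC2, U+3B42, U+3AC2, then one more replacement character, the same shape as the reverse(c3) cell in the ColumnStore result.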



 Comments   
Comment by Rahul Anand [ 2020-02-23 ]

Hi!
I am new to MariaDB and would like to contribute, so I'll try to fix this bug; it should help me understand the codebase better.
Thanks

Comment by Roman [ 2020-03-30 ]

Greetings Rahul. Sorry for the late response. Feel free to contact me if you have any questions on this issue.

Comment by David Hall (Inactive) [ 2020-04-10 ]

This patch needs to be merged into 1.4 and the develop branch.

Comment by Steven Huang [ 2020-04-13 ]

Hi David and Roman,

I was only able to build ColumnStore develop-1.2 while working on this ticket; the other builds kept giving compile errors no matter which engine version I was running. Do you have any pointers for this? I am running Ubuntu 18.04, if that's relevant.

Comment by Roman [ 2020-04-13 ]

Hey Steven,

Here are the steps to quickly build the current develop branch with vanilla MDB 10.5:

1) Clone the repo [1] and fix the paths in cs-docker-tools/shells/run_cs_with_oam_skiped <https://github.com/drrtuy/cs-docker-tools/blob/master/shells/run_cs_with_oam_skiped> [2]
2) export SKIP_OAM_INIT=1 in the terminal where you are going to run the next script
3) Run cs-docker-tools/shells/run_cs_with_oam_skiped bionic
4) Follow the script's suggestions

When the script finishes, MDB + MCS will be running on the system. You should start/stop MCS using cs-docker-tools/shells/stopcs | startcs

1. https://github.com/drrtuy/cs-docker-tools.git
2. You should set the MDB branch to mariadb-10.5.2

Regards,
Roman Nozdrin
ColumnStore Engineering
MariaDB Corporation


Comment by David Hall (Inactive) [ 2020-06-19 ]

This is fixed in 1.5.2, which ships with MariaDB 10.5 on June 24 2020.
The reverse portion is fixed. The "?" characters you see in the input columns c1 and c2 appear because the field size in ColumnStore is defined in bytes, not characters. The byte stream is unceremoniously cut at 8 bytes, leaving an incomplete (undefined) byte sequence at the tail. See MCOL-2000, which will not be fixed in this release.
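
To make the 8-byte cut concrete, here is a small Python sketch of what happens to カキクケコ when it is stored in a field that holds 8 bytes. It only models the truncation described above; truncate_bytes is a hypothetical helper, not a ColumnStore function.

```python
def truncate_bytes(text: str, width: int) -> bytes:
    """Encode as UTF-8 and keep only the first `width` bytes."""
    return text.encode("utf-8")[:width]


full = "カキクケコ".encode("utf-8")      # each katakana character takes 3 bytes
stored = truncate_bytes("カキクケコ", 8)  # what an 8-byte field can hold

complete = stored[:6].decode("utf-8")    # two whole 3-byte characters
dangling = stored[6:]                    # leading 2 bytes of the third character

if __name__ == "__main__":
    print(len(full), len(stored), complete, dangling.hex())
```

The five characters need 15 bytes, so only カ and キ survive intact. The remaining two bytes are the start of ク without its final byte, and they cannot be decoded as a character on their own.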

Comment by David Hall (Inactive) [ 2020-06-22 ]

QA: You may already be testing this as part of MCOL-3536. The test above is a nice data set to try with the various functions in MCOL-3536. Note that c1 varchar(8) and c2 char(8) are pretty useless here because of MCOL-2000, but the data could still be used for other tests.

Comment by Daniel Lee (Inactive) [ 2020-06-23 ]

Build verified: 1.5.2-1 (community edition b33685)

The create table statement in the description needs to be modified to use varchar(16) and char(16) for this test, because the field size is in bytes, not characters. This issue is being tracked in MCOL-2000.
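
As a quick sanity check on the 16-byte sizing, the sketch below measures the UTF-8 byte length of the test values from the description. It is illustrative Python only; ROWS and fits are names invented for this example.

```python
# Test values taken from the INSERT statements in the description.
ROWS = ["カキクケコ", "ABCD", "45678"]


def fits(value: str, byte_width: int) -> bool:
    """Return True if the UTF-8 encoding of `value` fits in `byte_width` bytes."""
    return len(value.encode("utf-8")) <= byte_width


if __name__ == "__main__":
    for width in (8, 16):
        print(width, {value: fits(value, width) for value in ROWS})
```

With a width of 8 the katakana row does not fit, while a width of 16 covers all three values.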

Generated at Thu Feb 08 02:34:39 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.