[MCOL-4005] mishandling multi-byte chars in DML export to cpimport Created: 2020-05-15  Updated: 2021-04-19  Resolved: 2020-05-26

Status: Closed
Project: MariaDB ColumnStore
Component/s: MDB Plugin
Affects Version/s: 1.4.3
Fix Version/s: 1.4.4

Type: Bug Priority: Blocker
Reporter: Patrick LeBlanc (Inactive) Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Problem/Incident
causes MCOL-4364 LOAD DATA crashes mariadb process Closed
Sprint: 2020-7

 Description   

While investigating a crash in Sky, I found a large problem in our code that exports data to cpimport (LDI). In many places we assume chars are 1 byte and mix up byte counts with character counts.

The function is ha_mcs_impl_write_batch_row_() in ha_mcs_dml.cpp. It'll be obvious when you look at it.
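To illustrate the byte count versus character count mix-up described above, here is a minimal C++ sketch. It is not taken from ha_mcs_dml.cpp; the helper names utf8CharCount and maxBytesForColumn are hypothetical and exist only for this example. It assumes the input is valid UTF-8 and that the table's utf8 character set uses at most 3 bytes per character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the ColumnStore source): counts UTF-8
// characters by counting every byte that is NOT a continuation byte.
// Continuation bytes have the bit pattern 10xxxxxx.
// Assumes the input is valid UTF-8.
std::size_t utf8CharCount(const std::string& s)
{
    std::size_t count = 0;
    for (unsigned char c : s)
    {
        if ((c & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

// Hypothetical helper: upper bound on the bytes needed for a column
// declared with maxChars characters, given the character set's maximum
// bytes per character (assumed to be 3 for MariaDB's utf8).
std::size_t maxBytesForColumn(std::size_t maxChars, std::size_t maxBytesPerChar)
{
    return maxChars * maxBytesPerChar;
}
```

Under these assumptions, std::string::size() returns the byte length, which equals the character count only for single-byte data. A buffer sized for 255 one-byte characters can be too small for a varchar(255) column in utf8, which may need up to maxBytesForColumn(255, 3) bytes.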



 Comments   
Comment by Gagan Goel (Inactive) [ 2020-05-22 ]

For QA: With this issue, we are fixing a crash that occurs when performing an LDI into a table that has the character set utf8 at the table level. Here are the steps to reproduce:

MariaDB [test]> create table dummy (a varchar(255), b text(255), c varchar(255)) default character set utf8 engine=columnstore;
Query OK, 0 rows affected (0.308 sec)
 
MariaDB [test]> load data local infile './mcol4005.txt' into table dummy fields enclosed by '"' terminated by ',';
ERROR 2013 (HY000): Lost connection to MySQL server during query
MariaDB [test]> 

Here, mcol4005.txt contains the following:

tntnatbry@tntnatbry:~/git-projects/server/storage/columnstore$ cat mcol4005.txt
"field1","\\","field3"
"field1",\N,"field3"
"field1","field2","field3"

In addition, if we remove the character set property from the table creation, the LDI works, but the "\" in the data file is not inserted properly. We are also fixing this. Here are the steps to reproduce:

MariaDB [test]> create table dummy (a varchar(255), b text(255), c varchar(255)) engine=columnstore;
Query OK, 0 rows affected (0.273 sec)
 
MariaDB [test]> load data local infile './mcol4005.txt' into table dummy fields enclosed by '"' terminated by ',';
Query OK, 3 rows affected (1.321 sec)
Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
 
MariaDB [test]> select * from dummy;
+--------+--------+--------+
| a      | b      | c      |
+--------+--------+--------+
| field1 |      | NULL   |
| field1 | NULL   | field3 |
| field1 | field2 | field3 |
+--------+--------+--------+
3 rows in set (0.090 sec)
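For context on the first row: with LOAD DATA's default escape character (backslash), the field "\\" in the data file stands for a single backslash, so column b of row 1 should hold one "\" and column c should hold field3. The C++ sketch below shows one way a field could be escaped when written to a delimited line so that a backslash survives the round trip. It is an illustration only, not the actual fix; the helper name escapeField is hypothetical, and the escaping rules that cpimport expects are an assumption here.

```cpp
#include <string>

// Hypothetical helper (not from the ColumnStore source): wraps a field
// value in the enclosing character and prefixes every backslash and
// every embedded enclosing character with a backslash.
// Assumes backslash is the escape character of the output format.
std::string escapeField(const std::string& value, char enclose)
{
    std::string out;
    out.reserve(value.size() + 2);
    out.push_back(enclose);
    for (char c : value)
    {
        if (c == '\\' || c == enclose)
            out.push_back('\\');
        out.push_back(c);
    }
    out.push_back(enclose);
    return out;
}
```

With this sketch, a value consisting of one backslash is written as "\\" when the enclosing character is a double quote, matching the form used in mcol4005.txt.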

Comment by Daniel Lee (Inactive) [ 2020-05-26 ]

Build verified: 1.4.4-1 (Jenkins 20200522)

Tested on all 3 modes of columnstore_use_import_for_batchinsert.
