[MCOL-4103] Columnstore LDI doesn't support charsets, particularly utf8mb4 Created: 2020-06-24 Updated: 2020-12-14 Resolved: 2020-12-14 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | MDB Plugin |
| Affects Version/s: | 1.5.2 |
| Fix Version/s: | 5.5.1 |
| Type: | Bug | Priority: | Critical |
| Reporter: | David Hall (Inactive) | Assignee: | David Hall (Inactive) |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None | ||
| Issue Links: |
|
||||||||||||||||
| Description |
|
create table chineseCS (english varchar(20), chinese varchar(20)) engine=innodb DEFAULT CHARACTER SET utf8mb4 collate utf8mb4_unicode_520_ci engine=columnstore;
load data infile "/home/calpont/t.tbl" into table chineseCS character set utf8mb4 fields terminated by "|";
select * from chineseCS;
----------
----------
But using traditional insert, it works:
truncate table chineseCS;
select * from chineseCS;
----------
---------- |
| Comments |
| Comment by David Hall (Inactive) [ 2020-08-18 ] | |||||||||||||||||||||||||||||||||||||||||||||||
|
We need to get cpimport to support this before LDI can. | |||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Todd Stoffel (Inactive) [ 2020-08-25 ] | |||||||||||||||||||||||||||||||||||||||||||||||
|
Officially MariaDB Columnstore only supports UTF8. ALL of our documentation and requirement guides mention this. This is more an enhancement than a bug.
| |||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Justin Swanhart [ 2020-08-25 ] | |||||||||||||||||||||||||||||||||||||||||||||||
|
Well, if ColumnStore doesn't support anything other than UTF-8, it should not be possible to create tables with other character sets. And if I have InnoDB data that I want to use with HTAP, and the InnoDB data has other character sets, how is that supposed to work? If an InnoDB table has a TINYINT UNSIGNED column with values of 255, HTAP won't work either. So I think there is a real fundamental problem here. | |||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Todd Stoffel (Inactive) [ 2020-08-25 ] | |||||||||||||||||||||||||||||||||||||||||||||||
|
greenlion Please provide example source and target tables and we'll be happy to address your concerns. | |||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Justin Swanhart [ 2020-08-25 ] | |||||||||||||||||||||||||||||||||||||||||||||||
|
Note, MariaDB 10.5 has a default utf8mb4 character set. All I did was create the table and it used the default character set, a character set that is apparently not supported. That seems less than ideal to me, and it violates the principle of least surprise. | |||||||||||||||||||||||||||||||||||||||||||||||
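The utf8 versus utf8mb4 distinction in this thread can be made concrete with a small sketch. It is written in Python purely for illustration and assumes that "UTF8" in the earlier comment means MariaDB's three-byte `utf8` (utf8mb3) character set; the function names are hypothetical and are not part of ColumnStore or cpimport.

```python
def utf8_byte_lengths(text):
    """Return the UTF-8 encoded length in bytes of each character in text."""
    return [len(ch.encode("utf-8")) for ch in text]


def fits_utf8mb3(text):
    """True if every character needs at most 3 bytes in UTF-8.

    Assumption: a three-byte utf8 (utf8mb3) column can only hold such
    characters, while utf8mb4 also accepts 4-byte characters.
    """
    return all(length <= 3 for length in utf8_byte_lengths(text))


if __name__ == "__main__":
    common_cjk = "中文"            # Basic Multilingual Plane, 3 bytes each
    rare_cjk = "\U0002070E"        # supplementary plane, 4 bytes
    print(utf8_byte_lengths(common_cjk), fits_utf8mb3(common_cjk))
    print(utf8_byte_lengths(rare_cjk), fits_utf8mb3(rare_cjk))
```

Common Chinese characters fit in three bytes, so they are not by themselves evidence that four-byte support is required; only supplementary-plane characters are.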
| Comment by Justin Swanhart [ 2020-08-25 ] | |||||||||||||||||||||||||||||||||||||||||||||||
|
Create a table with TINYINT UNSIGNED:
Replicate from InnoDB -> ColumnStore for HTAP using rewrite rules. The ColumnStore table should be:
insert into oltp.innodb_table values (255);
The values won't be able to be inserted into ColumnStore. 255 is out of range for TINYINT UNSIGNED on ColumnStore, but is valid for InnoDB.
MariaDB [test2]> insert into olap.innodb_table values(255); | |||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Justin Swanhart [ 2020-08-25 ] | |||||||||||||||||||||||||||||||||||||||||||||||
|
Note that yes, one could change TINYINT UNSIGNED to SMALLINT on CS - but that won't work for unsupported values for BIGINT. | |||||||||||||||||||||||||||||||||||||||||||||||
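The range problem raised in the comments above can be sketched as follows. The sketch is in Python purely for illustration. It assumes that ColumnStore reserves the two largest values of each unsigned integer type for internal markers, which is consistent with 255 being rejected for TINYINT UNSIGNED but is not stated in this ticket and should be checked against the ColumnStore data type documentation. `RESERVED_TOP_VALUES` and the function names are hypothetical.

```python
# Assumption: ColumnStore reserves the two highest values of every unsigned
# integer type. Verify this against the ColumnStore documentation.
RESERVED_TOP_VALUES = 2

# Storage width in bits of each integer type, narrowest first.
BITS = {"TINYINT": 8, "SMALLINT": 16, "INT": 32, "BIGINT": 64}


def innodb_unsigned_max(type_name):
    """Largest value InnoDB accepts for the unsigned type."""
    return 2 ** BITS[type_name] - 1


def columnstore_unsigned_max(type_name):
    """Largest value ColumnStore accepts, under the assumption above."""
    return innodb_unsigned_max(type_name) - RESERVED_TOP_VALUES


def narrowest_columnstore_type(value):
    """Narrowest unsigned ColumnStore type that can hold value, else None."""
    for type_name in BITS:
        if 0 <= value <= columnstore_unsigned_max(type_name):
            return type_name
    return None


if __name__ == "__main__":
    # 255 is valid for InnoDB TINYINT UNSIGNED but needs a wider type here.
    print(narrowest_columnstore_type(innodb_unsigned_max("TINYINT")))
    # The InnoDB BIGINT UNSIGNED maximum has no wider type to move to.
    print(narrowest_columnstore_type(innodb_unsigned_max("BIGINT")))
```

Under that assumption, widening the column works for every type except BIGINT, which is the gap the comment above points out.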
| Comment by Todd Stoffel (Inactive) [ 2020-08-25 ] | |||||||||||||||||||||||||||||||||||||||||||||||
|
This is true, there are limitations to ColumnStore field types and functions. We do not plan to make this a replacement for InnoDB. On certain occasions, like the one you described above, you might need to map one field type to another. Indexes would need to be removed as well. These are known and documented differences between storage engines. We will try to close the gaps where appropriate (We have a few tickets related to this work | |||||||||||||||||||||||||||||||||||||||||||||||
| Comment by David Hall (Inactive) [ 2020-12-14 ] | |||||||||||||||||||||||||||||||||||||||||||||||
|
This works now that |