  1. MariaDB ColumnStore
  2. MCOL-4103

Columnstore LDI doesn't support charsets, particularly utf8mb4

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Fixed
    • 1.5.2
    • 5.5.1
    • MDB Plugin
    • None

    Description

      create table chineseCS (english varchar(20), chinese varchar(20)) DEFAULT CHARACTER SET utf8mb4 collate utf8mb4_unicode_520_ci engine=columnstore;

      load data infile "/home/calpont/t.tbl" into table chineseCS character set utf8mb4 fields terminated by "|";

      select * from chineseCS;
      +-----------+---------+
      | english   | chinese |
      +-----------+---------+
      | Abhorrent | NULL    |
      | Adamant   | NULL    |
      | Agony     | NULL    |
      | Antsy     | NULL    |
      | Appall    | NULL    |
      +-----------+---------+
      5 rows in set (0.146 sec)

      But using traditional insert, it works:

      truncate table chineseCS;
      insert into chineseCS values ("Abhorrent","可惡的"),("Adamant","精金"),("Agony","痛苦"),("Antsy","螞蟻"),("Appall","驚恐");

      select * from chineseCS;
      +-----------+---------+
      | english   | chinese |
      +-----------+---------+
      | Abhorrent | 可惡的  |
      | Adamant   | 精金    |
      | Agony     | 痛苦    |
      | Antsy     | 螞蟻    |
      | Appall    | 驚恐    |
      +-----------+---------+
      5 rows in set (0.157 sec)
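
When reproducing the report above, it may help to confirm which character set the table actually received and what bytes ended up stored. The statements below are a minimal diagnostic sketch, not part of the original report; they assume the `chineseCS` table from the description exists and use only standard MariaDB statements.

```sql
-- Show the character set and collation the table was created with.
SHOW CREATE TABLE chineseCS;

-- Show the server- and session-level character set defaults that
-- apply when a statement does not name a character set explicitly.
SHOW VARIABLES LIKE 'character_set%';

-- Inspect the stored bytes: a NULL here after LOAD DATA INFILE,
-- versus UTF-8 byte sequences after INSERT, matches the behavior
-- described in the report.
SELECT english, chinese, HEX(chinese) AS chinese_hex
FROM chineseCS;
```

Running the last query once after the `load data infile` step and once after the `insert` step gives a direct side-by-side comparison of the two load paths.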

          Activity

            greenlion Justin Swanhart added a comment:

            Note, MariaDB 10.5 has a default utf8mb4 character set. All I did was create the table and it used the default character set, a character set that is apparently not supported. That seems less than ideal to me, and it violates the principle of least surprise.

            greenlion Justin Swanhart added a comment:

            Create a table with TINYINT UNSIGNED:
            create table oltp.innodb_table
            (c1 tinyint unsigned)
            engine=innodb;

            Replicate from InnoDB to ColumnStore for HTAP using rewrite rules.

            The ColumnStore table should be:
            create table olap.innodb_table
            (c1 tinyint unsigned)
            engine=columnstore;

            insert into oltp.innodb_table values (255);

            The value cannot be inserted into ColumnStore: 255 is out of range for TINYINT UNSIGNED on ColumnStore, but is valid for InnoDB.

            MariaDB [test2]> insert into olap.innodb_table values(255);
            ERROR 1264 (22003): CAL0001: IDB-2025: Data truncated for column 'c1'

            greenlion Justin Swanhart added a comment:

            Note that yes, one could change TINYINT UNSIGNED to SMALLINT on ColumnStore, but that won't work for BIGINT values that ColumnStore does not support.
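
The TINYINT UNSIGNED to SMALLINT workaround mentioned above could look roughly like the sketch below. It reuses the `olap.innodb_table` name from the earlier comment and assumes that table has not already been created with the narrower type; the choice of a signed SMALLINT is an illustration, not something the ticket prescribes.

```sql
-- Hypothetical workaround: widen the column on the ColumnStore side only,
-- so that every value valid for TINYINT UNSIGNED on InnoDB (0..255) fits.
create table olap.innodb_table
(c1 smallint)
engine=columnstore;

-- The value that was rejected for TINYINT UNSIGNED is within SMALLINT range.
insert into olap.innodb_table values (255);
```

As the comment points out, this only works while a wider integer type is available, so it does not help for out-of-range BIGINT values.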
            toddstoffel Todd Stoffel (Inactive) added a comment (edited):

            This is true; there are limitations to ColumnStore field types and functions. We do not plan to make this a replacement for InnoDB. On certain occasions, like the one you described above, you might need to map one field type to another. Indexes would need to be removed as well. These are known and documented differences between storage engines. We will try to close the gaps where appropriate (we have a few tickets related to this work: MCOL-269, MCOL-641).

            David.Hall David Hall (Inactive) added a comment:

            This works now that MCOL-2000 is fixed.

            People

              David.Hall David Hall (Inactive)
              David.Hall David Hall (Inactive)
              Votes: 1
              Watchers: 5
