  1. MariaDB ColumnStore
  2. MCOL-4005

mishandling multi-byte chars in DML export to cpimport

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.4.3
    • Fix Version/s: 1.4.4
    • Component/s: MDB Plugin
    • Labels: None
    • Sprint: 2020-7

    Description

      While investigating a crash in Sky, I found a large problem in our code that exports data to cpimport for LOAD DATA INFILE (LDI). In many places the code assumes every character is 1 byte wide and mixes up byte counts and character counts.

      The function is ha_mcs_impl_write_batch_row_() in ha_mcs_dml.cpp. It'll be obvious when you look at it.
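
To illustrate the distinction between byte count and character count, here is a minimal sketch. It is not the ColumnStore code: the helper names `utf8_char_count` and `utf8_prefix_bytes` are hypothetical, and the sketch assumes the input is already valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: count UTF-8 characters (code points) in s.
// Every byte that is NOT a continuation byte (bit pattern 10xxxxxx)
// starts a new character. Assumes s holds valid UTF-8.
std::size_t utf8_char_count(const std::string& s) {
    std::size_t chars = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++chars;
        }
    }
    return chars;
}

// Hypothetical helper: byte length of the longest prefix of s that
// contains at most max_chars characters. Useful when a column limit
// is expressed in characters but a buffer is sized in bytes.
std::size_t utf8_prefix_bytes(const std::string& s, std::size_t max_chars) {
    std::size_t chars = 0;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        if ((c & 0xC0) != 0x80) {      // first byte of a character
            if (chars == max_chars) {
                break;                 // the next character would exceed the limit
            }
            ++chars;
        }
        ++i;
    }
    return i;
}
```

For the string "héllo" encoded as UTF-8, `s.size()` reports 6 bytes while `utf8_char_count(s)` reports 5 characters, so code that uses one value where the other is expected will read or write the wrong number of bytes.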

          Activity

            For QA: This issue fixes a crash that occurs when performing an LDI into a table whose default character set is utf8. Steps to reproduce:

            MariaDB [test]> create table dummy (a varchar(255), b text(255), c varchar(255)) default character set utf8 engine=columnstore;
            Query OK, 0 rows affected (0.308 sec)
             
            MariaDB [test]> load data local infile './mcol4005.txt' into table dummy fields enclosed by '"' terminated by ',';
            ERROR 2013 (HY000): Lost connection to MySQL server during query
            MariaDB [test]> 
            

            Here, mcol4005.txt contains the following:

            tntnatbry@tntnatbry:~/git-projects/server/storage/columnstore$ cat mcol4005.txt
            "field1","\\","field3"
            "field1",\N,"field3"
            "field1","field2","field3"
            

            In addition, if we remove the character set clause from the table definition, the LDI succeeds, but the "\" from the data file is not inserted correctly: in the output below, the first row's b column is blank and its c column is NULL. We are also fixing this. Steps to reproduce:

            MariaDB [test]> create table dummy (a varchar(255), b text(255), c varchar(255)) engine=columnstore;
            Query OK, 0 rows affected (0.273 sec)
             
            MariaDB [test]> load data local infile './mcol4005.txt' into table dummy fields enclosed by '"' terminated by ',';
            Query OK, 3 rows affected (1.321 sec)
            Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
             
            MariaDB [test]> select * from dummy;
            +--------+--------+--------+
            | a      | b      | c      |
            +--------+--------+--------+
            | field1 |      | NULL   |
            | field1 | NULL   | field3 |
            | field1 | field2 | field3 |
            +--------+--------+--------+
            3 rows in set (0.090 sec)
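
For comparison, here is a rough sketch of how the middle field of each row in mcol4005.txt would be expected to decode under the default LOAD DATA escape character "\". It is an illustration only, not the fix applied in ColumnStore: `Field` and `decode_field` are hypothetical names, and the escape handling is deliberately simplified.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type: either SQL NULL or a decoded string value.
struct Field {
    bool is_null;
    std::string value;
};

// Hypothetical, simplified decoder for one raw field of a delimited file.
// Assumptions: the field has already been split on the terminator, the
// enclosure is '"', and the escape character is '\'.
Field decode_field(const std::string& raw, char enclosure = '"', char escape = '\\') {
    // An unenclosed escape character followed by 'N' denotes SQL NULL.
    if (raw.size() == 2 && raw[0] == escape && raw[1] == 'N') {
        return Field{true, std::string()};
    }

    // Strip one pair of surrounding enclosure characters, if present.
    std::string body = raw;
    if (body.size() >= 2 && body.front() == enclosure && body.back() == enclosure) {
        body = body.substr(1, body.size() - 2);
    }

    std::string out;
    for (std::size_t i = 0; i < body.size(); ++i) {
        if (body[i] == escape && i + 1 < body.size()) {
            char next = body[++i];
            switch (next) {
                case 'n': out.push_back('\n'); break;
                case 't': out.push_back('\t'); break;
                case 'r': out.push_back('\r'); break;
                default:  out.push_back(next); break;  // e.g. an escaped backslash
            }
        } else {
            out.push_back(body[i]);
        }
    }
    return Field{false, out};
}
```

Under these assumptions the three middle fields decode to a single backslash, NULL, and field2 respectively, which differs from the first two rows of the result shown above.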
            

            tntnatbry Gagan Goel (Inactive) added the comment above.

            Build verified: 1.4.4-1 (Jenkins 20200522)

            Tested on all 3 modes of columnstore_use_import_for_batchinsert.

            dleeyh Daniel Lee (Inactive) added the comment above.

            People

              Assignee: dleeyh Daniel Lee (Inactive)
              Reporter: pleblanc Patrick LeBlanc (Inactive)
              Votes: 0
              Watchers: 3

