  1. MariaDB Server
  2. MDEV-23471

LOAD DATA: partial utf8 Sequence in binary data may "swallow" field separator


Details

    Description

      LOAD DATA has special treatment for binary columns like BLOB or VARBINARY so that these are excluded from character set conversions regardless of the CHARACTER SET setting for LOAD DATA.

      Starting with MariaDB 10.2, however, when CHARACTER SET utf8 is used, column delimiter detection can be fooled by a byte with its highest bit set near the end of a BLOB or VARBINARY value. That byte is treated as the beginning of a UTF-8 multi-byte sequence, and the actual delimiter character is interpreted as the 2nd or 3rd (or, in the case of utf8mb4, 4th) byte of that sequence. The delimiter is therefore skipped, the row ends up with too few columns, and the load fails with an error message like:

      ERROR 1261 (01000) at line 18: Row 1 doesn't contain data for all columns
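
      The delimiter swallowing described above can be illustrated with a small Python sketch. This is a simplified model and not MariaDB's actual parser: the names utf8_seq_len and split_fields are hypothetical, and the model assumes the reader decides how many bytes to consume from the lead byte alone, without validating the continuation bytes.

```python
def utf8_seq_len(lead):
    """Length a UTF-8 sequence would have, judged from its lead byte alone."""
    if lead < 0x80:
        return 1  # plain ASCII byte
    if 0xC0 <= lead < 0xE0:
        return 2
    if 0xE0 <= lead < 0xF0:
        return 3
    if 0xF0 <= lead < 0xF8:
        return 4  # only reachable with a 4-byte charset such as utf8mb4
    return 1  # stray continuation byte or invalid lead byte


def split_fields(line, sep=b";", charset_aware=True):
    """Split one row into fields (hypothetical model of the field reader).

    With charset_aware=True, a byte with the high bit set makes the reader
    consume the following bytes as part of one character, so a separator
    inside that span is never compared against sep.
    """
    fields, start, i = [], 0, 0
    while i < len(line):
        step = utf8_seq_len(line[i]) if charset_aware else 1
        if step == 1 and line[i:i + 1] == sep:
            fields.append(line[start:i])
            start = i + 1
        i += step
    fields.append(line[start:])
    return fields


row = b"\xe0;42"  # binary value 0xE0, separator ';', integer 42
print(split_fields(row, charset_aware=True))   # [b'\xe0;42']  -> one field only
print(split_fields(row, charset_aware=False))  # [b'\xe0', b'42']
```

      In this model the 0xE0 byte announces a three-byte sequence, so the ';' is consumed as a continuation byte and the row yields a single field, which corresponds to the "doesn't contain data for all columns" error.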

      How to reproduce:

      CREATE DATABASE IF NOT EXISTS test;
      USE test;
       
      DROP TABLE IF EXISTS t1;
       
      CREATE TABLE t1(
        b1 VARBINARY(16),
        i1 INT
      ) DEFAULT CHARSET=utf8;
       
      INSERT INTO t1 VALUES(UNHEX("00"), 23);
      INSERT INTO t1 VALUES(UNHEX("E0"), 42);
       
      SELECT * FROM t1 INTO OUTFILE 'data.txt' FIELDS TERMINATED BY ';';
       
      TRUNCATE TABLE t1;
       
      LOAD DATA INFILE 'data.txt'
           INTO TABLE t1
           CHARACTER SET utf8
           FIELDS TERMINATED BY ';' ;
       
      SELECT HEX(b1), i1 FROM t1;
      

      Expected result:

      +---------+------+
      | HEX(b1) | i1   |
      +---------+------+
      | 00      |   23 |
      | E0      |   42 |
      +---------+------+
      

      Actual result:

      ERROR 1261 (01000): Row 2 doesn't contain data for all columns
       
      Empty set (0.001 sec)
      

            People

              rucha174 Rucha Deodhar
              hholzgra Hartmut Holzgraefe