[MDEV-11079] Regression: LOAD DATA INFILE lost BLOB support using utf8 load files - Jira

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 5.5.51, 10.0.27, 5.5(EOL), 10.0(EOL), 10.1(EOL)
Fix Version/s: 5.5.55, 10.0.29, 10.1.21
Component/s: Character Sets
Labels:

Sprint: 5.5.54

Description

The commit https://github.com/mysql/mysql-server/commit/774e6ffd0897dd763763b69e15028c1fbd44c4e7 changed the way LOAD DATA INFILE parses its input.

The commit starts validating the whole load file against the file character set. BLOBs have always been copied 1:1 (no character set translation; only escape sequences are processed). So even when the load file is declared as UTF-8, the contents of BLOB columns cannot be expected to be valid UTF-8: binary data can contain byte sequences that are invalid in UTF-8, and no charset conversion is applied to it.

Starting with this commit, LOAD DATA rejects byte sequences in BLOB fields that are not valid UTF-8.
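To illustrate why arbitrary BLOB bytes fail a UTF-8 check, here is a minimal Python sketch. It uses Python's own UTF-8 decoder as a stand-in for the server-side validation, which is an assumption made for illustration and not the server's actual code; the helper name `first_invalid_utf8_offset` is hypothetical.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the offset of the first byte that breaks UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte the decoder could not accept
        return exc.start
    return None


# Raw binary field content: 0xAA is a UTF-8 continuation byte with no lead byte.
blob_field = bytes([0x25, 0xAA, 0xAB, 0xAC])

# The same characters encoded as UTF-8 text use different bytes (c2 aa c2 ab c2 ac),
# so re-encoding the field would change the stored BLOB.
text_field = "%ª«¬".encode("utf-8")

print(first_invalid_utf8_offset(blob_field))
print(first_invalid_utf8_offset(text_field))
```

The binary field is rejected at its second byte, while the text encoding of the same characters passes but no longer has the same byte content.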

Create a test file:

$ hexdump -C test

00000000  22 25 aa ab ac 22 0a                              |"%ª«¬".|
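One possible way to produce a file with exactly the bytes shown in the hexdump is sketched below. The function name `write_test_file` is hypothetical, and the default path `test` simply matches the file name used in this report.

```python
from pathlib import Path

# Bytes taken from the hexdump: '"', '%', 0xAA, 0xAB, 0xAC, '"', newline
TEST_BYTES = bytes.fromhex("22 25 aa ab ac 22 0a")


def write_test_file(path: str = "test") -> int:
    """Write the 7-byte reproduction file and return the number of bytes written."""
    return Path(path).write_bytes(TEST_BYTES)
```

Calling `write_test_file()` from the directory where the client is started creates the `test` file referenced by the LOAD DATA statement.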

CREATE TABLE `x` ( `y` mediumblob NOT NULL) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;

LOAD DATA LOCAL INFILE 'test' INTO TABLE x CHARSET utf8 FIELDS TERMINATED BY ';' ENCLOSED BY '"' ESCAPED BY '\\' LINES TERMINATED BY '\n';

This fails with:

ERROR 1300 (HY000): Invalid utf8 character string: '"%'

Older MariaDB 10.0.X releases can load this file.

Issue Links

relates to

MDEV-12240 LOAD DATA INFILE binary blobs failing for UTF8 (Confirmed)

People

Assignee: Sergei Golubchik

Reporter: Martin Koegler

Votes: 0

Watchers: 8

Dates

Created: 2016-10-18 16:52

Updated: 2017-03-14 13:46

Resolved: 2017-01-09 10:22

MariaDB Server