[MDEV-11343] LOAD DATA INFILE fails to load data with an escape character followed by a multi-byte character
Created: 2016-11-24  Updated: 2020-08-25  Resolved: 2016-11-29

Status: Closed
Project: MariaDB Server
Component/s: Character Sets
Affects Version/s: 10.0.28
Fix Version/s: 10.0.29

Type: Bug
Priority: Major
Reporter: Alexander Barkov
Assignee: Alexander Barkov
Resolution: Fixed
Votes: 3
Labels: upstream

Issue Links:
Relates
relates to MDEV-11217 Regression: LOAD DATA INFILE started ... Closed
relates to MDEV-11631 LOAD DATA INFILE fails to load data w... Closed
relates to MDEV-11348 LOAD DATA LOCAL INFILE crashes the se... Closed
relates to MDEV-12240 LOAD DATA INFILE binary blobs failing... Confirmed

 Description   

I create a file with a backslash followed by a multi-byte utf8 character:

echo "\ä" >/tmp/test.txt
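
To check exactly which bytes end up in such a file, they can be dumped as hex. This is a minimal sketch, assuming a POSIX shell with printf, od and tr available. It uses printf rather than echo because echo implementations differ in how they treat backslashes, and it writes to a temporary file (a hypothetical stand-in for /tmp/test.txt) so that nothing existing is overwritten.

```shell
# Assumption: POSIX sh with printf, od, tr and mktemp available.
# \303\244 is the UTF-8 encoding of 'ä'; '\\' produces one literal backslash.
f=$(mktemp)
printf '\\\303\244\n' > "$f"

# Dump the file as hex bytes and strip the whitespace that od adds.
bytes=$(od -An -tx1 "$f" | tr -d ' \n')
echo "$bytes"   # prints 5cc3a40a: backslash, the two bytes of 'ä', newline

rm -f "$f"
```

The escape character (5c) is immediately followed by the two-byte sequence c3 a4, which is the input that triggers the error.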

Now I try to load this file into a table:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET utf8);
LOAD DATA INFILE '/tmp/test.txt' INTO TABLE t1 CHARACTER SET utf8;

It fails with this error:

ERROR 1300 (HY000): Invalid utf8 character string: ''

This looks wrong. The expected behaviour is to load the character 'ä' into the table.
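
The expected result can be illustrated outside the server. The sketch below is only an emulation, not the server's parser: it assumes that an escape character in front of a non-special character is simply dropped, and it does not handle the special sequences such as \n or \t. The sed expression is a hypothetical stand-in for that rule.

```shell
# Emulation only (not the server's code): drop the backslash and keep
# the character that follows it. LC_ALL=C makes sed work byte by byte.
# Input bytes: 5c c3 a4 0a (backslash, 'ä' in UTF-8, newline).
expected=$(printf '\\\303\244\n' \
  | LC_ALL=C sed 's/\\\(.\)/\1/' \
  | od -An -tx1 | tr -d ' \n')
echo "$expected"   # prints c3a40a: the two bytes of 'ä', then newline
```

Under that assumption the stored value should be the valid two-byte sequence c3 a4, so no invalid-character error would be expected.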

The problem is NOT repeatable in 5.5.
The problem is repeatable in 10.0.
The problem is repeatable in 10.1.
The problem is NOT repeatable in 10.2.

It seems the problem was introduced when we merged this change from MySQL:

commit 9f7288e2e0179db478d20c74f57b5c7d6c95f793
Author: Thayumanavar S <thayumanavar.x.sachithanantha@oracle.com>
Date:   Mon Jun 20 11:35:43 2016 +0530
 
    BUG#23080148 - BACKPORT BUG 14653594 AND BUG 20683959 TO MYSQL-5.5
    The bug asks for a backport of bug#1463594 and bug#20682959. This
    is required because of the fact that if replication is enabled, master
    transaction can commit whereas slave can't commit due to not exact
    'enviroment'. This manifestation is seen in bug#22024200.



 Comments   
Comment by Valerii Kravchuk [ 2016-11-24 ]

This also affects upstream MySQL. See http://bugs.mysql.com/bug.php?id=83950

Comment by Alexander Barkov [ 2016-11-29 ]

Approved by Sergei.

Comment by Lennart Schedin [ 2016-12-21 ]

I can reproduce this problem on MariaDB 5.5.52 (and 5.5.53):

[root@server ~]# cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)
[root@server ~]# mysql --version
mysql  Ver 15.1 Distrib 5.5.52-MariaDB, for Linux (x86_64) using readline 5.1
[root@server ~]# echo "\ä" > /tmp/test.txt
[root@server ~]# mysql -u root test -e "DROP TABLE IF EXISTS t1; CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET utf8); LOAD DATA INFILE '/tmp/test.txt' INTO TABLE t1 CHARACTER SET utf8;"
ERROR 1300 (HY000) at line 1: Invalid utf8 character string: ''

The problem was probably introduced in MariaDB 5.5.51 with the commit that Alexander Barkov cited in the ticket description.

Would it be possible to backport this fix to 5.5? I understand that part of the fix is specific to MariaDB 10.0 and 10.1.

Comment by Lennart Schedin [ 2017-03-20 ]

I have communicated this to MariaDB through other channels, but I think it is good for the information to be public:

On 2016-10-28, MySQL reverted the commit that I think caused the problems for me in the 5.5 track: https://github.com/mysql/mysql-server/commit/c3cf7f47f0f4a1ec314001aaf0c3d9c1c1f62097

On 2016-12-22, Sergei Golubchik at MariaDB merged the mysql/5.5 branch into MariaDB 5.5: https://github.com/MariaDB/server/commit/9fefe973360124f281122a129434a36e661168b9

MariaDB release 5.5.54 included this revert. Thus the problem is fixed for me in the 5.5 track.
