MDEV-11217

Regression: LOAD DATA INFILE started to fail with an error

Details

    Description

      I create a file with a UTF8MB4 string:

      SELECT CONCAT('aaa',0xF09F988E,'bbb') INTO OUTFILE '/tmp/test.txt';
      

      where 0xF09F988E is the utf8mb4 encoding of the character "U+1F60E SMILING FACE WITH SUNGLASSES".
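
      To make the byte-level claim concrete, here is a small Python sketch that decodes the four bytes and reports which character they encode. Python is not part of the original report, and the variable names are only illustrative.

```python
import unicodedata

# The four bytes that the report embeds between 'aaa' and 'bbb'.
raw = bytes.fromhex("F09F988E")

# Decode them as standard UTF-8 (which allows 4-byte sequences).
ch = raw.decode("utf-8")

byte_length = len(raw)           # number of bytes in the encoded form
code_point = ord(ch)             # Unicode scalar value of the character
char_name = unicodedata.name(ch) # official Unicode character name

print(byte_length, f"U+{code_point:04X}", char_name)
```

      The point to take away is the byte length: the character occupies four bytes in its encoded form.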

      Now I load this file:

      DROP TABLE IF EXISTS t1;
      CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET utf8);
      LOAD DATA INFILE '/tmp/test.txt' INTO TABLE t1 CHARACTER SET utf8;
      SHOW WARNINGS;
      SELECT * FROM t1;
      

      Note that the CHARACTER SET utf8 clause is wrong; it should be CHARACTER SET utf8mb4. The problem, however, is that 10.0.24 loaded the data with a warning, whereas 10.0.28 fails with an error.
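
      The utf8 clause is wrong because MariaDB's utf8 character set stores at most three bytes per character, while U+1F60E needs four. The Python sketch below only illustrates that limit; it is not the server's validation code, and the helper name utf8mb3_valid_prefix is hypothetical. It returns the longest prefix of a byte string that consists solely of well-formed UTF-8 sequences of one to three bytes.

```python
def utf8mb3_valid_prefix(data: bytes) -> bytes:
    """Longest prefix made only of well-formed 1- to 3-byte UTF-8 sequences."""
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            length = 1              # ASCII
        elif 0xC2 <= lead <= 0xDF:
            length = 2              # 2-byte sequence
        elif 0xE0 <= lead <= 0xEF:
            length = 3              # 3-byte sequence
        else:
            break                   # 4-byte lead byte or an invalid byte
        chunk = data[i:i + length]
        try:
            chunk.decode("utf-8")   # rejects truncated or malformed sequences
        except UnicodeDecodeError:
            break
        i += length
    return data[:i]


# The field value written by the SELECT ... INTO OUTFILE statement.
field = b"aaa" + bytes.fromhex("F09F988E") + b"bbb"
prefix = utf8mb3_valid_prefix(field)
print(prefix)
```

      For the test string the prefix is 'aaa', which is consistent with both the value stored by 10.0.24 and the string quoted in the 10.0.28 error message.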

      Results in 10.0.24:

      MariaDB [test]> DROP TABLE IF EXISTS t1;
      Query OK, 0 rows affected (0.01 sec)
       
      MariaDB [test]> CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET utf8);
      Query OK, 0 rows affected (0.01 sec)
       
      MariaDB [test]> LOAD DATA INFILE '/tmp/test.txt' INTO TABLE t1 CHARACTER SET utf8;
      Query OK, 1 row affected, 1 warning (0.00 sec)       
      Records: 1  Deleted: 0  Skipped: 0  Warnings: 1
       
      MariaDB [test]> SHOW WARNINGS;
      +---------+------+-------------------------------------------------------------------------+
      | Level   | Code | Message                                                                 |
      +---------+------+-------------------------------------------------------------------------+
      | Warning | 1366 | Incorrect string value: '\xF0\x9F\x98\x8Ebb...' for column 'a' at row 1 |
      +---------+------+-------------------------------------------------------------------------+
      1 row in set (0.00 sec)
       
      MariaDB [test]> SELECT * FROM t1;
      +------+
      | a    |
      +------+
      | aaa  |
      +------+
      

      Results in 10.0.28:

      MariaDB [test]> DROP TABLE IF EXISTS t1;
      Query OK, 0 rows affected (0.02 sec)
       
      MariaDB [test]> CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET utf8);
      Query OK, 0 rows affected (0.05 sec)
       
      MariaDB [test]> LOAD DATA INFILE '/tmp/test.txt' INTO TABLE t1 CHARACTER SET utf8;
      ERROR 1300 (HY000): Invalid utf8 character string: 'aaa'
      MariaDB [test]> SHOW WARNINGS;                       
      +-------+------+--------------------------------------+
      | Level | Code | Message                              |
      +-------+------+--------------------------------------+
      | Error | 1300 | Invalid utf8 character string: 'aaa' |
      +-------+------+--------------------------------------+
      1 row in set (0.00 sec)
       
      MariaDB [test]> SELECT * FROM t1;
      Empty set (0.00 sec)
      

      As a result, replication from a 10.0.24 master to a 10.0.28 slave stops with an error.

      The bug was likely introduced after merging this commit from MySQL:

      commit 9f7288e2e0179db478d20c74f57b5c7d6c95f793
      Author: Thayumanavar S <thayumanavar.x.sachithanantha@oracle.com>
      Date:   Mon Jun 20 11:35:43 2016 +0530
       
          BUG#23080148 - BACKPORT BUG 14653594 AND BUG 20683959 TO
                         MYSQL-5.5
          
          The bug asks for a backport of bug#1463594 and bug#20682959. This
          is required because of the fact that if replication is enabled, master
          transaction can commit whereas slave can't commit due to not exact
          'enviroment'. This manifestation is seen in bug#22024200.
      

          Activity

            bar Alexander Barkov created issue -
            julien.fritsch Julien Fritsch made changes -
            Labels need_feedback
            serg Sergei Golubchik made changes -
            Fix Version/s N/A [ 14700 ]
            Fix Version/s 10.0 [ 16000 ]
            Resolution Won't Fix [ 2 ]
            Status Open [ 1 ] Closed [ 6 ]

            People

              bar Alexander Barkov
              bar Alexander Barkov