Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
MaxScale currently sends the latin1 charset to the client in its initial handshake.
Depending on the client (driver) implementation, this can cause encoding issues.
Test case:
try (Connection connection = DriverManager.getConnection("jdbc:mariadb://192.168.1.154:4006/testj?user=diego&password=diego")) {
    Statement stmt = connection.createStatement();

    stmt.execute("drop table if exists unicodeTestChar");
    stmt.execute("create table unicodeTestChar (id int unsigned, field1 varchar(4) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci) DEFAULT CHARSET=utf8mb4");

    String emoji = "\uD83C\uDF1F"; // 4-byte character: star

    try (PreparedStatement ps = connection.prepareStatement("INSERT INTO unicodeTestChar (id, field1) VALUES (1, ?)")) {
        ps.setString(1, emoji);
        ps.execute();
    }

    ResultSet rs = stmt.executeQuery("SELECT field1 FROM unicodeTestChar");
    rs.next();

    System.out.println("initial : " + emoji);
    System.out.println("stored in DB : " + rs.getString(1));
}
When connecting through MaxScale, a Java connector (MySQL or MariaDB) throws an exception: "java.sql.SQLDataException: Incorrect string value: '\xF0\x9F\x8C\x9F' for column 'field1' at row 1
Query is: INSERT INTO unicodeTestChar (id, field1) VALUES (1, ?), parameters ['-star-']"
Issue description:
The server has to know which charset the client is using:
C/C sends the charset defined by character_set_client.
C/J always uses utf8 (which allows some optimizations).
PHP PDO seems to work the same way as C/J.
So, in the Initial Handshake Packet, the server indicates its default charset.
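To make it concrete where that value lives, here is a minimal sketch that reads the default charset byte from an Initial Handshake payload. It assumes the protocol version 10 layout (protocol version, NUL-terminated server version, 4-byte connection id, 8 bytes of scramble, 1 filler byte, 2 bytes of capability flags, then the charset byte). The class name, method name and sample payload are hypothetical and are not taken from MaxScale or any connector.

```java
public class HandshakeCharset {

    /** Returns the server default collation id from an Initial Handshake (v10) payload. */
    static int serverCollationId(byte[] payload) {
        int pos = 1;                        // skip the protocol version byte
        while (payload[pos] != 0) {
            pos++;                          // skip the NUL-terminated server version
        }
        pos++;                              // skip the NUL terminator itself
        pos += 4;                           // connection id
        pos += 8;                           // first part of the auth scramble
        pos += 1;                           // filler byte
        pos += 2;                           // lower two bytes of the capability flags
        return payload[pos] & 0xFF;         // default charset (collation id)
    }

    public static void main(String[] args) {
        // Hypothetical, truncated payload: version "5.5", charset byte set to 8 (latin1).
        byte[] payload = {
            10, '5', '.', '5', 0,           // protocol version + server version
            1, 0, 0, 0,                     // connection id
            1, 2, 3, 4, 5, 6, 7, 8,         // scramble, part 1
            0,                              // filler
            (byte) 0xFF, (byte) 0xF7,       // capability flags (lower bytes)
            8,                              // default charset
            2, 0                            // status flags
        };
        System.out.println(serverCollationId(payload)); // prints 8
    }
}
```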
The client then sends the encoding it uses in the Handshake Response packet.
C/J sends utf8 (33), or utf8mb4 (the value sent by the server) when the server initially advertised utf8mb4.
The problem is that MaxScale always sends the value 8, which corresponds to the latin1 charset.
So C/J sends utf8 (33), even if the server is configured to use utf8mb4.
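The selection rule described above can be sketched as follows. This is an illustration only, not the connector's actual code: the class and method names are hypothetical, and the set of ids treated as utf8mb4 (45, 46 and 224 to 247) is an assumption that covers common utf8mb4 collations but may not be exhaustive.

```java
public class CharsetNegotiation {

    static final int UTF8_GENERAL_CI = 33;  // utf8 (3-byte) default collation
    static final int LATIN1_SWEDISH_CI = 8; // value MaxScale advertises per this report

    /** Assumed subset of collation ids that belong to utf8mb4. */
    static boolean isUtf8mb4(int collationId) {
        return collationId == 45                            // utf8mb4_general_ci
            || collationId == 46                            // utf8mb4_bin
            || (collationId >= 224 && collationId <= 247);  // utf8mb4 Unicode collations
    }

    /** Echo the server's value if it is utf8mb4, otherwise fall back to utf8 (33). */
    static int clientCollation(int serverCollationId) {
        return isUtf8mb4(serverCollationId) ? serverCollationId : UTF8_GENERAL_CI;
    }

    public static void main(String[] args) {
        System.out.println(clientCollation(224));               // prints 224
        System.out.println(clientCollation(LATIN1_SWEDISH_CI)); // prints 33
    }
}
```

With a backend that advertises 224 (utf8mb4_unicode_ci) the client keeps utf8mb4, but when the proxy advertises 8 the client falls back to 33, whatever the backend is configured to use.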
That causes problems later, because the server checks that incoming data is valid for the client charset: when the client sends a 4-byte UTF-8 character, the server raises an error: "Incorrect string value".
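A short sketch of why the test string trips that check: the star character lies outside the Basic Multilingual Plane, so its UTF-8 encoding is four bytes long, which the 3-byte utf8 charset cannot represent. The class and helper names below are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Width {

    /** Number of bytes the string occupies once encoded as UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Renders the UTF-8 bytes in the same \xNN style as the server error message. */
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** True if every code point is in the BMP, i.e. needs at most 3 bytes in UTF-8. */
    static boolean fitsInThreeByteUtf8(String s) {
        return s.codePoints().allMatch(cp -> cp <= 0xFFFF);
    }

    public static void main(String[] args) {
        String emoji = "\uD83C\uDF1F";                  // same value as in the test case
        System.out.println(utf8Length(emoji));          // prints 4
        System.out.println(hex(emoji));                 // prints \xF0\x9F\x8C\x9F
        System.out.println(fitsInThreeByteUtf8(emoji)); // prints false
    }
}
```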