[CONJ-406] MariaDB Driver (mariadb-java-client-1.4.6.jar) got an issue when inserting chinese string Created: 2017-01-04  Updated: 2017-01-25  Resolved: 2017-01-25

Status: Closed
Project: MariaDB Connector/J
Component/s: Other
Affects Version/s: 1.4.6
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Thinh Pham Assignee: Diego Dupin
Resolution: Not a Bug Votes: 0
Labels: None
Environment:

RHEL


Attachments: Zip Archive Unicode_Issue.zip     File my.cnf    

 Description   

Hi Team,
I ran into an issue when inserting a Chinese string into a table: every Chinese character is changed to ? after the insert.
Example: create a table UTF8Test with the following command:
CREATE TABLE UTF8Test (Driver VARCHAR(256),UTF8colunm VARCHAR(256) NULL DEFAULT NULL COLLATE 'utf8_general_ci') COLLATE='latin1_swedish_ci'
Then insert a Chinese string into this table using the MariaDB driver (mariadb-java-client-1.4.6.jar).
Insert into UTF8Test values ('driver type', '中文不工作中文不工作');
The string '中文不工作中文不工作' is changed to '?????????' after the insert.
There is no problem when I use the MySQL driver (mysql-connector-java-5.0.8-bin.jar).
I also tried "useUnicode=true&characterEncoding=UTF-8" in the connection string, but it did not fix the issue.

My example code and my database config file are attached. Could you help fix this issue?

One more thing: I also checked the MariaDB Connector/ODBC driver and got the same issue.

Thanks and Best Regards,
Thinh Pham



 Comments   
Comment by Thinh Pham [ 2017-01-12 ]

Thanks Diego. When will this version be released? I have checked, and the latest version is 1.5.6?

Thank you so much,
Thinh Pham

Comment by Diego Dupin [ 2017-01-12 ]

Hi Thinh,

The MariaDB driver exchanges data with the server exclusively in UTF-8.
So on connection, character_set_client, character_set_connection and character_set_results are set to utf8 (or utf8mb4).
In your config file, the lines
init-connect = 'SET NAMES latin1'
init-connect = 'SET collation_connection = latin1_swedish_ci'
may cause this, depending on your server / version.

I've executed your jar, and the results were OK.

The best would be to send the results of the query "show variables like '%character%'" so we can understand what is happening.

Can you also provide more information:

  • Java version and vendor (Oracle / OpenJDK / IBM)
  • server version (MariaDB / MySQL?)
  • Do you use a proxy such as MaxScale?
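
The requested query can also be run through the driver itself, which shows the session values as that particular connection sees them. The following is only a minimal sketch, not code from the attached archive: the class name ShowCharsetVariables, the helper readCharsetVariables and the default URL (host, database, user and password) are placeholders chosen for illustration, and it assumes the Connector/J jar is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedHashMap;
import java.util.Map;

class ShowCharsetVariables {
    // Without GLOBAL, SHOW VARIABLES returns the values of the current session.
    static final String QUERY = "SHOW VARIABLES LIKE '%character%'";

    // Hypothetical helper: collects Variable_name -> Value pairs in result order.
    static Map<String, String> readCharsetVariables(Connection conn) throws SQLException {
        Map<String, String> vars = new LinkedHashMap<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(QUERY)) {
            while (rs.next()) {
                vars.put(rs.getString(1), rs.getString(2));
            }
        }
        return vars;
    }

    public static void main(String[] args) throws SQLException {
        // Placeholder URL; pass the real one as the first argument.
        String url = args.length > 0 ? args[0]
                : "jdbc:mariadb://localhost/db?user=xx&password=yy";
        try (Connection conn = DriverManager.getConnection(url)) {
            readCharsetVariables(conn)
                    .forEach((name, value) -> System.out.println(name + " " + value));
        }
    }
}
```

Running it once with each driver makes it easy to compare what the two connections end up with.
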
Comment by Thinh Pham [ 2017-01-20 ]

Thanks Diego,
This issue is fixed if we use:
init-connect = 'SET NAMES utf8'
init-connect = 'SET collation_connection = utf8_general_ci'
However, we only want to set this for a specific connection instead of as a global variable.
Below is the result of the query "show variables like '%character%'":
character_set_client utf8
character_set_connection latin1
character_set_database latin1
character_set_filesystem binary
character_set_results utf8
character_set_server latin1
character_set_system utf8
character_sets_dir /usr/share/mysql/charsets/

Java version: openjdk version "1.8.0_111"
OS: Red Hat Enterprise Linux Server release 7.2 (Maipo)
Server version: 10.1.18-MariaDB

Is there any option or property that can be set for a specific connection to fix this issue?

Thanks,
Thinh Pham

Comment by Diego Dupin [ 2017-01-20 ]

character_set_connection latin1? That doesn't seem normal.
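
As an illustration of why a latin1 character_set_connection would produce question marks: the server converts statement text from character_set_client to character_set_connection, and characters with no latin1 equivalent are typically replaced by '?'. The sketch below imitates that lossy step in plain Java and does not talk to a server. The class and method names are made up for this example, and ISO_8859_1 is used as a stand-in for the server's latin1, which is an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Latin1LossDemo {
    // Encodes the text with the given charset and decodes it again.
    // String.getBytes(Charset) substitutes the charset's replacement byte
    // for every character it cannot represent.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the first two characters of the string in the report.
        String chinese = "\u4e2d\u6587";
        String viaLatin1 = roundTrip(chinese, StandardCharsets.ISO_8859_1);
        String viaUtf8 = roundTrip(chinese, StandardCharsets.UTF_8);
        System.out.println("via latin1: " + viaLatin1);
        System.out.println("via utf8 unchanged: " + viaUtf8.equals(chinese));
    }
}
```

Once the characters have been replaced, the original text cannot be recovered from the stored value, even though the column itself is utf8.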

I would get rid of those "init_connect" settings:

  • They are very old (I don't even know what happens when two of them are set).
  • init_connect statements are not even executed for super-users.
  • I don't know whether the server applies them before or after the driver's connection settings (they may override what the driver sets).

The proper way is to set the charset information through dedicated variables:

Extract from the 10.1 default configuration file:

[client]
# Default is Latin1, if you need UTF-8 set this (also in server section)
#default-character-set = utf8
 
[mysqld]
#
# * Character sets
#
# Default is Latin1, if you need UTF-8 set all this (also in client section)
#
#character-set-server  = utf8
#collation-server      = utf8_general_ci

You normally don't have to set them at all, since latin1 is the default.

After that, everything should normally be all right, but if you want to change the default settings, you can use the "sessionVariables" option.
You can set the charset to UTF-8 with the following connection string:
"jdbc:mariadb://localhost/db?user=xx&password=yy&sessionVariables=character_set_client=utf8,character_set_results=utf8,character_set_connection=utf8,collation_connection=utf8_general_ci"

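The connection string above can be assembled in code rather than typed by hand. This is a minimal sketch under stated assumptions: Utf8SessionUrl, sessionVariables and buildUrl are hypothetical names, not part of Connector/J, and the host, database, user and password are the placeholders from the comment above. The values are concatenated without URL-encoding, which is a simplification.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class Utf8SessionUrl {
    // Joins name=value pairs with commas, the form used in the connection string above.
    static String sessionVariables(Map<String, String> vars) {
        return vars.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(","));
    }

    // Hypothetical helper: builds the JDBC URL with the four UTF-8 session variables.
    static String buildUrl(String host, String db, String user, String password) {
        Map<String, String> vars = new LinkedHashMap<>(); // keeps insertion order
        vars.put("character_set_client", "utf8");
        vars.put("character_set_results", "utf8");
        vars.put("character_set_connection", "utf8");
        vars.put("collation_connection", "utf8_general_ci");
        return "jdbc:mariadb://" + host + "/" + db
                + "?user=" + user
                + "&password=" + password
                + "&sessionVariables=" + sessionVariables(vars);
    }

    public static void main(String[] args) {
        // Placeholder credentials, as in the comment above.
        System.out.println(buildUrl("localhost", "db", "xx", "yy"));
    }
}
```

The resulting string can be passed to DriverManager.getConnection(url). Only connections opened with this URL are affected, so the server-wide configuration stays as it is.
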
Comment by Thinh Pham [ 2017-01-23 ]

Thank you so much, Diego!!
I have fixed this issue by adding "sessionVariables=character_set_client=utf8,character_set_results=utf8,character_set_connection=utf8,collation_connection=utf8_general_ci" to the connection string.
I also have a small question: is this the same solution for the ODBC driver?

Thanks,
Thinh Pham

Comment by Diego Dupin [ 2017-01-25 ]

For ODBC, it is better to ask in https://jira.mariadb.org/projects/ODBC/, or to contact support directly.
