[MDEV-12577] Inserting UTF8 Chinese characters produces ??? question marks Created: 2017-04-24  Updated: 2017-11-05  Resolved: 2017-11-05

Status: Closed
Project: MariaDB Server
Component/s: Character Sets
Affects Version/s: 10.2.5
Fix Version/s: 10.2.8

Type: Bug Priority: Major
Reporter: FLAESCH Sebastien Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: None
Environment:

Linux Debian 8.6



 Description   

When using UTF-8 encoding in the client and in the server/database, Chinese characters appear to be inserted as ? question marks, even though the database was created with UTF8 encoding.



 Comments   
Comment by FLAESCH Sebastien [ 2017-04-24 ]

Here is the server log produced by our C program. The program calls a child program, which is why both connection ids 8 and 9 appear:

/opt3/dbs/mdb/10.2/bin/mysqld, Version: 10.2.5-MariaDB-log (MariaDB Server). started with:
Tcp port: 3309  Unix socket: /opt3/dbs/mdb/10.2/data/mysqld.sock
Time                 Id Command    Argument
170424 17:25:37     8 Connect   mdbuser@localhost as anonymous on test1
                    8 Query     SELECT @@version
                    8 Query     SET autocommit=1
                    8 Prepare   set character_set_client = 'utf8'
                    8 Execute   set character_set_client = 'utf8'
                    9 Connect   mdbuser@localhost as anonymous on test1
                    9 Query     SELECT @@version
                    9 Query     SET autocommit=1
                    9 Prepare   set character_set_client = 'utf8'
                    9 Execute   set character_set_client = 'utf8'
                    9 Close stmt
                    9 Prepare   DROP TABLE tutf8_é日
                    9 Execute   DROP TABLE tutf8_é日
                    9 Close stmt
                    9 Prepare   CREATE TABLE tutf8_é日 (pk integer NOT NULL PRIMARY KEY,c1_é日 char(10),vc1_é日 varchar(10))
                    9 Execute   CREATE TABLE tutf8_é日 (pk integer NOT NULL PRIMARY KEY,c1_é日 char(10),vc1_é日 varchar(10))
                    9 Close stmt
                    9 Prepare   insert into tutf8_é日 VALUES (?,?,?)
                    9 Execute   insert into tutf8_é日 VALUES (1,'1234567890','1234567890')
                    9 Close stmt
                    9 Prepare   insert into tutf8_é日 VALUES (?,?,?)
                    9 Execute   insert into tutf8_é日 VALUES (2,'አማር+','አማር+')
                    9 Close stmt
                    9 Prepare   insert into tutf8_é日 VALUES (?,?,?)
                    9 Execute   insert into tutf8_é日 VALUES (3,'âãäåçèéêëô','âãäåçèéêëô')
                    9 Close stmt
                    9 Prepare   insert into tutf8_é日 VALUES (?,?,?)
                    9 Execute   insert into tutf8_é日 VALUES (4,'日a本語bαδπአé','日a本語bαδπአé')
                    9 Close stmt
                    9 Prepare   update tutf8_é日 SET c1_é日 = 'አማር+',vc1_é日 = 'አማር+' WHERE pk = ?
                    9 Execute   update tutf8_é日 SET c1_é日 = 'አማር+',vc1_é日 = 'አማር+' WHERE pk = 2
                    9 Prepare   update tutf8_é日 SET c1_é日=?, vc1_é日=? WHERE pk = ?
                    9 Execute   update tutf8_é日 SET c1_é日='日a本語bαδπአé', vc1_é日='日a本語bαδπአé' WHERE pk = 4
                    9 Close stmt
                    9 Prepare   select tutf8_é日.* from   tutf8_é日 where pk = ?
                    9 Execute   select tutf8_é日.* from   tutf8_é日 where pk = 1
                    9 Close stmt
                    9 Prepare   select tutf8_é日.* from   tutf8_é日 where c1_é日 = ? AND vc1_é日 = ?
                    9 Execute   select tutf8_é日.* from   tutf8_é日 where c1_é日 = '1234567890' AND vc1_é日 = '1234567890'
                    9 Close stmt
                    9 Prepare   select tutf8_é日.* from   tutf8_é日 where pk = ?
                    9 Execute   select tutf8_é日.* from   tutf8_é日 where pk = 2
                    9 Close stmt
                    9 Close stmt
                    9 Quit
                    8 Close stmt
                    8 Quit
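
The log above shows the program changing only character_set_client on each connection. As a rough illustration of why that alone might not be enough, here is a simplified Python model (not MariaDB code) of a string value passing from the client character set through the connection character set to the column. The helper names `convert` and `store_via_session`, the latin1 value assumed for character_set_connection, and the utf8 column are all assumptions made for this sketch; the ticket does not show what character_set_connection was in the C program's session.

```python
def convert(text: str, charset: str) -> str:
    """Round-trip text through charset; unrepresentable characters become '?'."""
    return text.encode(charset, errors="replace").decode(charset)


def store_via_session(text: str, session: dict, column_charset: str = "utf-8") -> str:
    """Simplified model: client charset -> connection charset -> column charset."""
    # The value is assumed to arrive correctly encoded in character_set_client,
    # so `text` starts out intact. Any loss happens in the later conversions.
    in_connection = convert(text, session["character_set_connection"])
    return convert(in_connection, column_charset)


# Hypothetical session where only character_set_client was changed and the
# connection charset stayed at latin1 (an assumption, not taken from the ticket).
# Python's "latin-1" codec is used as an approximation of MariaDB's latin1.
client_only = {"character_set_client": "utf-8", "character_set_connection": "latin-1"}

# Hypothetical session after SET NAMES utf8, which sets character_set_client,
# character_set_connection and character_set_results together.
all_utf8 = {"character_set_client": "utf-8", "character_set_connection": "utf-8"}

value = "አማር+"  # value of row 2 in the log
print(store_via_session(value, client_only))
print(store_via_session(value, all_utf8))
```

In this model the data is lost in the intermediate conversion, before it reaches the utf8 column, which is why a utf8 database alone would not protect it.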

When connecting with the mysql client to see what the table contains, we see that latin1 characters are properly inserted but others are converted to ?:

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.2.5-MariaDB MariaDB Server
 
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [test1]> set character_set_client='utf8';
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [test1]> show variables like "character_set_database";
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| character_set_database | utf8  |
+------------------------+-------+
1 row in set (0.00 sec)
 
MariaDB [test1]> select * from tutf8_é日;
+----+----------------------+----------------------+
| pk | c1_é日               | vc1_é日              |
+----+----------------------+----------------------+
|  1 | 1234567890           | 1234567890           |
|  2 | ???+                 | ???+                 |
|  3 | âãäåçèéêëô           | âãäåçèéêëô           |
|  4 | ?a??b????é           | ?a??b????é           |
+----+----------------------+----------------------+
4 rows in set (0.00 sec)
 
MariaDB [test1]> SELECT CHARSET('a'), @@character_set_connection;
+--------------+----------------------------+
| CHARSET('a') | @@character_set_connection |
+--------------+----------------------------+
| utf8         | utf8                       |
+--------------+----------------------------+
1 row in set (0.00 sec)

We wonder why, because with the same code Oracle MySQL works as expected.
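
The pattern of question marks in the SELECT output is what a lossy conversion to latin1 would produce: characters that exist in latin1 survive and everything else becomes ?. The following Python check is only a sketch of that observation, using the four values from the INSERT statements in the server log. Python's latin-1 codec stands in for MariaDB's latin1 here, which is an approximation, and the helper name `through_latin1` is made up for this example. The ticket does not confirm where such a conversion actually took place.

```python
# Values taken from the INSERT statements in the server log, keyed by pk.
inserted = {
    1: "1234567890",
    2: "አማር+",
    3: "âãäåçèéêëô",
    4: "日a本語bαδπአé",
}


def through_latin1(text: str) -> str:
    """Encode to latin-1, replacing unrepresentable characters with '?'."""
    return text.encode("latin-1", errors="replace").decode("latin-1")


# Apply the lossy conversion to every inserted value.
observed = {pk: through_latin1(value) for pk, value in inserted.items()}

for pk, value in observed.items():
    print(pk, value)
```

The results for rows 2 and 4 match the ???+ and ?a??b????é values shown in the table, while rows 1 and 3 come through unchanged.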

Comment by FLAESCH Sebastien [ 2017-05-23 ]

Maybe the problem came from the fact that the .my.cnf file was not read (CONC-251)?
This seems to be fixed in 10.2.6, so can this be closed as invalid?

Comment by FLAESCH Sebastien [ 2017-09-26 ]

Verified with 10.2.8: the UTF-8 client setting works, so this issue can be marked as invalid.

Comment by Elena Stepanova [ 2017-11-05 ]

Closing according to the comments above.
