Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
5.5(EOL), 10.0(EOL), 10.1(EOL), 10.2(EOL)
-
None
Description
I start a 8-bit console.
In gnome-terminal I make sure that the console character set is properly set to iso-8859-1:
Terminal -> Character Set Encoding -> Western (ISO-8859-1).
Now I run this command in shell:
$ LANG=en_US.iso88591 mysql test
|
and check that the client correctly detected the session character set as latin1:
SHOW VARIABLES LIKE 'character\_set\_%'; |
+--------------------------+--------+
|
| Variable_name | Value |
|
+--------------------------+--------+
|
| character_set_client | latin1 | <--
|
| character_set_connection | latin1 | <--
|
| character_set_database | latin1 |
|
| character_set_filesystem | binary |
|
| character_set_results | latin1 | <--
|
| character_set_server | latin1 |
|
| character_set_system | utf8 |
|
+--------------------------+--------+
|
Now I run:
SELECT HEX('ä'); |
+----------+
|
| HEX('ä') |
|
+----------+
|
| E4 |
|
+----------+
|
It correctly returns 0xE4, which is a latin1 code for "U+00E4 SMALL LETTER A WITH DIAERESIS".
Now I use a prepared statement with a user variable as a source:
SET @src='SELECT HEX(''ä'')'; |
PREPARE stmt FROM @src; EXECUTE stmt; |
+----------+
|
| HEX('ä') |
|
+----------+
|
| E4 |
|
+----------+
|
It also returns the same latin1 code. So far so good.
Now I use a prepared statement with a string literal as a source:
PREPARE stmt FROM 'SELECT HEX(''ä'')'; |
EXECUTE stmt; |
+-----------+
|
| HEX('ä') |
|
+-----------+
|
| C3A4 |
|
+-----------+
|
It returns a different result. C3A4 is a utf8 code for "U+00E4 SMALL LETTER A WITH DIAERESIS". Conversion from latin1 to utf8 happened.
Attachments
Activity
Field | Original Value | New Value |
---|---|---|
Description |
I start a 8-bit console.
In gnome-terminal I make sure that the console character set is properly set to iso-8859-1: Terminal -> Character Set Encoding -> Western (ISO-8859-1). Now I run this command in shell: {noformat} $ LANG=en_US.iso88591 mysql test {noformat} and check that the cloent correctly detected the session character set as latin1: {code:sql} SHOW VARIABLES LIKE 'character\_set\_%'; {code} {noformat} +--------------------------+--------+ | Variable_name | Value | +--------------------------+--------+ | character_set_client | latin1 | <-- | character_set_connection | latin1 | <-- | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | latin1 | <-- | character_set_server | latin1 | | character_set_system | utf8 | +--------------------------+--------+ {noformat} Now I run: {code:sql} SELECT HEX('ä'); {code} {noformat} +----------+ | HEX('ä') | +----------+ | E4 | +----------+ {noformat} It correctly returns 0xE4, which is a latin1 code for "U+00E4 SMALL LETTER A WITH DIAERESIS". Now I use a prepared statement with a user variable as a source: {code:sql} SET @src='SELECT HEX(''ä'')'; PREPARE stmt FROM @src; EXECUTE stmt; {code} {noformat} +----------+ | HEX('ä') | +----------+ | E4 | +----------+ {noformat} It also returns the same latin1 code. So far so good. Now I use a prepared statement with a string literal as a source: {code:sql} PREPARE stmt FROM 'SELECT HEX(''ä'')'; EXECUTE stmt; {code} {noformat} +-----------+ | HEX('ä') | +-----------+ | C3A4 | +-----------+ {noformat} It returns a different result. C3A4 is a utf8 code for "U+00E4 SMALL LETTER A WITH DIAERESIS". Conversion from latin1 to utf8 happened. |
I start a 8-bit console.
In gnome-terminal I make sure that the console character set is properly set to iso-8859-1: Terminal -> Character Set Encoding -> Western (ISO-8859-1). Now I run this command in shell: {noformat} $ LANG=en_US.iso88591 mysql test {noformat} and check that the client correctly detected the session character set as latin1: {code:sql} SHOW VARIABLES LIKE 'character\_set\_%'; {code} {noformat} +--------------------------+--------+ | Variable_name | Value | +--------------------------+--------+ | character_set_client | latin1 | <-- | character_set_connection | latin1 | <-- | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | latin1 | <-- | character_set_server | latin1 | | character_set_system | utf8 | +--------------------------+--------+ {noformat} Now I run: {code:sql} SELECT HEX('ä'); {code} {noformat} +----------+ | HEX('ä') | +----------+ | E4 | +----------+ {noformat} It correctly returns 0xE4, which is a latin1 code for "U+00E4 SMALL LETTER A WITH DIAERESIS". Now I use a prepared statement with a user variable as a source: {code:sql} SET @src='SELECT HEX(''ä'')'; PREPARE stmt FROM @src; EXECUTE stmt; {code} {noformat} +----------+ | HEX('ä') | +----------+ | E4 | +----------+ {noformat} It also returns the same latin1 code. So far so good. Now I use a prepared statement with a string literal as a source: {code:sql} PREPARE stmt FROM 'SELECT HEX(''ä'')'; EXECUTE stmt; {code} {noformat} +-----------+ | HEX('ä') | +-----------+ | C3A4 | +-----------+ {noformat} It returns a different result. C3A4 is a utf8 code for "U+00E4 SMALL LETTER A WITH DIAERESIS". Conversion from latin1 to utf8 happened. |
Status | Open [ 1 ] | In Progress [ 3 ] |
Fix Version/s | 10.3.0 [ 22127 ] | |
Fix Version/s | 10.2 [ 14601 ] | |
Resolution | Fixed [ 1 ] | |
Status | In Progress [ 3 ] | Closed [ 6 ] |
Fix Version/s | 10.2.3 [ 22115 ] |
Component/s | Prepared Statements [ 10804 ] |
Workflow | MariaDB v3 [ 77177 ] | MariaDB v4 [ 150950 ] |
A similar problem is repeatable using a non-BMP character.
Due to a bug in JIRA, the real BMP character was replaced to 'X' in the below script.
To get the real character displayed on a utf8 Linux console, run the following query:
Then copy and paste the character instead of all 'X' below.
1. Convensional execution works fine:
+------+
| c |
+------+
| X |
+------+
2. Prepared execution using a user variable as a source also works fine:
+------+
| c |
+------+
| X |
+------+
3. Prepared execution using a string literal as a source return a wrong result:
+---+
| c |
+---+
| ? |
+---+