[MDEV-10867] PREPARE..EXECUTE is not consistent about non-ASCII characters - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 5.5(EOL), 10.0(EOL), 10.1(EOL), 10.2(EOL)
Fix Version/s: 10.2.3, 10.3.0
Component/s: Character Sets, Prepared Statements
Labels:
None

Description

I start a 8-bit console.
In gnome-terminal I make sure that the console character set is properly set to iso-8859-1:
Terminal -> Character Set Encoding -> Western (ISO-8859-1).

Now I run this command in shell:

$ LANG=en_US.iso88591 mysql test

and check that the client correctly detected the session character set as latin1:

SHOW VARIABLES LIKE 'character\_set\_%';

+--------------------------+--------+

| Variable_name            | Value  |

+--------------------------+--------+

| character_set_client     | latin1 | <--

| character_set_connection | latin1 | <--

| character_set_database   | latin1 |

| character_set_filesystem | binary |

| character_set_results    | latin1 | <--

| character_set_server     | latin1 |

| character_set_system     | utf8   |

+--------------------------+--------+

Now I run:

SELECT HEX('ä');

+----------+

| HEX('ä') |

+----------+

| E4       |

+----------+

It correctly returns 0xE4, which is a latin1 code for "U+00E4 SMALL LETTER A WITH DIAERESIS".

Now I use a prepared statement with a user variable as a source:

SET @src='SELECT HEX(''ä'')';

PREPARE stmt FROM @src; EXECUTE stmt;

+----------+

| HEX('ä') |

+----------+

| E4       |

+----------+

It also returns the same latin1 code. So far so good.

Now I use a prepared statement with a string literal as a source:

PREPARE stmt FROM 'SELECT HEX(''ä'')';

EXECUTE stmt;

+-----------+

| HEX('Ã¤') |

+-----------+

| C3A4      |

+-----------+

It returns a different result. C3A4 is a utf8 code for "U+00E4 SMALL LETTER A WITH DIAERESIS". Conversion from latin1 to utf8 happened.

Attachments

Activity

Ascending order - Click to sort in descending order

Alexander Barkov created issue - 2016-09-22 11:46

Alexander Barkov made changes - 2016-09-22 11:46

Field	Original Value	New Value
Description	I start a 8-bit console. In gnome-terminal I make sure that the console character set is properly set to iso-8859-1: Terminal -> Character Set Encoding -> Western (ISO-8859-1). Now I run this command in shell: {noformat} $ LANG=en_US.iso88591 mysql test {noformat} and check that the cloent correctly detected the session character set as latin1: {code:sql} SHOW VARIABLES LIKE 'character\_set\_%'; {code} {noformat} +--------------------------+--------+ \| Variable_name \| Value \| +--------------------------+--------+ \| character_set_client \| latin1 \| <-- \| character_set_connection \| latin1 \| <-- \| character_set_database \| latin1 \| \| character_set_filesystem \| binary \| \| character_set_results \| latin1 \| <-- \| character_set_server \| latin1 \| \| character_set_system \| utf8 \| +--------------------------+--------+ {noformat} Now I run: {code:sql} SELECT HEX('ä'); {code} {noformat} +----------+ \| HEX('ä') \| +----------+ \| E4 \| +----------+ {noformat} It correctly returns 0xE4, which is a latin1 code for "U+00E4 SMALL LETTER A WITH DIAERESIS". Now I use a prepared statement with a user variable as a source: {code:sql} SET @src='SELECT HEX(''ä'')'; PREPARE stmt FROM @src; EXECUTE stmt; {code} {noformat} +----------+ \| HEX('ä') \| +----------+ \| E4 \| +----------+ {noformat} It also returns the same latin1 code. So far so good. Now I use a prepared statement with a string literal as a source: {code:sql} PREPARE stmt FROM 'SELECT HEX(''ä'')'; EXECUTE stmt; {code} {noformat} +-----------+ \| HEX('Ã¤') \| +-----------+ \| C3A4 \| +-----------+ {noformat} It returns a different result. C3A4 is a utf8 code for "U+00E4 SMALL LETTER A WITH DIAERESIS". Conversion from latin1 to utf8 happened.	I start a 8-bit console. In gnome-terminal I make sure that the console character set is properly set to iso-8859-1: Terminal -> Character Set Encoding -> Western (ISO-8859-1). Now I run this command in shell: {noformat} $ LANG=en_US.iso88591 mysql test {noformat} and check that the client correctly detected the session character set as latin1: {code:sql} SHOW VARIABLES LIKE 'character\_set\_%'; {code} {noformat} +--------------------------+--------+ \| Variable_name \| Value \| +--------------------------+--------+ \| character_set_client \| latin1 \| <-- \| character_set_connection \| latin1 \| <-- \| character_set_database \| latin1 \| \| character_set_filesystem \| binary \| \| character_set_results \| latin1 \| <-- \| character_set_server \| latin1 \| \| character_set_system \| utf8 \| +--------------------------+--------+ {noformat} Now I run: {code:sql} SELECT HEX('ä'); {code} {noformat} +----------+ \| HEX('ä') \| +----------+ \| E4 \| +----------+ {noformat} It correctly returns 0xE4, which is a latin1 code for "U+00E4 SMALL LETTER A WITH DIAERESIS". Now I use a prepared statement with a user variable as a source: {code:sql} SET @src='SELECT HEX(''ä'')'; PREPARE stmt FROM @src; EXECUTE stmt; {code} {noformat} +----------+ \| HEX('ä') \| +----------+ \| E4 \| +----------+ {noformat} It also returns the same latin1 code. So far so good. Now I use a prepared statement with a string literal as a source: {code:sql} PREPARE stmt FROM 'SELECT HEX(''ä'')'; EXECUTE stmt; {code} {noformat} +-----------+ \| HEX('Ã¤') \| +-----------+ \| C3A4 \| +-----------+ {noformat} It returns a different result. C3A4 is a utf8 code for "U+00E4 SMALL LETTER A WITH DIAERESIS". Conversion from latin1 to utf8 happened.

Alexander Barkov added a comment - 2016-09-23 10:35 - edited

A similar problem is repeatable using a non-BMP character.
Due to a bug in JIRA, the real BMP character was replaced to 'X' in the below script.
To get the real character displayed on a utf8 Linux console, run the following query:

SELECT _utf8mb4 0xF09F988E;

Then copy and paste the character instead of all 'X' below.

1. Convensional execution works fine:

SET NAMES utf8mb4;

SELECT 'X' AS c;

+------+

| c    |

+------+

| X     |

+------+

2. Prepared execution using a user variable as a source also works fine:

SET @src='SELECT ''X'' AS c';

PREPARE stmt FROM @src;

EXECUTE stmt;

+------+

| c    |

+------+

| X     |

+------+

3. Prepared execution using a string literal as a source return a wrong result:

PREPARE stmt FROM 'SELECT ''X'' AS c';

EXECUTE stmt;

+---+

| c |

+---+

| ? |

+---+

Alexander Barkov added a comment - 2016-09-23 10:35 - edited A similar problem is repeatable using a non-BMP character. Due to a bug in JIRA, the real BMP character was replaced to 'X' in the below script. To get the real character displayed on a utf8 Linux console, run the following query: SELECT _utf8mb4 0xF09F988E; Then copy and paste the character instead of all 'X' below. 1. Convensional execution works fine: SET NAMES utf8mb4; SELECT 'X' AS c; +------+ | c | +------+ | X | +------+ 2. Prepared execution using a user variable as a source also works fine: SET @src= 'SELECT ' 'X' ' AS c' ; PREPARE stmt FROM @src; EXECUTE stmt; +------+ | c | +------+ | X | +------+ 3. Prepared execution using a string literal as a source return a wrong result: PREPARE stmt FROM 'SELECT ' 'X' ' AS c' ; EXECUTE stmt; +---+ | c | +---+ | ? | +---+

Alexander Barkov made changes - 2016-10-04 16:23

Status

Open [ 1 ]

In Progress [ 3 ]

Alexander Barkov made changes - 2016-10-04 16:24

Fix Version/s		10.3.0 [ 22127 ]
Fix Version/s	10.2 [ 14601 ]
Resolution		Fixed [ 1 ]
Status	In Progress [ 3 ]	Closed [ 6 ]

Alexander Barkov made changes - 2016-10-08 09:09

Fix Version/s

10.2.3 [ 22115 ]

Alexander Barkov made changes - 2017-04-09 03:22

Component/s

Prepared Statements [ 10804 ]

Sergei Golubchik made changes - 2021-12-06 21:43

Workflow

MariaDB v3 [ 77177 ]

MariaDB v4 [ 150950 ]

People

Assignee:: Alexander Barkov

Reporter:: Alexander Barkov

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Dates

Created:: 2016-09-22 11:46

Updated:: 2017-04-09 03:22

Resolved:: 2016-10-04 16:24

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server