[MDEV-6520] Unexpected syntax error with VIEW + gbk + 0x5C Created: 2014-08-01  Updated: 2015-07-07

Status: Open
Project: MariaDB Server
Component/s: None
Affects Version/s: 5.3.12, 5.5.38, 10.0.12
Fix Version/s: 10.1

Type: Bug Priority: Minor
Reporter: Alexander Barkov Assignee: Alexander Barkov
Resolution: Unresolved Votes: 0
Labels: None
Environment: Fedora

Issue Links:
Blocks
is blocked by MDEV-6643 Improve performance of string process... Stalled

Description

1. Open a new gnome-terminal window.
2. Set the terminal character set to GBK:
Terminal -> Character Encoding -> Simplified Chinese (GBK)

3. Run this query to verify that the character set setting took effect:

export LANG=zh_CN.gbk
mysql --default-character-set=gbk test << END
SELECT HEX('怽');
END

The expected output is:

HEX('怽')
905C

If you get a different hex code, the character set settings are wrong. Recheck the previous steps before continuing.
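
The expected bytes can also be cross-checked outside the terminal and client. The Python sketch below is only an illustration: it assumes that Python's built-in gbk codec maps this character the same way as the server's gbk character set, which this report does not state.

```python
# Encode the test character with Python's built-in "gbk" codec
# (assumed to agree with the server's gbk mapping for this character).
ch = "怽"
encoded = ch.encode("gbk")

# Upper-case hex, comparable to the output of SELECT HEX('怽').
hex_code = encoded.hex().upper()
print(hex_code)

# The second (trail) byte has the same value as an ASCII backslash.
trail_is_backslash = encoded[-1] == 0x5C
print(trail_is_backslash)
```

If the printed hex code matches the server output from step 3, the client, terminal, and connection character sets agree.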

4. Run this command:

export LANG=zh_CN.gbk
mysql --default-character-set=gbk test << END
DROP VIEW IF EXISTS v1;
CREATE VIEW v1 AS SELECT 'abcэюя痢立';
SHOW CREATE VIEW v1;
END

It works fine and prints:

View	Create View	character_set_client	collation_connection
v1	CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `v1` AS select 'abcэюя痢立' AS `abcэюя痢立`	gbk	gbk_chinese_ci

5. Now run a similar command but using a different string:

export LANG=zh_CN.gbk
mysql --default-character-set=gbk test << END
DROP VIEW IF EXISTS v1;
CREATE VIEW v1 AS SELECT '怽';
SHOW CREATE VIEW v1;
END

It fails with this error:

ERROR 1064 (42000) at line 3: You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near ''怽\' AS `\`' at line 1

The same problem occurs with any GBK multi-byte character whose second byte is 0x5C.
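
To find other test characters of this kind, the two-byte sequences with a 0x5C trail byte can be enumerated. This is a rough sketch that relies on Python's gbk codec, which may not assign exactly the same code points as the server's gbk character set. The function name is hypothetical.

```python
def gbk_chars_with_backslash_trail():
    """Return (hex, char) pairs for decodable GBK sequences ending in 0x5C."""
    found = []
    for lead in range(0x81, 0xFF):  # GBK lead bytes are 0x81..0xFE
        raw = bytes([lead, 0x5C])
        try:
            ch = raw.decode("gbk")
        except UnicodeDecodeError:
            continue  # sequence not assigned in Python's codec
        found.append((raw.hex().upper(), ch))
    return found


affected = gbk_chars_with_backslash_trail()
print(len(affected))   # number of candidates found by this codec
print(affected[:5])    # a few samples to paste into the CREATE VIEW test
```

Any character in the resulting list should be usable in place of '怽' in step 5.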

This appears to happen because String::append_for_single_quote() in sql_string.cc does not
handle multi-byte characters properly. It escapes every 0x5C byte without distinguishing between
a real backslash (ASCII 0x5C) and a 0x5C that is the second byte of a multi-byte character
(not a backslash). The extra backslash inserted after '怽' then seems to escape the closing quote
of the literal, which matches the fragment shown in the error message.
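
The difference between byte-wise and character-aware escaping can be sketched as follows. Both functions are simplified, hypothetical stand-ins and not the code in sql_string.cc: they escape only the backslash and the single quote, and the lead-byte test assumes the GBK lead-byte range 0x81..0xFE.

```python
BACKSLASH = 0x5C
QUOTE = 0x27


def escape_bytewise(data: bytes) -> bytes:
    """Escape every 0x5C and 0x27 byte, ignoring character boundaries."""
    out = bytearray()
    for b in data:
        if b in (BACKSLASH, QUOTE):
            out.append(BACKSLASH)
        out.append(b)
    return bytes(out)


def is_gbk_lead(b: int) -> bool:
    """Assumed GBK lead-byte range."""
    return 0x81 <= b <= 0xFE


def escape_gbk_aware(data: bytes) -> bytes:
    """Copy two-byte GBK characters untouched; escape only single bytes."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if is_gbk_lead(b) and i + 1 < len(data):
            out += data[i:i + 2]  # whole character, trail byte not inspected
            i += 2
            continue
        if b in (BACKSLASH, QUOTE):
            out.append(BACKSLASH)
        out.append(b)
        i += 1
    return bytes(out)


raw = bytes([0x90, 0x5C])            # the GBK bytes of the character in step 5
broken = escape_bytewise(raw)
fixed = escape_gbk_aware(raw)
print(broken.hex().upper())          # 905C5C: a stray backslash follows the character
print(fixed.hex().upper())           # 905C: the character is left intact
```

With the byte-wise variant, the quoted literal becomes 0x27 0x90 0x5C 0x5C 0x27. A GBK-aware parser reads 0x90 0x5C as one character and the following 0x5C 0x27 as an escaped quote, so the literal is never closed.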

