Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
10.3, 10.4, 10.5
-
None
Description
There is substantial code duplication between:
- Lex_input_stream::scan_ident_start()
- Lex_input_stream::scan_ident_sysvar()
The latter seems to be buggy, as it does not handle bad bytes correctly.
This can be demonstrated in the following queries:
EXECUTE IMMEDIATE CONCAT('SELECT @@session',0xC2,'.autocommit');
ERROR 1300 (HY000): Invalid utf8 character string: 'session\xC2'

EXECUTE IMMEDIATE CONCAT('SELECT @@session',0xFF,'.autocommit');
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '?.autocommit' at line 1
The former error message is wrong. The 0xC2 byte is erroneously scanned as part of the identifier, although it is not followed by a valid multi-byte tail.
The latter error message is correct. 0xFF cannot be part of a UTF-8 sequence, so the tokenizer scans 'session' as expected, then fails to get a token during the next lex_one_token() call.
The wrong method scan_ident_sysvar() should be removed, and scan_ident_start() should be used instead.
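The expected behavior can be illustrated with a minimal, self-contained sketch. This is not the server's code: utf8_char_len() and scan_ident_end() are hypothetical helpers written for this illustration, and the UTF-8 check is simplified (it does not reject overlong 3- and 4-byte forms or surrogates). The point is only that a lead byte such as 0xC2 must end the identifier unless a valid tail byte follows it, exactly as a byte such as 0xFF does.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: true if c is a UTF-8 continuation (tail) byte.
static bool is_tail(unsigned char c) { return (c & 0xC0) == 0x80; }

// Hypothetical helper: byte length of the UTF-8 character starting at
// s[pos], or 0 if the bytes there do not form a complete sequence.
// Simplified: overlong 3/4-byte forms and surrogates are not rejected.
std::size_t utf8_char_len(const std::string &s, std::size_t pos) {
  if (pos >= s.size()) return 0;
  unsigned char c0 = static_cast<unsigned char>(s[pos]);
  if (c0 < 0x80) return 1;   // ASCII
  if (c0 < 0xC2) return 0;   // stray tail byte, or overlong lead 0xC0/0xC1
  std::size_t need = c0 < 0xE0 ? 2 : c0 < 0xF0 ? 3 : c0 < 0xF5 ? 4 : 0;
  if (need == 0 || pos + need > s.size()) return 0;  // 0xF5..0xFF or truncated
  for (std::size_t i = 1; i < need; ++i)
    if (!is_tail(static_cast<unsigned char>(s[pos + i]))) return 0;
  return need;
}

// Hypothetical scanner: returns the offset just past the identifier that
// starts at pos. A bad byte ends the identifier instead of being consumed.
std::size_t scan_ident_end(const std::string &s, std::size_t pos) {
  while (pos < s.size()) {
    unsigned char c = static_cast<unsigned char>(s[pos]);
    if (c < 0x80) {
      if (!(std::isalnum(c) || c == '_' || c == '$')) break;
      ++pos;
    } else {
      std::size_t len = utf8_char_len(s, pos);
      if (len == 0) break;   // 0xC2 without a tail, 0xFF, and so on
      pos += len;
    }
  }
  return pos;
}
```

With this sketch, scanning 'session' followed by either 0xC2 or 0xFF and then '.autocommit' stops right after 'session' in both cases, so both inputs would be reported by the same code path.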
Issue Links
- is blocked by MDEV-23037 Multibyte character sets parse identifiers slow (Open)