[MDEV-23037] Multibyte character sets parse identifiers slow Created: 2020-06-29 Updated: 2020-08-01 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Character Sets, Parser |
| Affects Version/s: | 10.3, 10.4, 10.5 |
| Fix Version/s: | 10.5 |
| Type: | Bug | Priority: | Major |
| Reporter: | Alexander Barkov | Assignee: | Alexander Barkov |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||
| Description |
|
Note: the problem should also be reproducible in versions before 10.3, but those versions do not support EXECUTE IMMEDIATE. Lex_input_stream::scan_ident_start() calls charlen() excessively when the character set is multi-byte.
I ran an SQL statement containing a large number of identifiers that consist only of ASCII letters. With multi-byte character sets it runs much slower than with latin1.
We should consider adding a new virtual function to MY_CHARSET_HANDLER that scans an identifier in one shot. |
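
A minimal sketch of the idea, not the actual MariaDB code: the class names, the scan_ident() signature, and the call counter below are all hypothetical, and the UTF-8 validation and the set of ASCII identifier characters are simplified assumptions. The default scan_ident() makes one virtual charlen() call per character, while the override skips runs of ASCII identifier bytes in a tight loop and calls charlen() only for bytes >= 0x80. Any valid multi-byte character is treated as an identifier character here.

```cpp
#include <cassert>
#include <cstddef>

// Simplified assumption: ASCII letters, digits, '_' and '$' may appear in identifiers.
static bool is_ascii_ident_char(unsigned char c) {
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
         (c >= '0' && c <= '9') || c == '_' || c == '$';
}

// Hypothetical stand-in for a charset handler; NOT the real MY_CHARSET_HANDLER.
struct CharsetHandlerSketch {
  virtual ~CharsetHandlerSketch() = default;

  // Byte length of the character at s, or 0 if it is invalid or truncated.
  virtual std::size_t charlen(const char *s, const char *end) const = 0;

  // Default scan: one virtual charlen() call per character.
  // Returns the identifier length in bytes.
  virtual std::size_t scan_ident(const char *s, const char *end) const {
    assert(s <= end);
    const char *p = s;
    while (p < end) {
      std::size_t len = charlen(p, end);
      if (len == 0) break;  // invalid or truncated sequence
      if (len == 1 && !is_ascii_ident_char((unsigned char)*p)) break;
      p += len;
    }
    return (std::size_t)(p - s);
  }
};

// UTF-8 handler with simplified validation (lead byte range plus
// continuation-byte checks only). Uses the default per-character scan.
struct Utf8HandlerSketch : CharsetHandlerSketch {
  mutable std::size_t charlen_calls = 0;  // instrumentation for this sketch only

  std::size_t charlen(const char *s, const char *end) const override {
    ++charlen_calls;
    if (s >= end) return 0;
    unsigned char c = (unsigned char)*s;
    std::size_t len = c < 0x80 ? 1
                    : (c >= 0xC2 && c <= 0xDF) ? 2
                    : (c >= 0xE0 && c <= 0xEF) ? 3
                    : (c >= 0xF0 && c <= 0xF4) ? 4 : 0;
    if (len == 0 || (std::size_t)(end - s) < len) return 0;
    for (std::size_t i = 1; i < len; ++i)
      if (((unsigned char)s[i] & 0xC0) != 0x80) return 0;
    return len;
  }
};

// Same charset, but the identifier is scanned in one call with an ASCII fast path.
struct FastUtf8HandlerSketch : Utf8HandlerSketch {
  std::size_t scan_ident(const char *s, const char *end) const override {
    assert(s <= end);
    const char *p = s;
    for (;;) {
      // Fast path: skip ASCII identifier bytes without calling charlen().
      while (p < end && is_ascii_ident_char((unsigned char)*p)) ++p;
      // Stop at end of input or at an ASCII byte that is not an identifier char.
      if (p == end || (unsigned char)*p < 0x80) break;
      // Slow path: only non-ASCII lead bytes need charlen().
      std::size_t len = charlen(p, end);
      if (len == 0) break;
      p += len;
    }
    return (std::size_t)(p - s);
  }
};
```

Both handlers should return the same identifier length for the same input. For an identifier made only of ASCII letters, the fast path in this sketch never calls charlen() at all, which is the case the report describes as slow.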