[MDEV-19059] Misleading ER_INVALID_CHARACTER_STRING message for DDL statements Created: 2019-03-27 Updated: 2023-04-27 |
|
| Status: | Confirmed |
| Project: | MariaDB Server |
| Component/s: | Character Sets, Data Definition - Alter Table, Data Definition - Create Table |
| Affects Version/s: | 10.2, 10.3, 10.4 |
| Fix Version/s: | 10.4 |
| Type: | Bug | Priority: | Major |
| Reporter: | Marko Mäkelä | Assignee: | Alexander Barkov |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | character-set, error | ||
| Description |
|
The system_charset_info that MariaDB uses for schema object names (such as tables, indexes and columns) only supports the Basic Multilingual Plane (Unicode code points below U+10000), that is, at most 3 bytes per character in UTF-8. In MariaDB 10.0 and 10.1, the test innodb.innodb-alter returns the following error message for 4-byte UTF-8 data:
This is somewhat misleading, because the quoted byte sequence is valid UTF-8. One could still argue that utf8 is an alias for utf8mb3, so the message is technically not wrong. However, starting with MariaDB 10.2, the error message is unambiguously misleading:
The 4-byte sequence is perfectly valid in utf8mb4. The error message should convey that while this sequence is valid utf8mb4, only utf8mb3 is accepted in a schema object name. At the very least, the message should say utf8mb3 instead of utf8mb4. It would be better to say something like ‘invalid schema object name’. Unfortunately, I cannot quote the exact statement that produces the above error, because Jira does not allow 4-byte UTF-8 data. Here is a stand-alone test case that works around the Jira limitation:
This will return the same misleading error message that is returned for the statement in innodb.innodb-alter:
|
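To illustrate the distinction drawn in the description between "valid UTF-8" and "representable in utf8mb3", here is a minimal Python sketch. It is not MariaDB code and does not reproduce the server's check. The helper names `utf8_lengths` and `fits_utf8mb3` are hypothetical, and U+1F600 is an arbitrary supplementary-plane character chosen for the example, not necessarily the one used in innodb.innodb-alter. The literals use escapes so that the source itself stays within the BMP.

```python
# Illustration only: which strings fit in a 3-byte-per-character UTF-8
# subset (the Basic Multilingual Plane), as described for schema object names.

BMP_LIMIT = 0x10000  # code points below this are in the Basic Multilingual Plane


def utf8_lengths(text):
    """Return the UTF-8 encoded length, in bytes, of each character."""
    return [len(ch.encode("utf-8")) for ch in text]


def fits_utf8mb3(text):
    """Return True if every character needs at most 3 bytes in UTF-8."""
    return all(ord(ch) < BMP_LIMIT for ch in text)


bmp_name = "t\u00e4ble"            # contains U+00E4, a 2-byte character
supplementary_name = "t\U0001F600"  # contains U+1F600, a 4-byte character

for name in (bmp_name, supplementary_name):
    encoded = name.encode("utf-8")
    # Decoding succeeds for both names: both byte sequences are valid UTF-8.
    assert encoded.decode("utf-8") == name
    print(encoded, utf8_lengths(name), fits_utf8mb3(name))
```

Both names round-trip through UTF-8 without error, but only the first stays within 3 bytes per character. That is the case the error message should describe: a valid 4-byte sequence that is not allowed in a schema object name.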