[MDEV-8949] COLUMN_CREATE unicode name breakage Created: 2015-10-15 Updated: 2017-11-14 Resolved: 2017-11-14 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Character Sets, Dynamic Columns |
| Affects Version/s: | 10.0.21 |
| Fix Version/s: | 10.0.34, 10.1.29, 10.2.11, 10.3.3 |
| Type: | Bug | Priority: | Major |
| Reporter: | Adam Johnson | Assignee: | Oleksandr Byelkin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Seen on both OS X 10.9 and Ubuntu 14.04 |
||
| Sprint: | 10.2.11 |
| Description |
|
Possibly related to When calling column_create with names set to utf8, one can successfully
However, if the connection is set to utf8mb4, this actually fails,
Other unicode characters work fine though:
|
| Comments |
| Comment by Elena Stepanova [ 2015-10-20 ] | |||||||||||||||||||||
|
Reproducible as described. There is also a simpler example which does not involve dynamic columns but might have the same root cause (or not?):
Note the difference in the column name. Assigning to bar who should be able to shed some light on it. | |||||||||||||||||||||
| Comment by Adam Johnson [ 2015-10-21 ] | |||||||||||||||||||||
|
Seems related - emojis becoming ? on utf8mb4 with mysqldump: | |||||||||||||||||||||
| Comment by Alexander Barkov [ 2017-10-07 ] | |||||||||||||||||||||
|
The problem happens because the column_json-related code in Item_func_dyncol_json and in mysys/ma_dyncol.c uses &my_charset_utf8_general_ci, which supports only Unicode characters in the BMP range U+0000..U+FFFF. Emoji are outside this range. Perhaps it should be fixed to use &my_charset_utf8mb4_general_ci instead, but I'm not sure. | |||||||||||||||||||||
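To illustrate the BMP limit described in the comment above, here is a minimal Python sketch. It is not MariaDB code; the helper names `fits_in_bmp` and `utf8_byte_lengths` are hypothetical and exist only for this illustration. It shows that characters up to U+FFFF need at most 3 bytes in UTF-8, while an emoji such as U+1F600 lies outside the BMP and needs 4 bytes, which a 3-byte utf8 character set cannot represent.

```python
def fits_in_bmp(text: str) -> bool:
    """Return True if every code point is in the BMP (U+0000..U+FFFF).

    Hypothetical helper: approximates the range that a 3-byte utf8
    character set can represent, as described in the comment above.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


def utf8_byte_lengths(text: str) -> list:
    """Return the UTF-8 encoded length in bytes of each character."""
    return [len(ch.encode("utf-8")) for ch in text]


if __name__ == "__main__":
    samples = {
        "ascii": "a",                   # U+0061
        "latin": "\u00e9",              # U+00E9, e with acute accent
        "euro": "\u20ac",               # U+20AC, euro sign
        "emoji": "\U0001F600",          # U+1F600, outside the BMP
    }
    for label, value in samples.items():
        print(label, hex(ord(value)), utf8_byte_lengths(value), fits_in_bmp(value))
```

Only the last sample falls outside the BMP, which matches the report that other Unicode characters work while emoji do not.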
| Comment by Alexander Barkov [ 2017-10-07 ] | |||||||||||||||||||||
|
It seems Jira does not support non-BMP characters.
| |||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2017-11-13 ] | |||||||||||||||||||||
|
revision-id: e22c33e3f014ffc4d7c08d6830f710c19f1aff90 (mariadb-10.0.33-17-ge22c33e3f01)
Use utf-mb4 if it is possible. — | |||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2017-11-13 ] | |||||||||||||||||||||
|
github tree: bb-10.0- | |||||||||||||||||||||
| Comment by Alexander Barkov [ 2017-11-14 ] | |||||||||||||||||||||
|
There is one more character set related problem with COLUMN_LIST() and COLUMN_GET():
Note that these functions create longblob columns. | |||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2017-11-14 ] | |||||||||||||||||||||
|
revision-id: 2913f615f050f356f7be178e5d91650b86b33e4e (mariadb-10.0.33-17-g2913f615f05)
Use utf-mb4 if it is possible. — | |||||||||||||||||||||
| Comment by Alexander Barkov [ 2017-11-14 ] | |||||||||||||||||||||
|
This patch is OK to push: revision-id: 2913f615f050f356f7be178e5d91650b86b33e4e (mariadb-10.0.33-17-g2913f615f05) |