[MDEV-16750] JSON_SET mishandles unicode every second pair of arguments Created: 2018-07-13  Updated: 2018-08-06  Resolved: 2018-08-06

Status: Closed
Project: MariaDB Server
Component/s: JSON
Affects Version/s: 10.2.15, 10.2, 10.3
Fix Version/s: 10.2.17

Type: Bug
Priority: Major
Reporter: David Flatz
Assignee: Alexey Botchkov
Resolution: Fixed
Votes: 0
Labels: None
Environment: openSUSE Tumbleweed 20180530



 Description   


> show variables like "char%";
+--------------------------+------------------------------+
| Variable_name            | Value                        |
+--------------------------+------------------------------+
| character_set_client     | utf8mb4                      |
| character_set_connection | utf8mb4                      |
| character_set_database   | utf8mb4                      |
| character_set_filesystem | binary                       |
| character_set_results    | utf8mb4                      |
| character_set_server     | utf8mb4                      |
| character_set_system     | utf8                         |
| character_sets_dir       | /usr/share/mariadb/charsets/ |
+--------------------------+------------------------------+

In the terminal, ö is encoded as UTF-8 (two bytes):

$ echo -n "ö" | od -c
0000000 303 266
0000002
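
The octal values printed by `od` can be cross-checked outside the shell. The following is a minimal Python sketch that says nothing about MariaDB itself; the helper name `octal_bytes` is made up for illustration. It prints the bytes of ö (U+00F6) in UTF-8 and in Latin-1:

```python
def octal_bytes(text: str, encoding: str) -> str:
    """Return the encoded bytes of `text` as space-separated octal values."""
    return " ".join(f"{b:03o}" for b in text.encode(encoding))


# U+00F6 is written as an escape so the result does not depend on
# the encoding of this source file or of the terminal.
print(octal_bytes("\u00f6", "utf-8"))    # 303 266
print(octal_bytes("\u00f6", "latin-1"))  # 366
```

The two-byte sequence `303 266` is the UTF-8 form seen in the dump above; the single byte `366` is the Latin-1 form.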

JSON_SET with only one pair of arguments works fine:

$ mysql -N -e "SELECT JSON_SET('{}', '$.a', 'ö');" | od -c
0000000   {   "   a   "   :       " 303 266   "   }  \n
0000014

With two pairs, the second ö comes back as a single Latin-1 byte (octal 366) instead of the two-byte UTF-8 sequence:

$ mysql -N -e "SELECT JSON_SET('{}', '$.a', 'ö', '$.b', 'ö');" | od -c
0000000   {   "   a   "   :       " 303 266   "   ,       "   b   "   :
0000020       " 366   "   }  \n
0000026
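
To confirm that the returned bytes are no longer valid UTF-8, the dump can be replayed in Python. This is only a sketch: `raw` is transcribed by hand from the `od -c` output above, and `first_invalid_utf8_offset` is a hypothetical helper, not part of any MariaDB tooling.

```python
# Bytes transcribed from the `od -c` dump above, using octal escapes.
raw = b'{"a": "\303\266", "b": "\366"}\n'


def first_invalid_utf8_offset(data: bytes):
    """Return the byte offset where UTF-8 decoding fails, or None if valid."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


print(len(raw))                        # 22, matching the final od offset 0000026 (octal)
print(first_invalid_utf8_offset(raw))  # 18, the lone 366 byte
```

The offset is a byte index into this particular dump; it is not meant to match any position reported by the server.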

As a result, adding a third pair makes the call fail:

> SELECT JSON_SET('{}', '$.a', 'ö', '$.b', 'ö', '$.c', 1);
+----------------------------------------------------+
| JSON_SET('{}', '$.a', 'ö', '$.b', 'ö', '$.c', 1)   |
+----------------------------------------------------+
| NULL                                               |
+----------------------------------------------------+
1 row in set, 1 warning (0.00 sec)
 
> show warnings;
+---------+------+------------------------------------------------------------------------+
| Level   | Code | Message                                                                |
+---------+------+------------------------------------------------------------------------+
| Warning | 4035 | Broken JSON string in argument 1 to function 'json_set' at position 16 |
+---------+------+------------------------------------------------------------------------+
1 row in set (0.00 sec)

However, if non-ASCII characters are used only in every other pair, the call succeeds:

> SELECT JSON_SET('{}', '$.a', 'ö', '$.c', 1, '$.b', 'ö');
+----------------------------------------------------+
| JSON_SET('{}', '$.a', 'ö', '$.c', 1, '$.b', 'ö')   |
+----------------------------------------------------+
| {"a": "ö", "c": 1, "b": "ö"}                       |
+----------------------------------------------------+
1 row in set (0.00 sec)
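
Since the outcome seems to depend on which argument pairs carry the non-ASCII value, it can help to generate the test statements mechanically. The sketch below uses a hypothetical helper, `json_set_query`, which assigns keys `a`, `b`, `c`, ... in order and does not escape quotes inside values, so it is only suitable for simple literals like the ones in this report:

```python
def json_set_query(values):
    """Build a JSON_SET statement with one '$.<key>' path per value.

    Keys are assigned as a, b, c, ... in argument order. Integers are
    emitted bare; strings are wrapped in single quotes without escaping.
    """
    args = ["'{}'"]
    for index, value in enumerate(values):
        key = chr(ord("a") + index)
        literal = str(value) if isinstance(value, int) else "'" + value + "'"
        args.append(f"'$.{key}'")
        args.append(literal)
    return "SELECT JSON_SET(" + ", ".join(args) + ");"


# Non-ASCII value in adjacent pairs, as in the failing statement above.
print(json_set_query(["\u00f6", "\u00f6", 1]))
# Non-ASCII value in every other pair.
print(json_set_query(["\u00f6", 1, "\u00f6"]))
```

The generated statements can then be passed to `mysql -N -e` in the same way as the hand-written ones.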

Something similar happens with 4-byte (utf8mb4) characters. The example uses U+1F62B TIRED FACE; Jira does not handle those characters well either, so I replaced it with XXXX:

$ echo -n "XXXX" | od -c
0000000 360 237 230 253
0000004
$ mysql -N -e "SELECT JSON_SET('{}', '$.a', 'XXXX');" | od -c
0000000   {   "   a   "   :       " 360 237 230 253   "   }  \n
0000016
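
The four octal bytes above can be checked against the code point directly. The Python sketch below only illustrates the encodings involved and makes no claim about what the server does internally; it also shows that U+1F62B has no Latin-1 representation at all.

```python
tired_face = "\U0001F62B"  # U+1F62B TIRED FACE

# UTF-8 bytes as octal, in the same notation `od -c` uses for non-ASCII bytes.
utf8_octal = " ".join(f"{b:03o}" for b in tired_face.encode("utf-8"))
print(utf8_octal)  # 360 237 230 253

# Latin-1 covers only U+0000..U+00FF, so strict encoding must fail.
try:
    tired_face.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print(latin1_ok)  # False
```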

However, with two pairs the call fails:

> SELECT JSON_SET('{}', '$.a', 'XXXX', '$.b', 'XXXX');
+----------------------------------------+
| JSON_SET('{}', '$.a', '?', '$.b', '?') |
+----------------------------------------+
| NULL                                   |
+----------------------------------------+
1 row in set, 1 warning (0.00 sec)
 
> show warnings;
+---------+------+-------------------------------------------------------------------------------+
| Level   | Code | Message                                                                       |
+---------+------+-------------------------------------------------------------------------------+
| Warning | 4038 | Syntax error in JSON text in argument 1 to function 'json_set' at position 24 |
+---------+------+-------------------------------------------------------------------------------+
1 row in set (0.00 sec)
 
> SELECT JSON_SET('{}', '$.a', 'XXXX', '$.b', 1, '$.c', 'XXXX');
+--------------------------------------------------+
| JSON_SET('{}', '$.a', '?', '$.b', 1, '$.c', '?') |
+--------------------------------------------------+
| {"a": "XXXX", "b": 1, "c": "XXXX"}               |
+--------------------------------------------------+
1 row in set (0.00 sec)



 Comments   
Comment by Elena Stepanova [ 2018-07-16 ]

Thanks for the report.

Comment by Alexey Botchkov [ 2018-08-06 ]

http://lists.askmonty.org/pipermail/commits/2018-August/012775.html
