[MDEV-31539] JSON returns incorrect result when character_set_database is set to certain character sets Created: 2023-06-26  Updated: 2023-11-28

Status: Open
Project: MariaDB Server
Component/s: JSON
Affects Version/s: 10.9, 10.10, 10.11, 11.0, 11.1, 11.2
Fix Version/s: 10.11, 11.0, 11.1

Type: Bug Priority: Major
Reporter: Ramesh Sivaraman Assignee: Rucha Deodhar
Resolution: Unresolved Votes: 0
Labels: None


 Description   

Testcase

11.2.0-dbg>SET character_set_database=ucs2;
Query OK, 0 rows affected (0.000 sec)
 
11.2.0-dbg>SET CHARACTER SET DEFAULT;
Query OK, 0 rows affected (0.000 sec)
 
11.2.0-dbg>SELECT JSON_OVERLAPS('{"A": 1, "B": {"C":2}}', '{"A": 2, "B": {"C":2}}') AS is_overlap;
+------------+
| is_overlap |
+------------+
|          0 |
+------------+
1 row in set (0.000 sec)
 
11.2.0-dbg>

Expected result

11.2.0-dbg>SELECT JSON_OVERLAPS('{"A": 1, "B": {"C":2}}', '{"A": 2, "B": {"C":2}}') AS is_overlap;
+------------+
| is_overlap |
+------------+
|          1 |
+------------+
1 row in set (0.000 sec)
 
11.2.0-dbg>



 Comments   
Comment by Sergei Golubchik [ 2023-06-30 ]

is it JSON_OVERLAPS specific issue? all other JSON* functions work correctly?

Comment by Ramesh Sivaraman [ 2023-07-03 ]

The JSON functions used for true/false comparisons look good except for the new JSON_SCHEMA_VALID function.

11.2.0-dbg>SET @schema= '{
    '>   "properties" : {
    '>     "number1":{ "maximum":10 },
    '>     "string1" : { "maxLength": 3} 
    '>   }
    '> }';
Query OK, 0 rows affected (0.000 sec)
 
11.2.0-dbg>
11.2.0-dbg>SELECT JSON_SCHEMA_VALID(@schema, '{ "number1":25, "string1":"ab" }');
+----------------------------------------------------------------+
| JSON_SCHEMA_VALID(@schema, '{ "number1":25, "string1":"ab" }') |
+----------------------------------------------------------------+
|                                                              1 |
+----------------------------------------------------------------+
1 row in set (0.000 sec)
 
11.2.0-dbg>
 
Expected result
11.2.0-dbg>SELECT JSON_SCHEMA_VALID(@schema, '{ "number1":25, "string1":"ab" }');
+----------------------------------------------------------------+
| JSON_SCHEMA_VALID(@schema, '{ "number1":25, "string1":"ab" }') |
+----------------------------------------------------------------+
|                                                              0 |
+----------------------------------------------------------------+
1 row in set (0.000 sec)
 
11.2.0-dbg>

Comment by Ramesh Sivaraman [ 2023-07-03 ]

Initially I checked only true/false comparison JSON functions. But there are issues when reading values using some JSON functions when setting character_set_database to ucs2 , some JSON functions give proper warning messages.

11.2.0-dbg>SET character_set_database=ucs2;
Query OK, 0 rows affected (0.000 sec)
 
11.2.0-dbg>SET CHARACTER SET DEFAULT;
Query OK, 0 rows affected (0.000 sec)
 
11.2.0-dbg>SELECT JSON_OBJECT("id", 1, "name", "Monty");
+---------------------------------------+
| JSON_OBJECT("id", 1, "name", "Monty") |
+---------------------------------------+
| ?id": ??, ???????�Monty?               |
+---------------------------------------+
1 row in set (0.000 sec)
 
11.2.0-dbg>SELECT Json_Array(56, 3.1416, 'My name is "Foo"', NULL);
+--------------------------------------------------+
| Json_Array(56, 3.1416, 'My name is "Foo"', NULL) |
+--------------------------------------------------+
| ?56??�3.1416??�My name is \"Foo\"???????           |
+--------------------------------------------------+
1 row in set (0.000 sec)
 
11.2.0-dbg>

Some JSON functions that return warning messages.

11.2.0-dbg>SELECT JSON_MERGE(@json1,@json2);
+---------------------------+
| JSON_MERGE(@json1,@json2) |
+---------------------------+
| NULL                      |
+---------------------------+
1 row in set, 1 warning (0.000 sec)
 
11.2.0-dbg>show warnings;
+---------+------+--------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                    |
+---------+------+--------------------------------------------------------------------------------------------+
| Warning | 4036 | Character disallowed in JSON in argument 1 to function 'json_merge_preserve' at position 2 |
+---------+------+--------------------------------------------------------------------------------------------+
1 row in set (0.000 sec)
 
11.2.0-dbg>
 
11.2.0-dbg>SET @json = '{ "A": 0, "B": [1, 2]}';
Query OK, 0 rows affected (0.000 sec)
 
11.2.0-dbg>
11.2.0-dbg>SELECT JSON_INSERT(@json, '$.C', '[3, 4]');
+-------------------------------------+
| JSON_INSERT(@json, '$.C', '[3, 4]') |
+-------------------------------------+
| NULL                                |
+-------------------------------------+
1 row in set, 1 warning (0.000 sec)
 
11.2.0-dbg>show warnings;
+---------+------+----------------------------------------------------------------------------------+
| Level   | Code | Message                                                                          |
+---------+------+----------------------------------------------------------------------------------+
| Warning | 4038 | Syntax error in JSON text in argument 1 to function 'json_insert' at position 48 |
+---------+------+----------------------------------------------------------------------------------+
1 row in set (0.000 sec)
 
11.2.0-dbg>

Comment by Ramesh Sivaraman [ 2023-07-04 ]

Some other character sets also have this issue

11.2.0-dbg>SELECT JSON_ARRAY('1. ě 2. š 3. č 4. ř 5. ž 6. ý 7. á 8. í 9. é 10. ů 11. ú') AS json_data;
+-------------------------------------------------------------------------+
| json_data                                                               |
+-------------------------------------------------------------------------+
| ["1. ě 2. š 3. č 4. ř 5. ž 6. ý 7. á 8. í 9. é 10. ů 11. ú"]            |
+-------------------------------------------------------------------------+
1 row in set (0.000 sec)
 
11.2.0-dbg>SET character_set_database=big5;   
Query OK, 0 rows affected (0.000 sec)
 
11.2.0-dbg>SET CHARACTER SET DEFAULT;
Query OK, 0 rows affected (0.000 sec)
 
11.2.0-dbg>SELECT JSON_ARRAY('1. ě 2. š 3. č 4. ř 5. ž 6. ý 7. á 8. í 9. é 10. ů 11. ú') AS json_data;
+-------------------------------------------------------------------------+
| json_data                                                               |
+-------------------------------------------------------------------------+
| ["1. ?? 2. ?? 3. ?? 4. ?? 5. ?? 6. ?? 7. ?? 8. ?? 9. ?? 10. ?? 11. ??"] |
+-------------------------------------------------------------------------+
1 row in set (0.022 sec)
 
11.2.0-dbg>
-------------------------------
11.2.0-dbg>SET character_set_database=cp932;   
Query OK, 0 rows affected (0.000 sec)
 
11.2.0-dbg>SET CHARACTER SET DEFAULT;
Query OK, 0 rows affected (0.000 sec)
 
11.2.0-dbg>SELECT JSON_ARRAY('1. ě 2. š 3. č 4. ř 5. ž 6. ý 7. á 8. í 9. é 10. ů 11. ú') AS json_data;
+-------------------------------------------------------------------------+
| json_data                                                               |
+-------------------------------------------------------------------------+
| ["1. ?? 2. ?? 3. ?? 4. ?? 5. ?? 6. ?? 7. ?? 8. ?? 9. ?? 10. ?? 11. ??"] |
+-------------------------------------------------------------------------+
1 row in set (0.000 sec)
 
11.2.0-dbg>
 
----------------------------
 
11.2.0-dbg>SET character_set_database=euckr;   
Query OK, 0 rows affected (0.000 sec)
 
11.2.0-dbg>SET CHARACTER SET DEFAULT;
Query OK, 0 rows affected (0.000 sec)
 
11.2.0-dbg>SELECT JSON_ARRAY('1. ě 2. š 3. č 4. ř 5. ž 6. ý 7. á 8. í 9. é 10. ů 11. ú') AS json_data;
+-------------------------------------------------------------------------+
| json_data                                                               |
+-------------------------------------------------------------------------+
| ["1. ?? 2. ?� 3. ?? 4. ?� 5. ?� 6. ?� 7. ?� 8. ?� 9. ?? 10. ?? 11. ?�"]        |
+-------------------------------------------------------------------------+
1 row in set (0.000 sec)
 
11.2.0-dbg>
11.2.0-dbg>

Generated at Thu Feb 08 10:24:38 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.