[MDEV-24784] JSON_ARRAYAGG charset issue Created: 2021-02-04  Updated: 2023-11-19  Resolved: 2023-11-19

Status: Closed
Project: MariaDB Server
Component/s: Character Sets, JSON
Affects Version/s: 10.5.8, 10.5
Fix Version/s: 10.5.24, 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3

Type: Bug Priority: Major
Reporter: Thomas K. Assignee: Sergei Golubchik
Resolution: Fixed Votes: 3
Labels: None
Environment:

Ubuntu 18.04 LTS


Issue Links:
Relates
relates to MDEV-16620 Add support for JSON_ARRAYAGG and JSO... Closed

 Description   

JSON_ARRAYAGG() doesn't evaluate correct charset.

reproduce:

MariaDB [test]> CREATE TABLE t1 (a varchar(5));
Query OK, 0 rows affected (0.035 sec)
 
MariaDB [test]> INSERT INTO t1 VALUES ('a'),('ä');
Query OK, 2 rows affected (0.014 sec)
 
MariaDB [test]> select * from t1;
+------+
| a    |
+------+
| a    |
| ä    |
+------+
2 rows in set (0.000 sec)
 
MariaDB [test]> SELECT JSON_ARRAYAGG(a) FROM t1;
+------------------+
| JSON_ARRAYAGG(a) |
+------------------+
| ["a","▒"]         |
+------------------+
1 row in set (0.001 sec)
 
MariaDB [test]> SELECT JSON_ARRAY(a) FROM t1;
+---------------+
| JSON_ARRAY(a) |
+---------------+
| ["a"]         |
| ["ä"]         |
+---------------+
2 rows in set (0.000 sec)
 
MariaDB [test]> select @@version;
+--------------------------------------+
| @@version                            |
+--------------------------------------+
| 10.5.8-MariaDB-1:10.5.8+maria~bionic |
+--------------------------------------+
 
MariaDB [(none)]> select @@character_set_server,@@character_set_client,@@character_set_results,@@collation_connection;
+------------------------+------------------------+-------------------------+------------------------+
| @@character_set_server | @@character_set_client | @@character_set_results | @@collation_connection |
+------------------------+------------------------+-------------------------+------------------------+
| utf8mb4                | utf8                   | utf8                    | utf8_general_ci        |
+------------------------+------------------------+-------------------------+------------------------+
1 row in set (0.000 sec)



 Comments   
Comment by Alice Sherepa [ 2021-02-04 ]

MariaDB [test]> set names latin1;
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [test]> SELECT JSON_ARRAYAGG('ä');
+---------------------+
| JSON_ARRAYAGG('ä')  |
+---------------------+
| ["ä"]               |
+---------------------+
1 row in set (0.000 sec)
 
MariaDB [test]> set names utf8;
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [test]> SELECT JSON_ARRAYAGG('ä');
+---------------------+
| JSON_ARRAYAGG('ä')  |
+---------------------+
| ["�"]                |
+---------------------+
1 row in set (0.000 sec)
 
MariaDB [test]> SELECT 'ä';
+----+
| ä  |
+----+
| ä  |
+----+
1 row in set (0.000 sec)

Comment by Pavel Goran [ 2023-10-24 ]

This bug makes JSON_ARRAYAGG pretty much unusable for working with string data. (It seems to work fine with data from JSON columns, though.)

Comment by Pavel Goran [ 2023-10-24 ]

JSON_OBJECTAGG is apparently not affected by this bug. Which allows to use it as a partial work-around (with DISTINCT, ORDER BY and other JSON_OBJECTAGG's aggregation clauses unavailable):

MariaDB [...]> SELECT JSON_EXTRACT(JSON_OBJECTAGG('1', a), '$.*') FROM t1;
+---------------------------------------------+
| JSON_EXTRACT(JSON_OBJECTAGG('1', a), '$.*') |
+---------------------------------------------+
| ["a", "ä"]                                  |
+---------------------------------------------+
1 row in set (0.000 sec)
 
MariaDB [...]> select @@version;
+---------------------+
| @@version           |
+---------------------+
| 10.6.15-MariaDB-log |
+---------------------+
1 row in set (0.000 sec)

Generated at Thu Feb 08 09:32:38 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.