[MDEV-23004] When using GROUP BY with JSON_ARRAYAGG with joint table, the square brackets are not included Created: 2020-06-24  Updated: 2022-11-11  Resolved: 2021-06-28

Status: Closed
Project: MariaDB Server
Component/s: JSON
Affects Version/s: 10.5.4
Fix Version/s: 10.5.12

Type: Bug Priority: Critical
Reporter: Dávid Szöke Assignee: Alexey Botchkov
Resolution: Fixed Votes: 2
Labels: JSON, JSON_ARRAYAGG, MariaDB
Environment:

Ubuntu 18.04LTS Server, MariaDB 10.5.4


Issue Links:
Relates
relates to MDEV-22084 Squared brackets missing from JSON_AR... Closed

 Description   

When using without the GROUP BY, the square brackets are there, but when I add GROUP BY to the query the square brackets disappear.

create table owner(
    id int primary key,
    name varchar(50)
);
 
create table test_table(
    id int primary key,
    type varchar(50),
    size int,
    color varchar(50),
    owner_id int,
    FOREIGN KEY(owner_id) REFERENCES owner(id)
);
 
SELECT name, owner.id, JSON_ARRAYAGG(JSON_OBJECT('id',tt.id,'type',tt.type,'color',tt.color)) as materials from owner LEFT JOIN test_table tt on owner.id = tt.owner_id GROUP BY owner.id;

Expected result:

Name Id Materials
David 1 [{"id": 2, "type": "metal", "color": "blue"},{"id": 1, "type": "paper", "color": "red"}]
Joe 2 [{"id": 3, "type": "wood", "color": "green"}]
John 3 [{"id": 4, "type": "plastic", "color": "yellow"}]

Actual result:

Name Id Materials
David 1 {"id": 2, "type": "metal", "color": "blue"}

,

{"id": 1, "type": "paper", "color": "red"}
Joe 2 {"id": 3, "type": "wood", "color": "green"}
John 3 {"id": 4, "type": "plastic", "color": "yellow"}


 Comments   
Comment by Varun Gupta (Inactive) [ 2020-06-29 ]

Here is a test case with values

CREATE TABLE t1(id int primary key, name varchar(50));
CREATE TABLE t2(id int, owner_id int);
 
INSERT INTO t1 VALUES (1, "name1"), (2, "name2"), (3, "name3");
INSERT INTO t2 VALUES (1, 1), (2, 1), (3, 2), (4, 3);
 
SELECT t1.id, JSON_ARRAYAGG(JSON_OBJECT('id',t2.id)) as materials
from t1 LEFT JOIN t2 on t1.id = t2.owner_id
GROUP BY t1.id;

Comment by Varun Gupta (Inactive) [ 2020-06-29 ]

I tried to make a fix for this but when i run the query I get this output

MariaDB [test]> SELECT t1.id, JSON_ARRAYAGG(JSON_OBJECT('id',t2.id)) as materials
    -> from t1 LEFT JOIN t2 on t1.id = t2.owner_id
    -> GROUP BY t1.id;
+----+-------------------------------+
| id | materials                     |
+----+-------------------------------+
|  1 | ["{\"id\": 1}","{\"id\": 2}"] |
|  2 | ["{\"id\": 3}"]               |
|  3 | ["{\"id\": 4}"]               |
+----+-------------------------------+
3 rows in set (0.003 sec)

The json value here is printed in text form.
Looks like the fix for MDEV-22837 is not complete.

Comment by Dávid Szöke [ 2020-06-29 ]

If I run this query, it is giving me the same result as for you, with brackets. But if I run the original query, it is giving me the result that I got, without the brackets. And also what I noticed is that if I select only the id, it works, but if I select anything else, even if it is alone in the JSON, it loses the brackets.

SELECT owner.id, JSON_ARRAYAGG(JSON_OBJECT('type',tt.type)) as materials from owner LEFT JOIN test_table tt on owner.id = tt.owner_id GROUP BY owner.id;

id materials
1 {"type": "paper"}

,

{"type": "metal"}
2 {"type": "wood"}
3 {"type": "plastic"}

SELECT owner.id, JSON_ARRAYAGG(JSON_OBJECT('id',tt.id)) as materials from owner LEFT JOIN test_table tt on owner.id = tt.owner_id GROUP BY owner.id;

id materials
1 [{"id": 1},{"id": 2}]
2 [{"id": 3}]
3 [{"id": 4}]
Comment by Varun Gupta (Inactive) [ 2020-06-29 ]

Hi davidszoke I think the problem is reproducible with my case too, though i tested with the MYISAM storage engine. The case where the query does not work is when we use the temporary table to store the join and then perform the grouping in accordance to the GROUP BY clause.

The EXPLAIN shows

MariaDB [test]> EXPLAIN
    -> SELECT t1.id, JSON_ARRAYAGG(JSON_OBJECT('id',t2.id)) as materials
    -> from t1 LEFT JOIN t2 on t1.id = t2.owner_id
    -> GROUP BY t1.id;
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------------------------------------------+
| id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                                           |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------------------------------------------+
|    1 | SIMPLE      | t1    | index | NULL          | PRIMARY | 4       | NULL | 3    | Using index; Using temporary; Using filesort    |
|    1 | SIMPLE      | t2    | ALL   | NULL          | NULL    | NULL    | NULL | 4    | Using where; Using join buffer (flat, BNL join) |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------------------------------------------+
2 rows in set (0.004 sec)

and the result is like

MariaDB [test]> SELECT t1.id, JSON_ARRAYAGG(JSON_OBJECT('id',t2.id)) as materials from t1 LEFT JOIN t2 on t1.id = t2.owner_id GROUP BY t1.id; 
+----+---------------------+
| id | materials           |
+----+---------------------+
|  1 | {"id": 1},{"id": 2} |
|  2 | {"id": 3}           |
|  3 | {"id": 4}           |
+----+---------------------+
3 rows in set (0.005 sec)

Comment by Elena Stepanova [ 2020-07-01 ]

Here is a test case without JOIN which loses brackets in a similar way:

CREATE TABLE t1 (b VARCHAR(10));
INSERT INTO t1 VALUES ("foo"),("bar");
SELECT SQL_BUFFER_RESULT JSON_ARRAYAGG(b) FROM t1;

Actual result, 10.5 baca52655

MariaDB [test]> SELECT SQL_BUFFER_RESULT JSON_ARRAYAGG(b) FROM t1;
+------------------+
| JSON_ARRAYAGG(b) |
+------------------+
| foo,bar          |
+------------------+
1 row in set (0.002 sec)

Same with sql_buffer_result variable instead of the SELECT option.
Without sql_buffer_result the result is as expected:

MariaDB [test]> SELECT JSON_ARRAYAGG(b) FROM t1;
+------------------+
| JSON_ARRAYAGG(b) |
+------------------+
| ["foo","bar"]    |
+------------------+
1 row in set (0.001 sec)

Comment by Neil Hughes [ 2021-03-14 ]

Relieved to find this thread. I've been having problems for days with inconsistent results from JSON_ARRAYAGG on two servers, one running 10.5.7 and one running 10.5.8. I assumed I was doing something wrong but it appears this is a known issue since last year.

As others have said above, sometimes the square brackets are missing entirely - this seems to be related to the GROUP BY clause as mentioned above..

And sometimes selecting a varchar column it includes the square brackets but misses out the quotes, so I end up with invalid JSON arrays such as [these, should, be, in, quotes].
(I don't have the MariaDB knowhow to determine what causes this.)

I've put together a hack in my code to massage the output to the correct format but it feels fragile so I thought I'd ask if there is a fix in the pipeline?

Comment by Alexey Botchkov [ 2021-06-28 ]

https://github.com/MariaDB/server/commit/98c7916f0f29fb20cf2b8c8fa498f6033a07a89c

Generated at Thu Feb 08 09:19:07 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.