[MDEV-13701] Nesting JSON Created: 2017-09-01  Updated: 2021-11-10  Resolved: 2021-11-10

Status: Closed
Project: MariaDB Server
Component/s: JSON
Affects Version/s: 10.2, 10.3, 10.4, 10.5, 10.6
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Andrej Thomsen Assignee: Sergei Golubchik
Resolution: Fixed Votes: 11
Labels: None

Issue Links:
Relates
relates to MDEV-27018 IF and COALESCE lose "json" property Closed
relates to MDEV-26506 Over-quoted JSON when combining JSON_... Closed
Sprint: 10.2.12, 10.2.14

 Description   

JSON data is not recognized as such, so statements like the following produce broken JSON strings. In MySQL this doesn't happen, because JSON is recognized as a datatype. If adding JSON as a datatype is not an option, then maybe add an optional parameter to "JSON_OBJECT()" that marks the passed values as JSON so they are not escaped again. MariaDB 10.2 advertises JSON support, but having to generate these strings manually via "CONCAT()" is not what people expect, and this alone is the reason I can't use MariaDB in nearly all of my projects.

Note: If testing in MySQL 5.7, cast the concatenated JSON value of "Permission" to JSON manually, so it won't be escaped again.

SELECT
                JSON_OBJECT(
                    'Staff', JSON_OBJECT(
                        'Staff_ID', Staff_ID,
                        'StaffNumber', Staff_Number,
                        'StaffName', Staff_Name
                    ),
                    'Company', JSON_OBJECT(
                        'Company_ID', Company_ID,
                        'Description', Company_Name
                    ),
                    'Permissions', CONCAT(
                        '[',
                        GROUP_CONCAT(
                            JSON_OBJECT(Permission_ID, Permission_Flag)
                        ),
                        ']'
                    )
                ) AS StaffObj
FROM            Staff
NATURAL JOIN    Company
NATURAL JOIN    Staff_Permissions
GROUP BY        Staff_ID;



 Comments   
Comment by Elena Stepanova [ 2017-09-01 ]

Thanks for the report.
GROUP_CONCAT seems to be important here.
Simplified test case:

SET sql_mode='';
 
CREATE TABLE Staff (
  Staff_ID INT,
  Staff_Number INT,
  Staff_Name VARCHAR(64)
);
 
INSERT INTO Staff VALUES 
(1, 10, 'foo');
 
SELECT
  JSON_OBJECT(
    'Staff', JSON_OBJECT(
      'Staff_ID', Staff_ID,
      'StaffNumber', Staff_Number,
      'StaffName', Staff_Name
    ),
    'Staff_List', CONCAT('[',GROUP_CONCAT(Staff_Name),']')
  ) AS StaffObj
FROM Staff;
 
# Cleanup
DROP TABLE Staff;

10.2, 10.3

+----------------------------------------------------------------------------------------------------+
| StaffObj                                                                                           |
+----------------------------------------------------------------------------------------------------+
| {"Staff": "{\"Staff_ID\": 1, \"StaffNumber\": 10, \"StaffName\": \"foo\"}", "Staff_List": "[foo]"} |
+----------------------------------------------------------------------------------------------------+

MySQL 5.7

+------------------------------------------------------------------------------------------+
| StaffObj                                                                                 |
+------------------------------------------------------------------------------------------+
| {"Staff": {"Staff_ID": 1, "StaffName": "foo", "StaffNumber": 10}, "Staff_List": "[foo]"} |
+------------------------------------------------------------------------------------------+

Comment by T.Peter [ 2018-01-03 ]

It is not limited to GROUP_CONCAT. This always happens when using a function or a user variable within JSON_OBJECT().

A very simple example with user-defined variables (mariadb-10.2.11):

SET @sub = JSON_OBJECT("key", "value");
SELECT @sub;
SET @main = JSON_OBJECT("main", @sub);
SELECT @main;
 
# same with arrays
SET @sub = JSON_ARRAY("one", 2);
SELECT @sub;
SET @main = JSON_OBJECT("main", @sub);
SELECT @main;

JSON_OBJECT() does not seem to expect another JSON value as its argument (except from a nested JSON_... function call). It therefore wraps the value in double quotes, which in turn forces every double quote inside the JSON text to be escaped.

JSON_MERGE() had the same problem in the past. See MDEV-11447
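For what it's worth, a sketch of a possible workaround for the variable case above (an assumption based on the JSON_EXTRACT workarounds described later in this ticket, not verified in this comment): wrapping the variable in JSON_EXTRACT(..., '$') re-parses the string as JSON, so JSON_OBJECT() should embed it without re-quoting.

```sql
-- Hedged sketch: JSON_EXTRACT(value, '$') re-parses the string as JSON,
-- so JSON_OBJECT() should embed it unquoted instead of escaping it.
SET @sub = JSON_OBJECT("key", "value");
SET @main = JSON_OBJECT("main", JSON_EXTRACT(@sub, '$'));
SELECT @main;
-- should yield: {"main": {"key": "value"}}
```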

Comment by Sergei Golubchik [ 2018-06-10 ]

Unfortunately, this seems to be the correct behavior. That is what the standard says, as far as I understand, and it is also what Oracle does. Here's an example in Oracle with variables in a stored function:

create or replace function foo return varchar as
  a varchar2(100);
begin
  select json_object('foo' is 'bar') into a from dual;
  select json_object('x' is a) into a from dual;
  return a;
end;

First it stores a JSON object in the variable a, then it uses that variable in another JSON_OBJECT call. This returns:

{"x" : "{"foo":"bar"}"}

But nesting JSON_OBJECT calls directly, as in json_object('x' is json_object('foo' is 'bar')), nests the JSON properly:

{"x" : {"foo" : "bar"}}

Using string concatenation breaks it, though:

select json_object('x' is '[' || json_object('foo' is 'bar') || ']') from dual

gives, again

{"x" : "[{"foo":"bar"}]"}

Comment by William Stam [ 2019-03-08 ]

I've been fighting this for the past two days. I just want to add that the JSON_COMPACT(...) trick only works at the second nesting level; at the third level it stops working again.

SELECT 't' AS t, JSON_OBJECT(
  'a', "a",
  'b', "b",
  'c', JSON_COMPACT((
    SELECT JSON_OBJECT('c.1', 'c1', 'c.2', UPPER('c2'), 'c.3', UPPER((
      SELECT JSON_OBJECT('c.3.1', 'c31', 'c.3.2', 'c32', 'c.3.3', (
        SELECT JSON_OBJECT('c.3.1', 'c31', 'c.3.2', 'c32')
      ))
    )))
  ))
) AS col

I'm aware that leaving out the JSON_COMPACT at 'c', JSON_COMPACT(( solves it for this example, but my use case involves subqueries etc., and I explicitly need to force the type to JSON.

 {"a": "a", "b": "b", "c": {"c.1": "c1", "c.2": "C2", "c.3": "{\"C.3.1\": \"C31\", \"C.3.2\": \"C32\", \"C.3.3\": {\"C.3.1\": \"C31\", \"C.3.2\": \"C32\"}}"}} 

(My actual query looks more like this:)
SELECT d.`identifier`,
  CONVERT(
    JSON_OBJECT(
      'records', JSON_COMPACT((
        SELECT CONCAT(
          '[',
          GROUP_CONCAT(
            JSON_OBJECT(
              'id', d.id,
              'batch_detail_id', d.batch_detail_id,
              'created_date', d.created_date,
              'identifier', d.identifier,
              'batch', JSON_COMPACT((
                SELECT JSON_OBJECT('name', batch_detail.name)
                FROM batch_detail
                WHERE batch_detail.id = em.batch_detail_id
              ))
            )
            SEPARATOR ','
          ),
          ']'
        )
        FROM batch_item_detail em
        WHERE em.identifier = d.identifier
        LIMIT 0,1
      )))
    USING utf8
  ) AS f
FROM `batch_item_detail` d
WHERE 1
GROUP BY d.identifier
LIMIT 0,5;
-- Output; notice the "batch" key under "records":
{
	"records": [{
		"id": 1,
		"batch_detail_id": 1,
		"created_date": "2018-08-30 22:51:03",
		"identifier": "0000000070",
		"batch": "{\"name\":\"test statement batch\"}"
	}, {
		"id": 1,
		"batch_detail_id": 1,
		"created_date": "2018-08-30 22:51:03",
		"identifier": "0000000070",
		"batch": "{\"name\":\"test template 1152 batch\"}"
	}, {.............

Comment by Simone Roberto Nunzi [ 2020-03-09 ]

This error was driving me crazy, too.
It cannot be right that this is considered correct behavior. It is understandable if you think about how the JSON type has been implemented, but it is not acceptable if you think about how JSON should work and how it is handled in other databases. PostgreSQL, for example, has been the best at JSON handling since its October update. If even MySQL manages to adopt the correct behavior, something is wrong...

I was able to find a convenient workaround for this problem using JSON_QUERY. It is not the ideal solution and costs a little performance, but it ensures that JSON values returned by functions or nested SELECTs are evaluated as JSON instead of strings.

Encapsulate SELECTs and functions that return JSON as a string with JSON_QUERY(<function/select json>, '$'). This parses the string and returns the JSON value it contains, ensuring the expected output.

CREATE FUNCTION foo ()
RETURNS JSON
BEGIN
    RETURN json_object('a','b');
END;

select json_array(foo());

Gives

["{\"a\": \"b\"}"]

select json_array(JSON_QUERY(foo(),'$'));

gives

[{"a": "b"}]

It works with recursive stored procedures with multiple nesting levels, where before I was getting an unimaginable amount of escape characters (the return buffer saturated by \\\\\\\\\\\\\\\\ sequences escaping nested-upon-nested JSON strings...).

Hope it helps!

Comment by Lukas Eder [ 2020-04-15 ]

I ran into a similar problem using this query:

select json_object(
  'name', json_object(
    'firstName', `test`.`t_author`.`FIRST_NAME`,
    'lastName', `test`.`t_author`.`LAST_NAME`
  ),
  'books', concat(
    '[',
    group_concat(json_object('title', `test`.`t_book`.`TITLE`) separator ','),
    ']'
  )
)
from `test`.`t_author`
  join `test`.`t_book`
    on `test`.`t_author`.`ID` = `test`.`t_book`.`AUTHOR_ID`
group by
  `test`.`t_author`.`ID`,
  `test`.`t_author`.`FIRST_NAME`,
  `test`.`t_author`.`LAST_NAME`
order by `test`.`t_author`.`ID`

The results being:

{"name": "{\"firstName\": \"George\", \"lastName\": \"Orwell\"}", "books": "[{\"title\": \"1984\"},{\"title\": \"Animal Farm\"}]"}  
{"name": "{\"firstName\": \"Paulo\", \"lastName\": \"Coelho\"}", "books": "[{\"title\": \"O Alquimista\"},{\"title\": \"Brida\"}]"} 

A workaround that seems to work is to make heavy use of json_merge(), e.g. like this:

select 
  json_merge('{}',
    json_object(
      'name', json_object(
        'firstName', `test`.`t_author`.`FIRST_NAME`,
        'lastName', `test`.`t_author`.`LAST_NAME`
      )
    ),
    json_object(
      'books', json_merge('[]', concat(
        '[',
        group_concat(json_object('title', `test`.`t_book`.`TITLE`) separator ','),
        ']'
      ))
    )
  )
from `test`.`t_author`
  join `test`.`t_book`
    on `test`.`t_author`.`ID` = `test`.`t_book`.`AUTHOR_ID`
group by
  `test`.`t_author`.`ID`,
  `test`.`t_author`.`FIRST_NAME`,
  `test`.`t_author`.`LAST_NAME`
order by `test`.`t_author`.`ID`;

This now produces the expected JSON:

{"name": {"firstName": "George", "lastName": "Orwell"}, "books": [{"title": "1984"}, {"title": "Animal Farm"}]}  
{"name": {"firstName": "Paulo", "lastName": "Coelho"}, "books": [{"title": "O Alquimista"}, {"title": "Brida"}]} 

We'll implement this workaround in jOOQ for the time being. Of course, it would be cool if this weren't necessary.

Comment by Alan Schmidt [ 2021-05-05 ]

Here's another problem case, and the workaround I found for it:

select JSON_OBJECT(
	'questionid',question.questionid,
	'questiontypename',questiontype.questiontypename,
	'constraints',question.constraints,
	'txt',JSON_OBJECTAGG(
		question_txt.lang,JSON_OBJECT(
			'title',question_txt.title,
			'helpertext',question_txt.helpertext,
			'reporttag',question_txt.reporttag
		)
	)
) as payload
from question
left join questiontype
	on question.questiontypeid=questiontype.questiontypeid
left join question_txt
	on question.questionid=question_txt.questionid
where question.questionid=10

In this case, JSON_OBJECTAGG itself works fine, but nothing can be done to unescape the JSON column "constraints".

You can use JSON_EXTRACT, JSON_QUERY, any of the JSON_MERGEs – it doesn't matter.

The workaround that does work is to use a CTE to isolate the aggregate, and then apply one of the other hacks to the result, like this:

WITH translation AS (
	SELECT JSON_OBJECTAGG(
		question_txt.lang,JSON_OBJECT(
			'title',question_txt.title,
			'helpertext',question_txt.helpertext,
			'reporttag',question_txt.reporttag
		)
	) as txt
	FROM question_txt
	WHERE question_txt.questionid=10
)
SELECT JSON_OBJECT(
	'questionid',question.questionid,
	'questiontypename',questiontype.questiontypename,
	'constraints',question.constraints,
	'txt',JSON_EXTRACT(translation.txt,'$')
) as payload
from question
cross join translation
left join questiontype
	on question.questiontypeid=questiontype.questiontypeid
where question.questionid=10

The hack is necessary on the translation text, but not on the "constraints" element itself.

Comment by L. Schwarz [ 2021-08-19 ]

I guess the following issue is the same problem (observed with MariaDB 10.5.10):

delimiter ;;
create or replace procedure json_bug2() begin
  declare a, b json;
  set a = json_object(), b = json_object("kk", "vv");
  set a = json_set(a, "$.k", b);
  select a, json_type(a), b, json_type(b);
end;;
delimiter ;
call json_bug2();

Actual result:

+---------------------------+--------------+--------------+--------------+
| a                         | json_type(a) | b            | json_type(b) |
+---------------------------+--------------+--------------+--------------+
| {"k": "{\"kk\": \"vv\"}"} | OBJECT       | {"kk": "vv"} | OBJECT       |
+---------------------------+--------------+--------------+--------------+

Expected result:

+---------------------------+--------------+--------------+--------------+
| a                         | json_type(a) | b            | json_type(b) |
+---------------------------+--------------+--------------+--------------+
| {"k": {"kk": "vv"}}       | OBJECT       | {"kk": "vv"} | OBJECT       |
+---------------------------+--------------+--------------+--------------+

It happens with json_insert() and json_replace() as well and it also happens if the value of b is a json_array.

To be very clear: the bug is that the JSON value to set/insert/replace is always stringified rather than being treated as the JSON value it is.

Until this gets fixed, is there a workaround? I mean, apart from switching to MySQL, where this works as expected.

Comment by L. Schwarz [ 2021-08-19 ]

I found a workaround:

delimiter ;;
create procedure json_bug_workaround() begin
  declare a, b json;
  set a = json_object("k", "foo"), b = json_object('kk', 'vv');
  set a = json_replace(a, "$.k", json_extract(b,'$'));
  select a, json_type(a), b, json_type(b);
end;;
delimiter ;
call json_bug_workaround();

Results in what I expect:

+---------------------+--------------+--------------+--------------+
| a                   | json_type(a) | b            | json_type(b) |
+---------------------+--------------+--------------+--------------+
| {"k": {"kk": "vv"}} | OBJECT       | {"kk": "vv"} | OBJECT       |
+---------------------+--------------+--------------+--------------+

This means that json_extract(value, '$') effectively casts value back to the JSON value it represents, provided it is valid JSON - and when value is declared as JSON, that is always the case.

If value is not valid JSON, the result of this cast is NULL in MariaDB, while MySQL throws an error - so be careful if you need to write SQL that works on both!
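A minimal illustration of that difference, plus a portable guard using JSON_VALID() (a sketch; the invalid literal is deliberate):

```sql
-- MariaDB: returns NULL for invalid input; MySQL raises an error instead.
SELECT JSON_EXTRACT('not valid json', '$');

-- Portable guard that behaves the same on both servers:
SET @v = '{"kk": "vv"}';
SELECT CASE WHEN JSON_VALID(@v) THEN JSON_EXTRACT(@v, '$') ELSE NULL END;
```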

Comment by L. Schwarz [ 2021-08-26 ]

mariadb 10.5.10

CREATE FUNCTION t18(n int) RETURNS longtext return json_array(n, "workaround:", 0+n);
SELECT t18(7);
+-------------------------+
| t18(7)                  |
+-------------------------+
| ["7", "workaround:", 7] |
+-------------------------+

The JSON string "7" at index 0 is unexpected. It should be the integer 7, because the function parameter n is declared as an integer. As a workaround, a numeric expression (such as 0+n) can be used when passing n as a json_array() argument.

Comment by Ryan Leadenham [ 2021-08-27 ]

I'm guessing this bug affects JSON_TABLE() too. You can't put a string containing double quotes in and get it back out the same way you put it in.

Suppose you're working with the string

The book's title is "Lord of the Rings"

SELECT 
*
FROM 
JSON_TABLE(
	'["The book\'s title is \"Lord of the Rings\""]',
	'$[*]' 
	COLUMNS (
		`text` TEXT PATH '$[0]'
	)
) AS t

Results in a syntax error

SELECT 
*
FROM 
JSON_TABLE(
	'["The book\'s title is \\\"Lord of the Rings\\\""]',
	'$[*]' 
	COLUMNS (
		`text` TEXT PATH '$[0]'
	)
) AS t

Returns
The book's title is \"Lord of the Rings\"

SELECT 
*
FROM 
JSON_TABLE(
	'[\"The book\'s title is \\\"Lord of the Rings\\\"\"]',
	'$[*]' 
	COLUMNS (
		`text` TEXT PATH '$[0]'
	)
) AS t

Also returns
The book's title is \"Lord of the Rings\"

I believe that last form is the most syntactically correct.

Comment by Ryan Leadenham [ 2021-09-21 ]

And even worse, this happens

SELECT 
*
FROM 
JSON_TABLE(
	'[{\"text\":\"The book\'s title is \\\"Lord of the Rings\\\"\"}]',
	'$[*]' 
	COLUMNS (
		`text` VARCHAR(39) PATH '$.text'
	)
) AS t

Returns the `text` field as:
The book's title is \"Lord of the Rings

(Yes, the trailing quote is missing)

Comment by Sergei Golubchik [ 2021-11-10 ]

The original bug report, in the issue Description, is not a bug. CONCAT() and GROUP_CONCAT() return a string, and when a string is put into a JSON document it has to be properly quoted. One should use JSON_ARRAYAGG() or JSON_OBJECTAGG() to produce JSON.

Some of the cases in the comments might be genuine bugs; they should be reported separately. For example, see MDEV-27018 for one such separately reported case.
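Applied to the simplified test case from the first comment, that suggestion would look roughly like this (a sketch; JSON_ARRAYAGG() is available from MariaDB 10.5, so older versions still need one of the workarounds above):

```sql
SELECT JSON_OBJECT(
  'Staff', JSON_OBJECT(
    'Staff_ID', Staff_ID,
    'StaffNumber', Staff_Number,
    'StaffName', Staff_Name
  ),
  -- JSON_ARRAYAGG() returns JSON, not a plain string, so the array
  -- is embedded without being re-escaped.
  'Staff_List', JSON_ARRAYAGG(Staff_Name)
) AS StaffObj
FROM Staff;
```

Because the aggregate itself carries the JSON property, no CONCAT('[', ..., ']') wrapper and no unquoting hack should be needed.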
