[MDEV-27346] JSON_TABLE() does not handle quoting of strings correctly Created: 2021-12-22  Updated: 2022-01-04

Status: Open
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.6.4
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Ryan Leadenham Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: JSON, JSON_TABLE
Environment:

windows and linux


Issue Links:
Relates
relates to MDEV-27412 JSON_TABLE doesn't properly unquote s... Open

 Description   

There is no way to quote a string with quotes in it correctly.

Suppose you're working with the string:

The book's title is "Lord of the Rings"

SELECT 
*
FROM 
JSON_TABLE(
	'["The book\'s title is \"Lord of the Rings\""]',
	'$[*]' 
	COLUMNS (
		`text` TEXT PATH '$[0]'
	)
) AS t

Results in a syntax error

SELECT 
*
FROM 
JSON_TABLE(
	'["The book\'s title is \\\"Lord of the Rings\\\""]',
	'$[*]' 
	COLUMNS (
		`text` TEXT PATH '$[0]'
	)
) AS t

Returns
The book's title is \"Lord of the Rings\"

SELECT 
*
FROM 
JSON_TABLE(
	'[\"The book\'s title is \\\"Lord of the Rings\\\"\"]',
	'$[*]' 
	COLUMNS (
		`text` TEXT PATH '$[0]'
	)
) AS t

Also returns
The book's title is \"Lord of the Rings\"

I believe the previous one is most syntactically correct

And even worse, this happens

SELECT 
*
FROM 
JSON_TABLE(
	'[{\"text\":\"The book\'s title is \\\"Lord of the Rings\\\"\"}]',
	'$[*]' 
	COLUMNS (
		`text` VARCHAR(39) PATH '$.text'
	)
) AS t

Returns the `text` field as:
The book's title is \"Lord of the Rings

(Yes, the trailing quote is missing)


It seems like this is somewhat related to other JSON related bugs such as MDEV-13701 and MDEV-27018. Ultimately it seems like there is weirdness with how "strings" and "json" data are handled internally. For example a traditional string vs a json string. From a developer's point of view, if I have a string, I don't care whether it's JSON or not (other than escaping). When I get it back in my output, I need it in it's raw format with no escaping chars.... doesn't matter if it's a text field in a normal select or a property of a JSON object (I still expect to have to decode the JSON object normally, but once that's done on the top level object all sub properties should be in their raw states). It seems like this would be the case too internally. Sometimes a JSON strings need switched out for raw strings and vice versa. It shouldn't be up to the developer because one of the big selling points of JSON is that it's data of unknown shape. So I don't always know if something is a string or not ahead of time. But I digress. In the examples above, as far as I can tell, there is no way to formulate a query to put the data in a usable form.



 Comments   
Comment by Louis Brauer [ 2022-01-03 ]

I have the same trouble with the `JSON_TABLE` function in *MariaDB 10.6.5*. This is my query:

SET @data = '[

{"Data": "<root language=\\"de\\"></root>"}

]';

SELECT
data
FROM JSON_TABLE (@data, '$[*]' COLUMNS (data text PATH '$.Data')) AS t;

What I get back is the XML string, but not properly unescaped:

<root language=\"de\"></root>

When I use then `JSON_VALUE` instead:

SET @data = '[

{"Data": "<root language=\\"de\\"></root>"}

]';

SELECT
JSON_VALUE(@DATA, '$[0].Data');

then I get back the correctly unescaped string:

<root language="de"></root>

When I do the same on a MySQL *8.0.26* server, it works as expected:

SET @data = '[

{"Data": "<root language=\\"de\\"></root>"}

]';

SELECT
data
FROM JSON_TABLE (@data, '$[*]' COLUMNS (data text PATH '$.Data')) AS t;

– correct:
– <root language="de"></root>

SELECT
JSON_VALUE(@data, '$[0].Data');

– correct:
– <root language="de"></root>

It looks to me that this difference in behaviour is a bug in MariaDB's side.

Generated at Thu Feb 08 09:52:13 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.