[MDEV-16443] JSON can not be implemented as a alias for longtext Created: 2018-06-08 Updated: 2021-08-14 Resolved: 2021-08-14 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Data types, JSON |
| Fix Version/s: | N/A |
| Type: | Task | Priority: | Major |
| Reporter: | Dave Pullin | Assignee: | Alexey Botchkov |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | json | ||
| Issue Links: |
|
||||||||||||
| Epic Link: | New data types | ||||||||||||
| Description |
|
JSON support. 1. json_object('name', json_column ) These two SHOULD give different results, because the longtext_column is a string so it gets quoted and quotes inside it get escaped, whereas the json_column is SEMANTICALLY a not necessarily a string and should not get quoted when it is an json object or json array. Mariadb's support for JSON uses 'json' as a alias for longtext and therefore produces the same result for 1 & 2. I know you are going to close this as a duplicate! BUT every discussion I have started or seen with MariaDB folk gets the answer "Our tests show that there is no performance difference between storing json internally as binary or a string", which has NOTHING to do with the problem. [I'd prefer you remain file-compatible with MySQL, but the internal representation is INTERNAL]. The problem here is that the JSON SEMANTICS. Isn't it possible for you to implement the semantics correctly? (independently of your decision on internal representation). It would require functions like json_object to be sensitive to whether a column or expression is typed "json". (I have a ton of procedures and functions that use JSON on MySQL5.7 and a ton of servers that I'd like to get to MariaDB), but I cant because if this problem. You have so much more good stuff I want to be able to use!) |
| Comments |
| Comment by Sergei Golubchik [ 2018-06-08 ] | |||||||||||||||||||||
|
You are right, "no performance difference" has nothing to do with this issue. And we never said that we didn't create JSON type because there was "no performance difference". The internal representation is plain text because there was no performance difference, but I agree, internal representation is internal. The reason why we don't have a true JSON data type is because SQL Standard explicitly forbids it. In the Part 2 "Foundation" of SQL:2016, in the Section 4.46 "JSON data handling in SQL" it says unambiguously:
Note that Oracle, DB2, and SQL Server follow the standard here. For example, here's the manual for the JSON_QUERY function:
DB2:
SQL Server manual doesn't say it must be a text literal, but, like others, it says what happens when the expression is not a valid JSON, which implies it's a text literal, and not a specialized type So, we've made a conscious decision to follow the SQL Standard (like Oracle, DB2, SQL Server, and others). But at the moment not everything from the SQL Standard is implemented yet. The Standard specifies that one can optionally specify FORMAT JSON after the value, to specify that it is JSON, not an arbitrary text. In your example it would've been json_object('name',json_column FORMAT JSON) vs json_object('name', longtext_column). We did not implement it yet, but exactly the same functionality can be achieved with our JSON formatting functions. Compare:
| |||||||||||||||||||||
| Comment by Dave Pullin [ 2018-06-09 ] | |||||||||||||||||||||
|
Thanks for the explanation. [I might argue that the standard is dumb and/or the mariadb needs to upwards compatible with mysql, but that's not getting me anywhere!] This means that everywhere a column/expression is, or might be, json it must be written by json_compact( expr ); [which is tedious at best, and wrong for some expr types, eg strings that are not json.] It also means the return type of json_compact is unique ... it is not a string. It is a string that json_object knows not to quote. (it's a JSON datatype! A string with special treatments). So how do I create a function that returns the datatype that json_compact returns? >select json_object("z",json_compact("string")) as result\G > select json_object("z","string") as result\G > select json_object("z",json_compact("{}")) as result\G > select json_object("z","{}") as result\G > create or replace function json_test() returns json return json_object("thisa","atest"); >select json_object("x",json_testc()) as result\G "} > select json_object("x",json_test()) as result\G "} | |||||||||||||||||||||
| Comment by Sergei Golubchik [ 2018-06-10 ] | |||||||||||||||||||||
|
I personally think that was not the smartest part of the standard. Or the standard committee had some good reasons that I don't know about, and I'd like to know them... Anyway, I cannot run any tests on the standard, obviously, so I tried that in Oracle instead (Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production Version 18.3.0.0.0). It seems that in Oracle JSON_OBJECT also returns something special, which is not a string:
See how the string in the first statement is used as a string inside JSON object (with incorrect quoting even!), but a json object in the second statement is used as a json object. And using string concatenation on it makes it a normal string again. And a stored function doesn't return anything special in Oracle:
Of course, saying json_object('x' is foo format json) works as expected, but so far I didn't find how to return it from a stored function. Still looking... | |||||||||||||||||||||
| Comment by Dave Pullin [ 2018-06-10 ] | |||||||||||||||||||||
|
It would seem to a be a legitimate extension to the standard to implement "format json" as a character column/expression qualifier create table .. ( my_json longtext format json ) create function x() returns longtext format json ... and then to retroactively redefine 'json' as a [non standard] alias for "longtext format json" | |||||||||||||||||||||
| Comment by Sergei Golubchik [ 2018-07-09 ] | |||||||||||||||||||||
|
Yes, it's an interesting idea. It practically creates a new type, but it kind of doesn't create a new type... Nice. | |||||||||||||||||||||
| Comment by Sergei Golubchik [ 2021-08-14 ] | |||||||||||||||||||||
|
I think this was meanwhile implemented:
|