Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-16443

JSON can not be implemented as a alias for longtext

Details

    Description

      JSON support.

      1. json_object('name', json_column )
      2. json_object('name',longtext_column)

      These two SHOULD give different results, because the longtext_column is a string so it gets quoted and quotes inside it get escaped, whereas the json_column is SEMANTICALLY a not necessarily a string and should not get quoted when it is an json object or json array.

      Mariadb's support for JSON uses 'json' as a alias for longtext and therefore produces the same result for 1 & 2.

      I know you are going to close this as a duplicate! BUT every discussion I have started or seen with MariaDB folk gets the answer "Our tests show that there is no performance difference between storing json internally as binary or a string", which has NOTHING to do with the problem. [I'd prefer you remain file-compatible with MySQL, but the internal representation is INTERNAL]. The problem here is that the JSON SEMANTICS.

      Isn't it possible for you to implement the semantics correctly? (independently of your decision on internal representation). It would require functions like json_object to be sensitive to whether a column or expression is typed "json".

      (I have a ton of procedures and functions that use JSON on MySQL5.7 and a ton of servers that I'd like to get to MariaDB), but I cant because if this problem. You have so much more good stuff I want to be able to use!)

      Attachments

        Issue Links

          Activity

            You are right, "no performance difference" has nothing to do with this issue. And we never said that we didn't create JSON type because there was "no performance difference". The internal representation is plain text because there was no performance difference, but I agree, internal representation is internal.

            The reason why we don't have a true JSON data type is because SQL Standard explicitly forbids it. In the Part 2 "Foundation" of SQL:2016, in the Section 4.46 "JSON data handling in SQL" it says unambiguously:

            NOTE 116 — There is no SQL <data type> whose value space is SQL/JSON items or SQL/JSON sequences.

            Note that Oracle, DB2, and SQL Server follow the standard here. For example, here's the manual for the JSON_QUERY function:
            Oracle:

            For expr, specify an expression that evaluates to a text literal. If expr is a column, then the column must be of data type VARCHAR2, CLOB, or BLOB. If expr is not a text literal of well-formed JSON data using strict or lax syntax, then the function returns null by default.

            DB2:

            An expression that returns a value that is a built-in string data type. If a character or graphic value is returned, it must contain correctly formatted JSON data.

            SQLServer:

            An expression. Typically the name of a variable or a column that contains JSON text. If JSON_QUERY finds JSON that is not valid in expression before it finds the value identified by path, the function returns an error.

            SQL Server manual doesn't say it must be a text literal, but, like others, it says what happens when the expression is not a valid JSON, which implies it's a text literal, and not a specialized type


            So, we've made a conscious decision to follow the SQL Standard (like Oracle, DB2, SQL Server, and others).

            But at the moment not everything from the SQL Standard is implemented yet. The Standard specifies that one can optionally specify FORMAT JSON after the value, to specify that it is JSON, not an arbitrary text. In your example it would've been json_object('name',json_column FORMAT JSON) vs json_object('name', longtext_column).

            We did not implement it yet, but exactly the same functionality can be achieved with our JSON formatting functions. Compare:

            MariaDB [test]> select json_object('foo', '{"bar": 5}') as_text;
            +-------------------------+
            | as_text                 |
            +-------------------------+
            | {"foo": "{\"bar\": 5}"} |
            +-------------------------+
            1 row in set (0.001 sec)
             
            MariaDB [test]> select json_object('foo',json_compact('{"bar": 5}')) as_json;
            +---------------------+
            | as_json             |
            +---------------------+
            | {"foo": {"bar": 5}} |
            +---------------------+
            1 row in set (0.001 sec)
            

            serg Sergei Golubchik added a comment - You are right, "no performance difference" has nothing to do with this issue. And we never said that we didn't create JSON type because there was "no performance difference". The internal representation is plain text because there was no performance difference, but I agree, internal representation is internal. The reason why we don't have a true JSON data type is because SQL Standard explicitly forbids it. In the Part 2 "Foundation" of SQL:2016, in the Section 4.46 "JSON data handling in SQL" it says unambiguously: NOTE 116 — There is no SQL <data type> whose value space is SQL/JSON items or SQL/JSON sequences. Note that Oracle, DB2, and SQL Server follow the standard here. For example, here's the manual for the JSON_QUERY function: Oracle : For expr , specify an expression that evaluates to a text literal. If expr is a column, then the column must be of data type VARCHAR2 , CLOB , or BLOB . If expr is not a text literal of well-formed JSON data using strict or lax syntax, then the function returns null by default. DB2 : An expression that returns a value that is a built-in string data type. If a character or graphic value is returned, it must contain correctly formatted JSON data. SQLServer : An expression. Typically the name of a variable or a column that contains JSON text. If JSON_QUERY finds JSON that is not valid in expression before it finds the value identified by path, the function returns an error. SQL Server manual doesn't say it must be a text literal, but, like others, it says what happens when the expression is not a valid JSON, which implies it's a text literal, and not a specialized type So, we've made a conscious decision to follow the SQL Standard (like Oracle, DB2, SQL Server, and others). But at the moment not everything from the SQL Standard is implemented yet. The Standard specifies that one can optionally specify FORMAT JSON after the value, to specify that it is JSON, not an arbitrary text. In your example it would've been json_object('name',json_column FORMAT JSON) vs json_object('name', longtext_column) . We did not implement it yet, but exactly the same functionality can be achieved with our JSON formatting functions. Compare: MariaDB [test]> select json_object('foo', '{"bar": 5}') as_text; +-------------------------+ | as_text | +-------------------------+ | {"foo": "{\"bar\": 5}"} | +-------------------------+ 1 row in set (0.001 sec)   MariaDB [test]> select json_object('foo',json_compact('{"bar": 5}')) as_json; +---------------------+ | as_json | +---------------------+ | {"foo": {"bar": 5}} | +---------------------+ 1 row in set (0.001 sec)
            davep Dave Pullin added a comment -

            Thanks for the explanation. [I might argue that the standard is dumb and/or the mariadb needs to upwards compatible with mysql, but that's not getting me anywhere!]

            This means that everywhere a column/expression is, or might be, json it must be written by json_compact( expr ); [which is tedious at best, and wrong for some expr types, eg strings that are not json.]

            It also means the return type of json_compact is unique ... it is not a string. It is a string that json_object knows not to quote. (it's a JSON datatype! A string with special treatments).

            So how do I create a function that returns the datatype that json_compact returns?

            >select json_object("z",json_compact("string")) as result\G
            result:

            {"z": string}

            > select json_object("z","string") as result\G
            result:

            {"z": "string"}

            > select json_object("z",json_compact("{}")) as result\G
            result: {"z": {}}

            > select json_object("z","{}") as result\G
            result: {"z": "{}"}

            > create or replace function json_test() returns json return json_object("thisa","atest");
            > create or replace function json_testc() returns json return json_compact(json_object("thisa","atest"));

            >select json_object("x",json_testc()) as result\G
            result: {"x": "

            {\"thisa\":\"atest\"}

            "}

            > select json_object("x",json_test()) as result\G
            result: {"x": "

            {\"thisa\": \"atest\"}

            "}

            davep Dave Pullin added a comment - Thanks for the explanation. [I might argue that the standard is dumb and/or the mariadb needs to upwards compatible with mysql, but that's not getting me anywhere!] This means that everywhere a column/expression is, or might be, json it must be written by json_compact( expr ); [which is tedious at best, and wrong for some expr types, eg strings that are not json.] It also means the return type of json_compact is unique ... it is not a string. It is a string that json_object knows not to quote. (it's a JSON datatype! A string with special treatments). So how do I create a function that returns the datatype that json_compact returns? >select json_object("z",json_compact("string")) as result\G result: {"z": string} > select json_object("z","string") as result\G result: {"z": "string"} > select json_object("z",json_compact("{}")) as result\G result: {"z": {}} > select json_object("z","{}") as result\G result: {"z": "{}"} > create or replace function json_test() returns json return json_object("thisa","atest"); > create or replace function json_testc() returns json return json_compact(json_object("thisa","atest")); >select json_object("x",json_testc()) as result\G result: {"x": " {\"thisa\":\"atest\"} "} > select json_object("x",json_test()) as result\G result: {"x": " {\"thisa\": \"atest\"} "}
            serg Sergei Golubchik added a comment - - edited

            I personally think that was not the smartest part of the standard. Or the standard committee had some good reasons that I don't know about, and I'd like to know them...

            Anyway, I cannot run any tests on the standard, obviously, so I tried that in Oracle instead (Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production Version 18.3.0.0.0). It seems that in Oracle JSON_OBJECT also returns something special, which is not a string:

            select json_object('x' is '{"foo" : "bar"}') from dual;
            → {"x" : "{"foo" : "bar"}"}
            select json_object('x' is json_object('foo' is 'bar')) from dual;
            → {"x" : {"foo" : "bar"}}
            select json_object('x' is '[' || json_object('foo' is 'bar') || ']') from dual
            → {"x" : "[{"foo":"bar"}]"}
            

            See how the string in the first statement is used as a string inside JSON object (with incorrect quoting even!), but a json object in the second statement is used as a json object. And using string concatenation on it makes it a normal string again.

            And a stored function doesn't return anything special in Oracle:

            create function foo return varchar as
              a varchar2(100);
            begin
              select json_object('foo' is 'bar') into a from dual;
              return a;
            end;
             
            select json_object('x' is foo) from dual;
            → {"x" : "{"foo":"bar"}"}
            

            Of course, saying json_object('x' is foo format json) works as expected, but so far I didn't find how to return it from a stored function.

            Still looking...

            serg Sergei Golubchik added a comment - - edited I personally think that was not the smartest part of the standard. Or the standard committee had some good reasons that I don't know about, and I'd like to know them... Anyway, I cannot run any tests on the standard, obviously, so I tried that in Oracle instead (Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production Version 18.3.0.0.0). It seems that in Oracle JSON_OBJECT also returns something special, which is not a string: select json_object( 'x' is '{"foo" : "bar"}' ) from dual; → { "x" : "{" foo " : " bar "}" } select json_object( 'x' is json_object( 'foo' is 'bar' )) from dual; → { "x" : { "foo" : "bar" }} select json_object( 'x' is '[' || json_object( 'foo' is 'bar' ) || ']' ) from dual → { "x" : "[{" foo ":" bar "}]" } See how the string in the first statement is used as a string inside JSON object (with incorrect quoting even!), but a json object in the second statement is used as a json object. And using string concatenation on it makes it a normal string again. And a stored function doesn't return anything special in Oracle: create function foo return varchar as a varchar2(100); begin select json_object( 'foo' is 'bar' ) into a from dual; return a; end ;   select json_object( 'x' is foo) from dual; → { "x" : "{" foo ":" bar "}" } Of course, saying json_object('x' is foo format json) works as expected, but so far I didn't find how to return it from a stored function. Still looking...
            davep Dave Pullin added a comment -

            It would seem to a be a legitimate extension to the standard to implement "format json" as a character column/expression qualifier

            create table .. ( my_json longtext format json )

            create function x() returns longtext format json ...

            and then to retroactively redefine 'json' as a [non standard] alias for "longtext format json"

            davep Dave Pullin added a comment - It would seem to a be a legitimate extension to the standard to implement "format json" as a character column/expression qualifier create table .. ( my_json longtext format json ) create function x() returns longtext format json ... and then to retroactively redefine 'json' as a [non standard] alias for "longtext format json"

            Yes, it's an interesting idea. It practically creates a new type, but it kind of doesn't create a new type... Nice.

            serg Sergei Golubchik added a comment - Yes, it's an interesting idea. It practically creates a new type, but it kind of doesn't create a new type... Nice.

            I think this was meanwhile implemented:

            MariaDB [test]> create table t1 (a text, b json);
            Query OK, 0 rows affected (0.025 sec)
             
            MariaDB [test]> insert t1 values ('{"a":1}', '{"a":1}');
            Query OK, 1 row affected (0.003 sec)
             
            MariaDB [test]> select * from t1;
            +---------+---------+
            | a       | b       |
            +---------+---------+
            | {"a":1} | {"a":1} |
            +---------+---------+
            1 row in set (0.001 sec)
             
            MariaDB [test]> select json_object('name', a), json_object('name', b) from t1;
            +------------------------+------------------------+
            | json_object('name', a) | json_object('name', b) |
            +------------------------+------------------------+
            | {"name": "{\"a\":1}"}  | {"name": {"a":1}}      |
            +------------------------+------------------------+
            1 row in set (0.002 sec)
            

            serg Sergei Golubchik added a comment - I think this was meanwhile implemented: MariaDB [test]> create table t1 (a text, b json); Query OK, 0 rows affected (0.025 sec)   MariaDB [test]> insert t1 values ('{"a":1}', '{"a":1}'); Query OK, 1 row affected (0.003 sec)   MariaDB [test]> select * from t1; +---------+---------+ | a | b | +---------+---------+ | {"a":1} | {"a":1} | +---------+---------+ 1 row in set (0.001 sec)   MariaDB [test]> select json_object('name', a), json_object('name', b) from t1; +------------------------+------------------------+ | json_object('name', a) | json_object('name', b) | +------------------------+------------------------+ | {"name": "{\"a\":1}"} | {"name": {"a":1}} | +------------------------+------------------------+ 1 row in set (0.002 sec)

            People

              holyfoot Alexey Botchkov
              davep Dave Pullin
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.