Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25376

Inconsistent behavior of different JSON functions in regard to interpreting JSON document as invalid

    XMLWordPrintable

    Details

      Description

      In the test case below, the use of UNHEX function is unimportant for the scenario, it is there just to make it easy to copy-paste and use in different text editors. What hides in there is a structurally valid JSON document with a normal key, but with garbage symbols in the value (at least from the perspective of some character sets). Something like this:

      MariaDB [test]> SELECT UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A");
      +-----------------------------------------------+
      | UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") |
      +-----------------------------------------------+
      | { "a": "��" }
                                      |
      +-----------------------------------------------+
      1 row in set (0.000 sec)
      

      Please note though that it is not an artificial testing invention, it is an extract from an official JSON document published on the US gov website.

      Each of the following 4 queries uses it for the first argument of a JSON function, with conversion to an apparently incompatible character set.

      In MySQL, all of them behave the same way: they consider it a valid JSON document and treat it as such, additionally producing a warning "Invalid hebrew character string: 'BFBDEF'".

      In MariaDB however, every query behaves differently.

      • the first one says the document is invalid, no warnings;
      • the second one implies the document is invalid, produces a warning;
      • the third one produces an error.
      • the fourth one, with an impossible condition, returns an empty result set, without errors, warnings or any indication that we are dealing with an invalid document.

      SELECT JSON_VALID(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew));
      SELECT JSON_EXTRACT(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew), '$.*');
      SELECT * FROM JSON_TABLE(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew), '$.*' COLUMNS (o FOR ORDINALITY)) jt;
      SELECT * FROM JSON_TABLE(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew), '$.*' COLUMNS (o FOR ORDINALITY)) jt WHERE 0;
      

      MariaDB 10.2 545cba13

      MariaDB [test]> SELECT JSON_VALID(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew));
      +---------------------------------------------------------------------------------+
      | JSON_VALID(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew)) |
      +---------------------------------------------------------------------------------+
      |                                                                               0 |
      +---------------------------------------------------------------------------------+
      1 row in set (0.001 sec)
      

      MariaDB 10.2 545cba13

      MariaDB [test]> SELECT JSON_EXTRACT(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew), '$.*');
      +------------------------------------------------------------------------------------------+
      | JSON_EXTRACT(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew), '$.*') |
      +------------------------------------------------------------------------------------------+
      | NULL                                                                                     |
      +------------------------------------------------------------------------------------------+
      1 row in set, 1 warning (0.001 sec)
      MariaDB [test]> show warnings;
      +---------+------+---------------------------------------------------------------------------+
      | Level   | Code | Message                                                                   |
      +---------+------+---------------------------------------------------------------------------+
      | Warning | 4035 | Broken JSON string in argument 1 to function 'json_extract' at position 9 |
      +---------+------+---------------------------------------------------------------------------+
      1 row in set (0.000 sec)
      

      MariaDB 10.6 JSON_TABLE development tree

      MariaDB [test]> SELECT * FROM JSON_TABLE(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew), '$.*' COLUMNS (o FOR ORDINALITY)) jt;
      ERROR 4035 (HY000): Broken JSON string in argument 1 to function 'JSON_TABLE' at position 9
       
      MariaDB [test]> SELECT * FROM JSON_TABLE(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew), '$.*' COLUMNS (o FOR ORDINALITY)) jt WHERE 0;
      Empty set (0.001 sec)
      

      In MySQL:

      MySQL 8.0.23

      MySQL [(none)]> SELECT JSON_VALID(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew));
      +---------------------------------------------------------------------------------+
      | JSON_VALID(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew)) |
      +---------------------------------------------------------------------------------+
      |                                                                               1 |
      +---------------------------------------------------------------------------------+
      1 row in set, 1 warning (0.000 sec)
       
      MySQL [(none)]> show warnings;
      +---------+------+-------------------------------------------+
      | Level   | Code | Message                                   |
      +---------+------+-------------------------------------------+
      | Warning | 1300 | Invalid hebrew character string: 'BFBDEF' |
      +---------+------+-------------------------------------------+
      1 row in set (0.000 sec)
       
      MySQL [(none)]> SELECT JSON_EXTRACT(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew), '$.*');
      +------------------------------------------------------------------------------------------+
      | JSON_EXTRACT(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew), '$.*') |
      +------------------------------------------------------------------------------------------+
      | ["ן?½ן?½"]                                                                               |
      +------------------------------------------------------------------------------------------+
      1 row in set, 1 warning (0.000 sec)
       
      MySQL [(none)]> show warnings;
      +---------+------+-------------------------------------------+
      | Level   | Code | Message                                   |
      +---------+------+-------------------------------------------+
      | Warning | 1300 | Invalid hebrew character string: 'BFBDEF' |
      +---------+------+-------------------------------------------+
      1 row in set (0.000 sec)
       
      MySQL [(none)]> SELECT * FROM JSON_TABLE(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew), '$.*' COLUMNS (o FOR ORDINALITY)) jt;
      +------+
      | o    |
      +------+
      |    1 |
      +------+
      1 row in set, 1 warning (0.001 sec)
       
      MySQL [(none)]> show warnings;
      +---------+------+-------------------------------------------+
      | Level   | Code | Message                                   |
      +---------+------+-------------------------------------------+
      | Warning | 1300 | Invalid hebrew character string: 'BFBDEF' |
      +---------+------+-------------------------------------------+
      1 row in set (0.000 sec)
       
      MySQL [(none)]> SELECT * FROM JSON_TABLE(CONVERT(UNHEX("7B202261223A2022EFBFBDEFBFBD22207D0A") USING hebrew), '$.*' COLUMNS (o FOR ORDINALITY)) jt WHERE 0;
      Empty set, 1 warning (0.001 sec)
       
      MySQL [(none)]> show  warnings;
      +---------+------+-------------------------------------------+
      | Level   | Code | Message                                   |
      +---------+------+-------------------------------------------+
      | Warning | 1300 | Invalid hebrew character string: 'BFBDEF' |
      +---------+------+-------------------------------------------+
      1 row in set (0.000 sec)
      

        Attachments

          Activity

            People

            Assignee:
            holyfoot Alexey Botchkov
            Reporter:
            elenst Elena Stepanova
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Dates

              Created:
              Updated: