Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-16869

String functions don't respect character set of JSON_VALUE

    XMLWordPrintable

Details

    Description

      Let's say that we have the following table:

      CREATE TABLE json_test (
      json_col TEXT
      ) DEFAULT CHARSET=latin1;
       
      INSERT INTO json_test VALUES ('{"key1":"Sí"}');
      

      We can get the value of key1 by executing the following:

      SELECT JSON_VALUE(json_col, '$.key1') FROM json_test;
      

      And this works fine:

      MariaDB [db1]> SELECT JSON_VALUE(json_col, '$.key1') FROM json_test;
      +--------------------------------+
      | JSON_VALUE(json_col, '$.key1') |
      +--------------------------------+
      | Sí                             |
      +--------------------------------+
      1 row in set (0.00 sec)
      

      But if we combine this with string functions, then the character set gets messed up. For example, run the following queries instead:

      SELECT REPLACE(JSON_VALUE(json_col, '$.key1'), 'null', '') FROM json_test;
      SELECT CONCAT(json_value(json_col, '$.key1'), 'a') FROM json_test;
      

      And we get something unexpected:

      MariaDB [db1]> SELECT REPLACE(JSON_VALUE(json_col, '$.key1'), 'null', '') FROM json_test;
      +-----------------------------------------------------+
      | REPLACE(JSON_VALUE(json_col, '$.key1'), 'null', '') |
      +-----------------------------------------------------+
      | Sí                                                 |
      +-----------------------------------------------------+
      1 row in set (0.00 sec)
       
      MariaDB [db1]> SELECT CONCAT(json_value(json_col, '$.key1'), 'a') FROM json_test;
      +---------------------------------------------+
      | CONCAT(json_value(json_col, '$.key1'), 'a') |
      +---------------------------------------------+
      | Sía                                        |
      +---------------------------------------------+
      1 row in set (0.00 sec)
      

      We can workaround this problem by explicitly converting the JSON_VALUE return value to a given character set:

      MariaDB [db1]> SELECT REPLACE(CONVERT(JSON_VALUE(json_col, '$.key1') USING utf8), 'null', '') FROM json_test;
      +-------------------------------------------------------------------------+
      | REPLACE(CONVERT(JSON_VALUE(json_col, '$.key1') USING utf8), 'null', '') |
      +-------------------------------------------------------------------------+
      | Sí                                                                      |
      +-------------------------------------------------------------------------+
      1 row in set (0.00 sec)
      

      Attachments

        Activity

          People

            holyfoot Alexey Botchkov
            GeoffMontee Geoff Montee (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.