Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-22011

DISTINCT with JSON_ARRAYAGG gives wrong results

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.5
    • 10.5.4
    • Optimizer
    • None

    Description

      As JSON_ARRAYAGG is implemented as a wrapper over Item_func_group_concat, MDEV-11563 is going to affect JSON_ARRAYAGG too.

      The problem with DISTINCT in JSON_ARRAYAGG is that it expects the values read from the table for each record. It does not use the key but instead calls val_str() for all the arguments in the GROUP_CONCAT function.

      Attachments

        Issue Links

          Activity

            Now let us port the patch for MDEV-11563 on 10.5 (it is in branch 10.5-varun). The commit is https://github.com/MariaDB/server/commit/f968feed977bf6d684347308a36b7083175d6f55

            and now running the above test

            MariaDB [test]> set tmp_table_size=1024;
            Query OK, 0 rows affected (0.001 sec)
             
            MariaDB [test]> select JSON_ARRAYAGG(distinct a) from t1;
            +---------------------------+
            | JSON_ARRAYAGG(distinct a) |
            +---------------------------+
            | ["0","1","2","3","4","5"] |
            +---------------------------+
            1 row in set (0.012 sec)
            

            I get the correct output.

            varun Varun Gupta (Inactive) added a comment - Now let us port the patch for MDEV-11563 on 10.5 (it is in branch 10.5-varun). The commit is https://github.com/MariaDB/server/commit/f968feed977bf6d684347308a36b7083175d6f55 and now running the above test MariaDB [test]> set tmp_table_size=1024; Query OK, 0 rows affected (0.001 sec)   MariaDB [test]> select JSON_ARRAYAGG(distinct a) from t1; +---------------------------+ | JSON_ARRAYAGG(distinct a) | +---------------------------+ | ["0","1","2","3","4","5"] | +---------------------------+ 1 row in set (0.012 sec) I get the correct output.

            But even with the above port, the NULLS are still not handled with DISTINCT

            CREATE TABLE t1(a INT, b INT);
            INSERT INTO t1 VALUES (1,1), (2,2), (3,3);
            INSERT INTO t1 VALUES (1,1), (2,2), (3,3);
            INSERT INTO t1 VALUES (NULL,NULL), (NULL,NULL);
            

            MariaDB [test]> SELECT JSON_ARRAYAGG(a) FROM t1;
            +-------------------------+
            | JSON_ARRAYAGG(a)        |
            +-------------------------+
            | [1,2,3,1,2,3,null,null] |
            +-------------------------+
            

            This is correct, now let try DISTINCT

            MariaDB [test]> SELECT JSON_ARRAYAGG(DISTINCT a) FROM t1;
            +---------------------------+
            | JSON_ARRAYAGG(DISTINCT a) |
            +---------------------------+
            | [null,null,null,null]     |
            +---------------------------+
            1 row in set (0.001 sec)
            

            So we get incorrect results with NULL value, even after the patch.
            The code according to me cannot handle NULL values right now for JSON_ARRAYAGG function

            varun Varun Gupta (Inactive) added a comment - But even with the above port, the NULLS are still not handled with DISTINCT CREATE TABLE t1(a INT, b INT); INSERT INTO t1 VALUES ( 1 , 1 ), ( 2 , 2 ), ( 3 , 3 ); INSERT INTO t1 VALUES ( 1 , 1 ), ( 2 , 2 ), ( 3 , 3 ); INSERT INTO t1 VALUES (NULL,NULL), (NULL,NULL); MariaDB [test]> SELECT JSON_ARRAYAGG(a) FROM t1; +-------------------------+ | JSON_ARRAYAGG(a) | +-------------------------+ | [1,2,3,1,2,3,null,null] | +-------------------------+ This is correct, now let try DISTINCT MariaDB [test]> SELECT JSON_ARRAYAGG(DISTINCT a) FROM t1; +---------------------------+ | JSON_ARRAYAGG(DISTINCT a) | +---------------------------+ | [null,null,null,null] | +---------------------------+ 1 row in set (0.001 sec) So we get incorrect results with NULL value, even after the patch. The code according to me cannot handle NULL values right now for JSON_ARRAYAGG function

            Patches are pushed to the branch 10.5-varun2.
            It contains the patch for both MDEV-22011 and MDEV-22840.

            varun Varun Gupta (Inactive) added a comment - Patches are pushed to the branch 10.5-varun2. It contains the patch for both MDEV-22011 and MDEV-22840 .
            varun Varun Gupta (Inactive) added a comment - Patch https://github.com/MariaDB/server/commit/c127f5021ccc7cebc09aebfdbe841e630993f12d

            Review input:
            https://lists.launchpad.net/maria-developers/msg12275.html

            Ok to push after addressed.

            psergei Sergei Petrunia added a comment - Review input: https://lists.launchpad.net/maria-developers/msg12275.html Ok to push after addressed.

            People

              varun Varun Gupta (Inactive)
              varun Varun Gupta (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.