Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-11297

Add support for LIMIT clause in GROUP_CONCAT()

Details

    • 10.2.4-4, 10.1.20, 10.2.4-5, 10.2.4-1, 10.2.4-2, 10.3.3-1

    Description

      GROUP_CONCAT supports ORDER BY and DISTINCT but not LIMIT clause.
      Supporting LIMIT N[,M] could simplify queries.

      something like

      SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",date,cnt) ORDER BY cnt DESC),",",1)

      could be written as

      GROUP_CONCAT(CONCAT_WS(":",date,cnt) ORDER BY cnt DESC LIMIT 1)

      this can be used in timeseries-like calculations. Suppose we have a table (date,cnt) and we want to produce a list of dates, for each year, where cnt was maximum for that year, showing such date,cnt for each year

      currently this is written in an awkward manner and is also quite inefficient as we SELECT much more data that is necessary for the task:

      SELECT LEFT(col1,4) AS _yyyy
      , SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",DATE_FORMAT(col1,"%a %Y-%m-%d"),LPAD(col2,8," ")) ORDER BY col2 DESC),",",1) AS col3
      FROM t1
      GROUP BY 1
      ;

      Attachments

        Issue Links

          Activity

            varun Varun Gupta (Inactive) created issue -

            > Supporting LIMIT N[,M] could simplify queries,

            This is the goal of this MDEV.

            > avoid passing lots of data to the client
            This is already achieved by using SUBSTRING_INDEX

            > and may be optimized on the server side by not extracting complete GROUP BY ... ORDER BY result.

            This is outside of scope of this MDEV.

            Let's limit this task to providing LIMIT syntax, and query results.

            Loose-scan like optimization for reading only a few rows per group is theoretically possible, but hard to do in the current MariaDB's (or MySQL's) GROUP/ORDER BY optimizer.

            psergei Sergei Petrunia added a comment - > Supporting LIMIT N [,M] could simplify queries, This is the goal of this MDEV. > avoid passing lots of data to the client This is already achieved by using SUBSTRING_INDEX > and may be optimized on the server side by not extracting complete GROUP BY ... ORDER BY result. This is outside of scope of this MDEV. Let's limit this task to providing LIMIT syntax, and query results. Loose-scan like optimization for reading only a few rows per group is theoretically possible, but hard to do in the current MariaDB's (or MySQL's) GROUP/ORDER BY optimizer.
            varun Varun Gupta (Inactive) made changes -
            Field Original Value New Value
            Description GROUP_CONCAT supports ORDER BY and DISTINCT but not LIMIT clause.
            Supporting LIMIT N[,M] could simplify queries, avoid passing lots of data to the client and may be optimized on the server side by not extracting complete GROUP BY ... ORDER BY result.
            Example:

            something like

            SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",date,cnt) ORDER BY cnt DESC),",",1)

            could be written as

            GROUP_CONCAT(CONCAT_WS(":",date,cnt) ORDER BY cnt DESC LIMIT 1)

            this can be used in timeseries-like calculations. Suppose we have a table (date,cnt) and we want to produce a list of dates, for each year, where cnt was maximum for that year, showing such date,cnt for each year

            currently this is written in an awkward manner and is also quite inefficient as we SELECT much more data that is necessary for the task:

            SELECT LEFT(col1,4) AS _yyyy
            , SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",DATE_FORMAT(col1,"%a %Y-%m-%d"),LPAD(col2,8," ")) ORDER BY col2 DESC),",",1) AS col3
            FROM t1
            GROUP BY 1
            ;
            GROUP_CONCAT supports ORDER BY and DISTINCT but not LIMIT clause.
            Supporting LIMIT N[,M] could simplify queries.

            something like

            SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",date,cnt) ORDER BY cnt DESC),",",1)

            could be written as

            GROUP_CONCAT(CONCAT_WS(":",date,cnt) ORDER BY cnt DESC LIMIT 1)

            this can be used in timeseries-like calculations. Suppose we have a table (date,cnt) and we want to produce a list of dates, for each year, where cnt was maximum for that year, showing such date,cnt for each year

            currently this is written in an awkward manner and is also quite inefficient as we SELECT much more data that is necessary for the task:

            SELECT LEFT(col1,4) AS _yyyy
            , SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",DATE_FORMAT(col1,"%a %Y-%m-%d"),LPAD(col2,8," ")) ORDER BY col2 DESC),",",1) AS col3
            FROM t1
            GROUP BY 1
            ;
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.2.4-3 [ 115 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked higher
            varun Varun Gupta (Inactive) made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            varun Varun Gupta (Inactive) made changes -
            Summary add support for LIMIT clause in GROUP_CONCAT() Add support for LIMIT clause in GROUP_CONCAT()
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.2.4-3 [ 115 ] 10.2.4-4 [ 117 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked higher
            varun Varun Gupta (Inactive) added a comment - The progress can be tracked here: https://github.com/MariaDB/server/tree/bb-10.2-mdev11297
            varun Varun Gupta (Inactive) made changes -
            Assignee Varun [ varun ] Sergei Petrunia [ psergey ]
            Status In Progress [ 3 ] In Review [ 10002 ]
            psergei Sergei Petrunia made changes -
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.2.4-4 [ 117 ] 10.2.4-4, 10.1.20 [ 117, 119 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked higher

            Review feedback was provided over email. Re-assigning to Varun as the ball is on his side now.

            psergei Sergei Petrunia added a comment - Review feedback was provided over email. Re-assigning to Varun as the ball is on his side now.
            psergei Sergei Petrunia made changes -
            Assignee Sergei Petrunia [ psergey ] Varun [ varun ]
            Status In Review [ 10002 ] Stalled [ 10000 ]

            Made changes according to the review, sent again for review to Sergei

            varun Varun Gupta (Inactive) added a comment - Made changes according to the review, sent again for review to Sergei
            varun Varun Gupta (Inactive) made changes -
            Assignee Varun [ varun ] Sergei Petrunia [ psergey ]
            serg Sergei Golubchik made changes -
            Priority Minor [ 4 ] Major [ 3 ]
            serg Sergei Golubchik made changes -
            Component/s Optimizer [ 10200 ]
            varun Varun Gupta (Inactive) made changes -
            Assignee Sergei Petrunia [ psergey ] Varun [ varun ]
            varun Varun Gupta (Inactive) made changes -
            Status Stalled [ 10000 ] In Progress [ 3 ]
            varun Varun Gupta (Inactive) made changes -
            varun Varun Gupta (Inactive) made changes -
            Status In Progress [ 3 ] Stalled [ 10000 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.2.4-4, 10.1.20 [ 117, 119 ] 10.2.4-4, 10.1.20, 5.5.54 [ 117, 119, 123 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked higher

            See also: MDEV-11563. It turns out, GROUP_CONCAT(DISTINCT x ORDER BY y) doesn't work. We've discovered that when discussing how DISTINCT ... ORDER BY ...LIMIT should be done.

            psergei Sergei Petrunia added a comment - See also: MDEV-11563 . It turns out, GROUP_CONCAT(DISTINCT x ORDER BY y) doesn't work. We've discovered that when discussing how DISTINCT ... ORDER BY ...LIMIT should be done.
            psergei Sergei Petrunia made changes -
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.2.4-4, 10.1.20, 5.5.54 [ 117, 119, 123 ] 10.2.4-4, 10.1.20, 10.2.4-5 [ 117, 119, 125 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked lower
            varun Varun Gupta (Inactive) made changes -
            Status Stalled [ 10000 ] In Progress [ 3 ]
            varun Varun Gupta (Inactive) made changes -
            Assignee Varun [ varun ] Sergei Petrunia [ psergey ]
            Status In Progress [ 3 ] In Review [ 10002 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.2.4-4, 10.1.20, 10.2.4-5 [ 117, 119, 125 ] 10.2.4-4, 10.1.20, 10.2.4-5, 10.2.4-1 [ 117, 119, 125, 132 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked lower
            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.2.4-4, 10.1.20, 10.2.4-5, 10.2.4-1 [ 117, 119, 125, 132 ] 10.2.4-4, 10.1.20, 10.2.4-5, 10.2.4-1, 10.2.4-2 [ 117, 119, 125, 132, 134 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.3.0 [ 22127 ]
            serg Sergei Golubchik made changes -
            Sprint 10.2.4-4, 10.1.20, 10.2.4-5, 10.2.4-1, 10.2.4-2 [ 117, 119, 125, 132, 134 ] 10.2.4-4, 10.1.20, 10.2.4-5, 10.2.4-1, 10.2.4-2, 10.3.3-1 [ 117, 119, 125, 132, 134, 200 ]

            The branch was based on 10.2, so a rebase is done on top of 10.3.
            The code is currently pushed to the branch 10.3-varun

            varun Varun Gupta (Inactive) added a comment - The branch was based on 10.2, so a rebase is done on top of 10.3. The code is currently pushed to the branch 10.3-varun
            varun Varun Gupta (Inactive) made changes -
            Assignee Sergei Petrunia [ psergey ] Oleksandr Byelkin [ sanja ]
            varun Varun Gupta (Inactive) made changes -
            Assignee Oleksandr Byelkin [ sanja ] Varun Gupta [ varun ]
            varun Varun Gupta (Inactive) made changes -
            Status In Review [ 10002 ] Stalled [ 10000 ]
            varun Varun Gupta (Inactive) made changes -
            Status Stalled [ 10000 ] In Progress [ 3 ]

            Add yet another test like:

            set @x=-1;
            execute STMT using @x;
            

            and if it return error it is ok to push (if no let us talk)

            sanja Oleksandr Byelkin added a comment - Add yet another test like: set @x=-1; execute STMT using @x; and if it return error it is ok to push (if no let us talk)

            pushed the code to bb-10.3-varun

            varun Varun Gupta (Inactive) added a comment - pushed the code to bb-10.3-varun
            varun Varun Gupta (Inactive) made changes -
            Fix Version/s 10.3.3 [ 22644 ]
            Fix Version/s 10.3 [ 22126 ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Closed [ 6 ]
            antoonp A Prins added a comment -

            LIMIT clause in GROUP_CONCAT() does not work in a view.
            This can be tested using the example on https://mariadb.com/kb/en/library/group_concat/. If you create a view from the statement 'SELECT SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",dd,cc) ORDER BY cc DESC),",",1) FROM d;' than LIMIT does not do anything. Run the statement on its own and LIMIT works. I tested this in 10.3 and 10.4 docker container versions.
            Is that how it should be? I would like to be able to use LIMIT in a view.

            antoonp A Prins added a comment - LIMIT clause in GROUP_CONCAT() does not work in a view. This can be tested using the example on https://mariadb.com/kb/en/library/group_concat/ . If you create a view from the statement 'SELECT SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",dd,cc) ORDER BY cc DESC),",",1) FROM d;' than LIMIT does not do anything. Run the statement on its own and LIMIT works. I tested this in 10.3 and 10.4 docker container versions. Is that how it should be? I would like to be able to use LIMIT in a view.

            antoonp, thanks for reporting the bug, I have opened issue MDEV-18943 for this bug.

            varun Varun Gupta (Inactive) added a comment - antoonp , thanks for reporting the bug, I have opened issue MDEV-18943 for this bug.
            varun Varun Gupta (Inactive) made changes -
            varun Varun Gupta (Inactive) made changes -
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 78321 ] MariaDB v4 [ 133002 ]
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 133093

            People

              varun Varun Gupta (Inactive)
              varun Varun Gupta (Inactive)
              Votes:
              2 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.