[MDEV-11297] Add support for LIMIT clause in GROUP_CONCAT() Created: 2016-11-16  Updated: 2020-08-25  Resolved: 2017-12-08

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Fix Version/s: 10.3.3

Type: Task Priority: Major
Reporter: Varun Gupta (Inactive) Assignee: Varun Gupta (Inactive)
Resolution: Fixed Votes: 2
Labels: None

Issue Links:
Duplicate
duplicates MDEV-5556 LIMIT clause in GROUP_CONCAT function Closed
Relates
relates to MDEV-5556 LIMIT clause in GROUP_CONCAT function Closed
relates to MDEV-11563 GROUP_CONCAT(DISTINCT ...) may produ... Closed
Sprint: 10.2.4-4, 10.1.20, 10.2.4-5, 10.2.4-1, 10.2.4-2, 10.3.3-1

 Description   

GROUP_CONCAT supports ORDER BY and DISTINCT but not LIMIT clause.
Supporting LIMIT N[,M] could simplify queries.

something like

SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",date,cnt) ORDER BY cnt DESC),",",1)

could be written as

GROUP_CONCAT(CONCAT_WS(":",date,cnt) ORDER BY cnt DESC LIMIT 1)

this can be used in timeseries-like calculations. Suppose we have a table (date,cnt) and we want to produce a list of dates, for each year, where cnt was maximum for that year, showing such date,cnt for each year

currently this is written in an awkward manner and is also quite inefficient as we SELECT much more data that is necessary for the task:

SELECT LEFT(col1,4) AS _yyyy
, SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",DATE_FORMAT(col1,"%a %Y-%m-%d"),LPAD(col2,8," ")) ORDER BY col2 DESC),",",1) AS col3
FROM t1
GROUP BY 1
;



 Comments   
Comment by Sergei Petrunia [ 2016-11-17 ]

> Supporting LIMIT N[,M] could simplify queries,

This is the goal of this MDEV.

> avoid passing lots of data to the client
This is already achieved by using SUBSTRING_INDEX

> and may be optimized on the server side by not extracting complete GROUP BY ... ORDER BY result.

This is outside of scope of this MDEV.

Let's limit this task to providing LIMIT syntax, and query results.

Loose-scan like optimization for reading only a few rows per group is theoretically possible, but hard to do in the current MariaDB's (or MySQL's) GROUP/ORDER BY optimizer.

Comment by Varun Gupta (Inactive) [ 2016-11-28 ]

The progress can be tracked here:

https://github.com/MariaDB/server/tree/bb-10.2-mdev11297

Comment by Sergei Petrunia [ 2016-12-02 ]

Review feedback was provided over email. Re-assigning to Varun as the ball is on his side now.

Comment by Varun Gupta (Inactive) [ 2016-12-03 ]

Made changes according to the review, sent again for review to Sergei

Comment by Sergei Petrunia [ 2016-12-15 ]

See also: MDEV-11563. It turns out, GROUP_CONCAT(DISTINCT x ORDER BY y) doesn't work. We've discovered that when discussing how DISTINCT ... ORDER BY ...LIMIT should be done.

Comment by Varun Gupta (Inactive) [ 2017-11-26 ]

The branch was based on 10.2, so a rebase is done on top of 10.3.
The code is currently pushed to the branch 10.3-varun

Comment by Oleksandr Byelkin [ 2017-12-07 ]

Add yet another test like:

set @x=-1;
execute STMT using @x;

and if it return error it is ok to push (if no let us talk)

Comment by Varun Gupta (Inactive) [ 2017-12-07 ]

pushed the code to bb-10.3-varun

Comment by A Prins [ 2019-03-15 ]

LIMIT clause in GROUP_CONCAT() does not work in a view.
This can be tested using the example on https://mariadb.com/kb/en/library/group_concat/. If you create a view from the statement 'SELECT SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",dd,cc) ORDER BY cc DESC),",",1) FROM d;' than LIMIT does not do anything. Run the statement on its own and LIMIT works. I tested this in 10.3 and 10.4 docker container versions.
Is that how it should be? I would like to be able to use LIMIT in a view.

Comment by Varun Gupta (Inactive) [ 2019-03-15 ]

antoonp, thanks for reporting the bug, I have opened issue MDEV-18943 for this bug.

Generated at Thu Feb 08 07:48:52 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.