Details
-
Task
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
None
-
10.2.4-4, 10.1.20, 10.2.4-5, 10.2.4-1, 10.2.4-2, 10.3.3-1
Description
GROUP_CONCAT supports ORDER BY and DISTINCT but not LIMIT clause.
Supporting LIMIT N[,M] could simplify queries.
something like
SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",date,cnt) ORDER BY cnt DESC),",",1)
could be written as
GROUP_CONCAT(CONCAT_WS(":",date,cnt) ORDER BY cnt DESC LIMIT 1)
this can be used in timeseries-like calculations. Suppose we have a table (date,cnt) and we want to produce a list of dates, for each year, where cnt was maximum for that year, showing such date,cnt for each year
currently this is written in an awkward manner and is also quite inefficient as we SELECT much more data that is necessary for the task:
SELECT LEFT(col1,4) AS _yyyy
, SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",DATE_FORMAT(col1,"%a %Y-%m-%d"),LPAD(col2,8," ")) ORDER BY col2 DESC),",",1) AS col3
FROM t1
GROUP BY 1
;
Attachments
Issue Links
- duplicates
-
MDEV-5556 LIMIT clause in GROUP_CONCAT function
-
- Closed
-
- relates to
-
MDEV-5556 LIMIT clause in GROUP_CONCAT function
-
- Closed
-
-
MDEV-11563 GROUP_CONCAT(DISTINCT ...) may produce a non-distinct list
-
- Closed
-
Activity
Field | Original Value | New Value |
---|---|---|
Description |
GROUP_CONCAT supports ORDER BY and DISTINCT but not LIMIT clause.
Supporting LIMIT N[,M] could simplify queries, avoid passing lots of data to the client and may be optimized on the server side by not extracting complete GROUP BY ... ORDER BY result. Example: something like SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",date,cnt) ORDER BY cnt DESC),",",1) could be written as GROUP_CONCAT(CONCAT_WS(":",date,cnt) ORDER BY cnt DESC LIMIT 1) this can be used in timeseries-like calculations. Suppose we have a table (date,cnt) and we want to produce a list of dates, for each year, where cnt was maximum for that year, showing such date,cnt for each year currently this is written in an awkward manner and is also quite inefficient as we SELECT much more data that is necessary for the task: SELECT LEFT(col1,4) AS _yyyy , SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",DATE_FORMAT(col1,"%a %Y-%m-%d"),LPAD(col2,8," ")) ORDER BY col2 DESC),",",1) AS col3 FROM t1 GROUP BY 1 ; |
GROUP_CONCAT supports ORDER BY and DISTINCT but not LIMIT clause.
Supporting LIMIT N[,M] could simplify queries. something like SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",date,cnt) ORDER BY cnt DESC),",",1) could be written as GROUP_CONCAT(CONCAT_WS(":",date,cnt) ORDER BY cnt DESC LIMIT 1) this can be used in timeseries-like calculations. Suppose we have a table (date,cnt) and we want to produce a list of dates, for each year, where cnt was maximum for that year, showing such date,cnt for each year currently this is written in an awkward manner and is also quite inefficient as we SELECT much more data that is necessary for the task: SELECT LEFT(col1,4) AS _yyyy , SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",DATE_FORMAT(col1,"%a %Y-%m-%d"),LPAD(col2,8," ")) ORDER BY col2 DESC),",",1) AS col3 FROM t1 GROUP BY 1 ; |
Sprint | 10.2.4-3 [ 115 ] |
Rank | Ranked higher |
Status | Open [ 1 ] | In Progress [ 3 ] |
Summary | add support for LIMIT clause in GROUP_CONCAT() | Add support for LIMIT clause in GROUP_CONCAT() |
Sprint | 10.2.4-3 [ 115 ] | 10.2.4-4 [ 117 ] |
Rank | Ranked higher |
Assignee | Varun [ varun ] | Sergei Petrunia [ psergey ] |
Status | In Progress [ 3 ] | In Review [ 10002 ] |
Sprint | 10.2.4-4 [ 117 ] | 10.2.4-4, 10.1.20 [ 117, 119 ] |
Rank | Ranked higher |
Assignee | Sergei Petrunia [ psergey ] | Varun [ varun ] |
Status | In Review [ 10002 ] | Stalled [ 10000 ] |
Assignee | Varun [ varun ] | Sergei Petrunia [ psergey ] |
Priority | Minor [ 4 ] | Major [ 3 ] |
Component/s | Optimizer [ 10200 ] |
Assignee | Sergei Petrunia [ psergey ] | Varun [ varun ] |
Status | Stalled [ 10000 ] | In Progress [ 3 ] |
Link |
This issue is blocked by |
Status | In Progress [ 3 ] | Stalled [ 10000 ] |
Sprint | 10.2.4-4, 10.1.20 [ 117, 119 ] | 10.2.4-4, 10.1.20, 5.5.54 [ 117, 119, 123 ] |
Rank | Ranked higher |
Link |
This issue relates to |
Sprint | 10.2.4-4, 10.1.20, 5.5.54 [ 117, 119, 123 ] | 10.2.4-4, 10.1.20, 10.2.4-5 [ 117, 119, 125 ] |
Rank | Ranked lower |
Status | Stalled [ 10000 ] | In Progress [ 3 ] |
Assignee | Varun [ varun ] | Sergei Petrunia [ psergey ] |
Status | In Progress [ 3 ] | In Review [ 10002 ] |
Sprint | 10.2.4-4, 10.1.20, 10.2.4-5 [ 117, 119, 125 ] | 10.2.4-4, 10.1.20, 10.2.4-5, 10.2.4-1 [ 117, 119, 125, 132 ] |
Rank | Ranked lower |
Sprint | 10.2.4-4, 10.1.20, 10.2.4-5, 10.2.4-1 [ 117, 119, 125, 132 ] | 10.2.4-4, 10.1.20, 10.2.4-5, 10.2.4-1, 10.2.4-2 [ 117, 119, 125, 132, 134 ] |
Fix Version/s | 10.3 [ 22126 ] | |
Fix Version/s | 10.3.0 [ 22127 ] |
Sprint | 10.2.4-4, 10.1.20, 10.2.4-5, 10.2.4-1, 10.2.4-2 [ 117, 119, 125, 132, 134 ] | 10.2.4-4, 10.1.20, 10.2.4-5, 10.2.4-1, 10.2.4-2, 10.3.3-1 [ 117, 119, 125, 132, 134, 200 ] |
Assignee | Sergei Petrunia [ psergey ] | Oleksandr Byelkin [ sanja ] |
Assignee | Oleksandr Byelkin [ sanja ] | Varun Gupta [ varun ] |
Status | In Review [ 10002 ] | Stalled [ 10000 ] |
Status | Stalled [ 10000 ] | In Progress [ 3 ] |
Fix Version/s | 10.3.3 [ 22644 ] | |
Fix Version/s | 10.3 [ 22126 ] | |
Resolution | Fixed [ 1 ] | |
Status | In Progress [ 3 ] | Closed [ 6 ] |
Link |
This issue is blocked by |
Workflow | MariaDB v3 [ 78321 ] | MariaDB v4 [ 133002 ] |
Zendesk Related Tickets | 133093 |
> Supporting LIMIT N[,M] could simplify queries,
This is the goal of this MDEV.
> avoid passing lots of data to the client
This is already achieved by using SUBSTRING_INDEX
> and may be optimized on the server side by not extracting complete GROUP BY ... ORDER BY result.
This is outside of scope of this MDEV.
Let's limit this task to providing LIMIT syntax, and query results.
Loose-scan like optimization for reading only a few rows per group is theoretically possible, but hard to do in the current MariaDB's (or MySQL's) GROUP/ORDER BY optimizer.