[MDEV-11563] GROUP_CONCAT(DISTINCT ...) may produce a non-distinct list Created: 2016-12-14 Updated: 2020-06-12 Resolved: 2020-06-09 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Optimizer |
| Affects Version/s: | 5.5, 10.0, 10.1, 10.2 |
| Fix Version/s: | 10.5.4, 10.3.24, 10.4.14 |
| Type: | Bug | Priority: | Major |
| Reporter: | Varun Gupta (Inactive) | Assignee: | Varun Gupta (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||||||||||||||||||
| Sprint: | 10.2.4-5, 10.2.7-1 | ||||||||||||||||||||||||
| Description |
|
Here is a simple test case demonstrating the problem. This problem is hit only in the case when records from all
|
| Comments |
| Comment by Varun Gupta (Inactive) [ 2016-12-18 ] | |||||||||||||||||||||||||||||||||||
|
The above test Works on MYSQL-5.7 | |||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2016-12-25 ] | |||||||||||||||||||||||||||||||||||
|
The unique-ness is used by Item_func_group_concat::unique_filter. Grep for 'unique_filter' in sql/*.{h,cc}. One can find
And that's it. There is no Unique::get() call anywhere. Checking unique_filter->elements_in_tree() before/after the Unique::unique_add() works only when the Unique object has one tree. When multiple trees are present, DISTINCT will not work for elements in the different trees. | |||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2016-12-25 ] | |||||||||||||||||||||||||||||||||||
|
I've tried digging in the history of this code. It has been written ages ago (about 2006); it doesn't look like DISTINCT was ever properly supported in GROUP_CONCAT. | |||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2016-12-25 ] | |||||||||||||||||||||||||||||||||||
|
A general approach to compute
is to
One can take shortcuts in various special cases. One of them is @@group_concat_max_len, which is defined as
This allows to infer that Unique object will accomodate all the data in one tree | |||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2017-08-03 ] | |||||||||||||||||||||||||||||||||||
|
When we limit the size of memory for the tree used for distinct in GROUP, we see a non-unique list
| |||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2017-08-03 ] | |||||||||||||||||||||||||||||||||||
|
After porting the patch from mysql, I see that distinct with GROUP_CONCAT now crashes.
| |||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2017-08-04 ] | |||||||||||||||||||||||||||||||||||
|
In the function Unique::Unique
| |||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2019-05-02 ] | |||||||||||||||||||||||||||||||||||
|
Patch | |||||||||||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2020-03-05 ] | |||||||||||||||||||||||||||||||||||
|
Another test case. The effect is the opposite, distinct values are removed from the results set. Same root cause:
| |||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2020-06-07 ] | |||||||||||||||||||||||||||||||||||
|
Review input provided over email. Ok to push after it is addressed. | |||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2020-06-08 ] | |||||||||||||||||||||||||||||||||||
|
Have made a patch for 10.3 currently, the patch for 10.3 needs extra handling for the LIMIT clause of GROUP_CONCAT which was added in | |||||||||||||||||||||||||||||||||||
| Comment by Varun Gupta (Inactive) [ 2020-06-09 ] | |||||||||||||||||||||||||||||||||||
|
Fix is pushed to 10.3, if needed for earlier versions this can be backported to 10.1 and 10.2 (minor changes are required for that) |