[MDEV-20057] Distinct SUM on CROSS JOIN and grouped returns wrong result Created: 2019-07-12 Updated: 2023-07-26 Resolved: 2023-02-17 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Optimizer |
| Affects Version/s: | 5.5, 10.0, 10.1, 10.4.6, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10 |
| Fix Version/s: | 10.11.3, 11.0.1, 10.4.29, 10.5.20, 10.6.13, 10.7.8, 10.8.8, 10.9.6, 10.10.4 |
| Type: | Bug | Priority: | Major |
| Reporter: | - | Assignee: | Michael Widenius |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | 11.0-sel, optimizer, upstream | ||
| Environment: |
Ubuntu 18.04, running MariaDB 10.4.6 in Docker |
||
| Issue Links: |
|
||||||||
| Description |
|
This sql query returns a wrong result:
Postgres 11.3 returns (1, 0) and (1,1), MariaDB in Version 10.4.6 and MySQL in Version 8.0.16 both only return (1, 1). Manually verified, the Postgres result is the correct one (group by results into two rows, sum > 2 is one for one group and false for the other. Therefore, I expect that MariaDB should also return 1,1 and 1,0. Removing one DISTINCT, one of the columns or > 2 removes the issue. I reduced the testcase as much as possible, therefore I replaced columns with constant if possible. The program now looks a bit constructed, but with columns, the bug also occurs. |
| Comments |
| Comment by Alice Sherepa [ 2019-10-10 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Thanks for the report! Reproducible on 5.5-10.4:
| |||||||||||||||||||||||||||||||||||||||||||
| Comment by Michael Widenius [ 2023-02-15 ] | |||||||||||||||||||||||||||||||||||||||||||
|
This test case exposed a serious bug in MariaDB distinct handling. What was not handled was SELECT DISTINCT on expressions with sum functions. Will be fixed by extending remove_dup_with_hash_index() and remove_dup_with_compare() to take into account the columns in the result list and not only work on the fields in the internal temporary table (which only stores the values of the individual sum() operations). Note that in many cases the above dup removal functions are not used as the optimizer may be able to either remove duplicates early or it will discover that duplicate remove is not needed, for example if the group by fields is part of the result. | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Michael Widenius [ 2023-02-16 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Waiting for review by Sergei Petrunia | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Michael Widenius [ 2023-02-17 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Pushed to 10.4 |