[MDEV-19134] EXISTS() slower if ORDER BY is defined Created: 2019-04-02 Updated: 2019-05-19 Resolved: 2019-05-16 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Data Manipulation - Subquery, Optimizer |
| Affects Version/s: | 10.1.38, 10.3.13 |
| Fix Version/s: | 10.4.5 |
| Type: | Bug | Priority: | Major |
| Reporter: | David Rodrigues | Assignee: | Sergei Petrunia |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | performance | ||
| Description |
|
In some specific cases, EXISTS() will slowdown if ORDER BY is defined in the subquery. Why is the reason to apply an ORDER BY to EXISTS()? None. But Laravel does that and I can't control it. The question is: ORDER BY inside of an EXISTS() will slowdown in a specific case. I think that the optimizer is not ignoring that. Setup:
Populate:
Slow query:
Cleanup:
What I realized:
And about the parameter column: I created this column just to make sure that it will returns only the `table_a.id = 123` as an additional parameter. In this case, I could run a different query with the same behaviour using `IN()` that will not will slowdown (0.000s vs. 1.516s for 1250 x 1250 rows).
Explain: The EXPLAIN is a bit different.
About ORDER BY insde EXISTS() This query is generated by Laravel framework when we use the whereHas() method. Because of a Scope it add the ORDER BY in this place. As workaround I have modified the code to use IN() instead of EXISTS(), but I think that it could be optimized to work fine in all the cases (even with ORDER BY). |
| Comments |
| Comment by Sergei Petrunia [ 2019-05-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
ANALYZE FORMAT=JSON output on the current 10.1.40:
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
If I remove the ORDER BY, the subquery is merged into a semi-join:
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
10.4 (debug build)
note that I don't get the sorting step here (but the bug report says it's done?) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Ok, If I run the ANALYZE FORMAT=JSON immediately after the INSERT statements (without a server restart in the middle), I get the sort step, too:
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
ORDER BY without LIMIT can be removed for EXISTS/IN subqueries. For IN subqueries, LIMIT is not a concern, because MariaDB (and MySQL) doesn't support them. The queries will be rejected with this error:
EXISTS subqueries allow LIMIT. "ORDER BY ... LIMIT n" should be removable. "ORDER BY ... LIMIT x OFFSET y" is not removable. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Found this piece in item_subselect.cc, Item_in_subselect::select_in_like_transformer
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Investigated. The optimizer does these query rewrites in this order:
For the EXISTS subquery from this bug report, the following happens:
In explain, we see sorting, and no merging. For the IN-subquery from this bug report, the following happens:
In explain, we see that sorting is not done, but merging is done. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Committed a patch that makes the order of transformations to be:
Now, for the IN-subquery from this bug report:
and we should get a good query plan. For the EXISTS subquery from this bug report:
The sorting is gone but we are not able to merge it to semi-join yet. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The Step #1 patch: http://lists.askmonty.org/pipermail/commits/2019-May/013738.html . sanja could you please review? | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-04 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I've also tried to make EXISTS->IN transformation work when ORDER BY is present but didn't succeed. I'm hitting this assert in Item_exists_subselect::exists2in_processor
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-07 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Discussed with Sanja. The takeaways are:
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-10 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Patch for Step #2: http://lists.askmonty.org/pipermail/commits/2019-May/013758.html | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-10 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Sanja, please review both patches. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oleksandr Byelkin [ 2019-05-14 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
OK to push (both) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2019-05-16 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
rentalhost, thanks for reporting this! The fix is pushed into MariaDB 10.4 tree. It should be trivial to apply the patch to previous MariaDB versions as well. The reason we are fixing this only in 10.4 is that it changes the query plans which may potentially cause a slowdown for somebody. |