[MDEV-33831] Group by inside subquery / CTE: Very bad row estimate - Jira

XML

Word

Printable

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Duplicate
Affects Version/s: 11.3.2
Fix Version/s: 11.4.6
Component/s: Optimizer, Optimizer - CTE
Labels:
- optimizer-feature
- performance
Environment:
Using the official docker image with default configuration (other than credentials)

Description

For my work I am using an absolutely massive query. This query has been expanded over time, but lately the query is becoming extremely slow when I try to join the result on another table (one-to-one relation) to the point of not finishing in 1 hour+. Without that join, the query is running in ~2.5 minutes. Interestingly, the query has similar performance if instead of joining I add the column from the table that I need via a correlated subquery. This points to the optimizer using an inefficient query plan, and using a correlated subquery forces a much better plan.

So I did some digging, and it looks like the row estimate is off by a LOT. I have ensured the table statistics are updated, so that part can be ruled out. The bug that I am seeing seems to be present in multiple different CTEs used within our query, and this issue is compounding whenever I try to join the result of a CTE on something else, as each CTE has a row estimate that is way too high. In the end, our final query has a row estimate of quadrillions of rows, while in reality it has a much more modest ~150k rows.

I have extracted a specific very simple CTE that exhibits the issue, together with the dataset that triggers the issue. Nothing interesting is happening at all in this query because I have simplified it down as much as possible while still exhibiting the issue. Obviously the real query is using more fields and the CTE is used usefully instead of simply SELECTing from the CTE.

ANALYZE

WITH demo AS

	SELECT

	u.id

	FROM users u

	LEFT JOIN activity_users au

		ON au.user_id = u.id

	LEFT JOIN activity_user_signoffs aus

		ON aus.activity_user_id = au.id

	GROUP BY u.id

SELECT * FROM demo

Running this query, I get the result shown on the "bad_estimate" image. It estimates 544k rows, but in reality it's only 23k. Interestingly, when I remove the outer SELECT, the estimate is accurate:

ANALYZE

SELECT

u.id

FROM users u

LEFT JOIN activity_users au

	ON au.user_id = u.id

LEFT JOIN activity_user_signoffs aus

	ON aus.activity_user_id = au.id

GROUP BY u.id

The interesting second observation, is that I can keep the CTE and get an accurate estimate if I remove the GROUP BY statement in the CTE:

ANALYZE

WITH demo AS

	SELECT

	u.id

	FROM users u

	LEFT JOIN activity_users au

		ON au.user_id = u.id

	LEFT JOIN activity_user_signoffs aus

		ON aus.activity_user_id = au.id

SELECT * FROM demo

Obviously this is just a toy example, and the bad estimates don't actually slow anything down as the query plan is identical. However, when you were to use this CTE to join it with other CTEs and/or tables, then this wrong estimate can make the optimizer pick horribly inefficient query plans. Attached you will find the .sql database dump that can be used to reproduce this issue.

Please let me know if there is anything else I can provide to help

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

bad_estimate-3.png
2024-04-04 12:50
47 kB
Jaap Buurman
dump-database-202404041444.sql
2024-04-04 12:51
3.31 MB
Jaap Buurman
join-cardinality.png
2024-04-24 15:51
6 kB
Galina Shalygina
selectivity of the predicate.png
2024-04-24 15:56
23 kB
Galina Shalygina

Issue Links

relates to

MDEV-30877 Wrong cardinality estimation for the derived table leads to slow plan with LATERAL DERIVED

Closed

Activity

People

Assignee:: Sergei Petrunia

Reporter:: Jaap Buurman

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2024-04-04 12:51

Updated:: 2025-02-11 08:29

Resolved:: 2025-02-11 08:29

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.