[MDEV-26591] Is MariaDB really caching or reusing “WITH” intermediary results? Created: 2021-09-11  Updated: 2023-07-06

Status: Open
Project: MariaDB Server
Component/s: Optimizer - CTE
Affects Version/s: 10.3.29, 10.5.11
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Gilles Reeves Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: optimizer, performance
Environment:

Linux Debian 10 or 11, dedicated server or local test machine


Attachments: PNG File nAw6E.png    
Issue Links:
Duplicate
is duplicated by MDEV-30017 MariaDB optimizer significantly slowe... Open
PartOf
is part of MDEV-28906 MySQL 8.0 desired compatibility Open
Relates
relates to MDEV-30312 Improvement: make derived_with_keys n... Open

 Description   

I have a MariaDB SQL query (actually very big) using chaining WITH statements like this :

WITH tbl_base AS (
 
    -- Base level sub-query on a table containing about 22000 rows, using LEFT JOIN on 5 other tables, details not relevant here
    SELECT ……… FROM <many things>
 
), tbl_middle AS (
 
    -- Intermediate level sub-query #1 for intermediate computing, details not relevant here
    SELECT ……… FROM tbl_middle …
 
), tbl_states AS (
 
    -- Intermediate level sub-query #2 for intermediate computing, details not relevant here
    SELECT
        ……… AS state,
        ……… AS `elec`
    FROM    tbl_states …
 
), tbl_sums AS (
 
    -- Final grouping query
    SELECT
        `state`,
        SUM(NOT `elec`) AS `vls`,
        SUM(`elec`) AS `vae`,
        count(`state`) AS `all`
    FROM tbl_states
    GROUP BY `state`
    ORDER BY `state`
 
)
 
-- Additional query which could be avoided if left as simple as this but it will not stay as simple as this
SELECT `state`, `vae`, `vls`, `both` FROM tbl_sums

This query makes statistics on two kinds of objects which can each have 8 different states (bad, longrun, lost, maint, ok, removed, run or warn), so the result is a tiny view of only 8 rows and 4 columns (state, vls, vae and both).

On my server it needs about 500 mS to execute, which is acceptable.

Now I want to add rows for creating some combined states, so I replaced the last SELECT by this UNION:

SELECT `state`, `vae`, `vls`, `both` FROM tbl_sums
UNION
SELECT 'all_sta',SUM(`vls`),SUM(`vae`),SUM(`both`) FROM tbl_sums WHERE `state` IN ('ok','warn','bad','maint')
UNION
SELECT 'all_run',SUM(`vls`),SUM(`vae`),SUM(`both`) FROM tbl_sums WHERE `state` IN ('run','longrun')
UNION
SELECT 'operative',SUM(`vls`),SUM(`vae`),SUM(`both`) FROM tbl_sums WHERE `state` IN ('ok','warn','run','longrun')
UNION
SELECT 'unusable',SUM(`vls`),SUM(`vae`),SUM(`both`) FROM tbl_sums WHERE `state` IN ('bad','maint')
UNION
SELECT 'present',SUM(`vls`),SUM(`vae`),SUM(`both`) FROM tbl_sums WHERE `state` IN ('ok','warn','bad','maint','run','longrun')
UNION
SELECT 'missing',SUM(`vls`),SUM(`vae`),SUM(`both`) FROM tbl_sums WHERE `state` IN ('removed','lost')
UNION
SELECT 'total',SUM(`vls`),SUM(`vae`),SUM(`both`) FROM tbl_sums

I works fine, but I am very surprised to observe that the query needs now about 4 seconds to run, thus 8 times the initial 500 mS, likely because I am reusing the last WITH element 8 times.

Apparently MariaDB execute this by computing 8 times the whole sub-query chain instead of temporarily storing the last one and reusing it 8 times, whereas I was expecting it to be able to reuse the last tiny result in not even a millisecond.

If I use EXPLAIN on the last query, I get a result like this (see joined picture) where we could clearly see that the whole sub-queries are effectively explored 8 times.

So I'd like to know if I made something bad, or if I should change some settings in order to have it working better, or if this is simply a bug in MariaDB (versions 10.3.29 or 10.5.11 in this case), as WITH is a quite new feature in MariaDB.

I know I could create a temporary table, but as it is very complicated to execute multiple statements for ONE result in my application, I was precisely expecting to use WITH in order to avoid it.

This is likely irrelevant, but the server is running on Linux Debian 11.

Also, I do that using MyISAM tables. Not tried with others.



 Comments   
Comment by Gilles Reeves [ 2023-05-26 ]

Hello.

I posted this about 20 months ago.
Does someone even read it?

Comment by Daniel Black [ 2023-07-06 ]

I have now Gingko. I thought it sounded familiar. Ouch, an embarrassing lack of optimization.

Generated at Thu Feb 08 09:46:28 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.