Here I copy/paste the message sent to our Google group.
While this didn't always appear to be the case, the performance of percentile functions against larger data sets seems to have deteriorated to the point where it is pretty much unusable. I tried to do some searching and couldn't find much, so I wanted to raise it here.
A query like the following on a row count of ~10k might take 5 seconds, maybe less:
MariaDB [hop_2021]> SELECT DISTINCT '2021-05-25' as dateval, t.*, count FROM (
-> SELECT PERCENTILE_CONT(.10) WITHIN GROUP (ORDER BY host_latency) OVER () as p1,
-> PERCENTILE_CONT(.5) WITHIN GROUP (ORDER BY host_latency) OVER () as median,
-> PERCENTILE_CONT(.90) WITHIN GROUP (ORDER BY host_latency) OVER () as p90,
-> PERCENTILE_CONT(.99) WITHIN GROUP (ORDER BY host_latency) OVER () as p99
-> FROM tblpivot_apps
-> WHERE host_latency > 0
-> AND capture_date = '2021-05-25'
-> ) t;
When the row count reaches the ~500k range, the query seems to hang indefinitely.
- Resources on the host appear fine, though I did notice increased load.
- Columnstore logs are clean
- Other window functions like quartiles work just fine
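For comparison, here is the sort of quartile query that still runs quickly on the same ColumnStore table. This is a hedged sketch using NTILE as a stand-in for the quartile logic I actually ran, against the same assumed columns:

```sql
-- Sketch only: NTILE(4) assigns each row to one of four quartile buckets.
-- Same table and filter as the PERCENTILE_CONT query above.
SELECT host_latency,
       NTILE(4) OVER (ORDER BY host_latency) AS quartile
FROM tblpivot_apps
WHERE host_latency > 0
  AND capture_date = '2021-05-25';
```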
I then converted the table to InnoDB, re-ran the query on the larger set, and it returned in seconds. I am happy to go that route, but I just wanted to understand why.
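For anyone wanting to reproduce the workaround, the conversion was just a standard engine change (assuming no ColumnStore-specific options are involved):

```sql
-- Convert the table from ColumnStore to InnoDB.
-- This rewrites the whole table, so expect it to take a while on large data sets.
ALTER TABLE tblpivot_apps ENGINE=InnoDB;
```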
I was also tempted to run some regression tests to see if it was tied to a version, but I don't quite have the time to do so, haha. Any insight on this topic would be greatly appreciated! Is it expected that ColumnStore is not great at handling these types of functions?
This server is running ->
mariadb-server: 10.5.8 w/ columnstore plugin