[MCOL-4777] Slow query on ColumnStore Created: 2021-06-25  Updated: 2022-08-29

Status: Open
Project: MariaDB ColumnStore
Component/s: ?
Affects Version/s: None
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Sebastien GIRAUD Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None


 Description   

Hi one of our customer is facing slowness when performing some particular queries on ColumnStore 1.5 .

Here are these queries :
Query 1:
=======
DROP TABLE if EXISTS `fait_facturation` ;
1 queries executed, 1 success, 0 errors, 0 warnings
Query: DROP TABLE if EXISTS `fait_facturation`
0 row(s) affected
Execution Time : 0.578 sec
Transfer Time : 1.024 sec
Total Time : 1.603 sec

Query 2:
=======
CREATE TABLE `fait_facturation` (
`FACTURATION_ID` varchar(20) DEFAULT NULL,
`PERIODE_ETP` varchar(7) DEFAULT NULL,
`CODE_ETT_FACTURATION` varchar(50) DEFAULT NULL) engine=columnstore;
1 queries executed, 1 success, 0 errors, 0 warnings
Query: CREATE TABLE `fait_facturation` ( `FACTURATION_ID` varchar(20) DEFAULT NULL, `PERIODE_ETP` varchar(7) DEFAULT NULL, `CODE_ETT_FA...

0 row(s) affected
Execution Time : 0.659 sec
Transfer Time : 1.024 sec
Total Time : 1.683 sec

Query 3:
=======
INSERT INTO `fait_facturation`(
with recursive series as (
select 1 as FACTURATION_ID union all
select FACTURATION_ID +1 as id from series
where FACTURATION_ID < 500000 )
select FACTURATION_ID,concat(CHAR(65+MOD(ROUND(RAND()*100),26)),CHAR(65+MOD(ROUND(RAND()*100),26)),CHAR(65+MOD(ROUND(RAND()*100),26)))as PERIODE_ETP ,
SUBSTRING(REPLACE(REPLACE(REPLACE( TO_BASE64(MD5(RAND())), '=',''),'+',''),'/',''), 2, 40) AS CODE_ETT_FACTURATION from series);

1 queries executed, 1 success, 0 errors, 0 warnings
Query: INSERT INTO `fait_facturation`( with recursive series as ( select 1 as FACTURATION_ID union all select FACTURATION_ID +1 as id f...
500000 row(s) affected
Execution Time : 3.421 sec
Transfer Time : 0 sec
Total Time : 3.422 sec

Query 4:
========
SELECT PERIODE_ETP, CODE_ETT_FACTURATION, COUNT(1), COUNT(FACTURATION_ID)
FROM fait_facturation
GROUP BY PERIODE_ETP,CODE_ETT_FACTURATION,FACTURATION_ID
ORDER BY PERIODE_ETP DESC ,CODE_ETT_FACTURATION,FACTURATION_ID;

Exec: 2,504s Total: 4,132s 500000 rows

Query 4 bis (very similar to the previous one)
=========
SELECT PERIODE_ETP, CODE_ETT_FACTURATION, COUNT(1), COUNT(DISTINCT FACTURATION_ID)
FROM fait_facturation
GROUP BY 1,2
ORDER BY 1 DESC ,2;

On CS 1.2 the query is executed in 47 secondes and on CS 10.5 the query is executed in 98s



 Comments   
Comment by Sebastien GIRAUD [ 2021-09-21 ]

Also the following Jira : https://jira.mariadb.org/browse/MCOL-4759

Comment by Roman [ 2022-02-14 ]

sebastien.giraud Could I ask you to collect calsetrace/calgettrace runtime statistics for the query taken with MCS 1.2 and 10.5?

Comment by Roman [ 2022-08-29 ]

sebastien.giraud The actual workaround will be using a latin1 encoding and latin1_bin_nopad collation to query the table. This will give the speed of 1.2. In many cases the current default uses utf8 encoding and corresponding collations that makes GROUP BY slower. This is an implicit behavior change that we faced introducing collation support.

Generated at Thu Feb 08 02:52:55 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.