[MCOL-1031] create non columnstore table from columnstore select with joins is slow Created: 2017-11-14  Updated: 2019-07-10  Resolved: 2019-07-10

Status: Closed
Project: MariaDB ColumnStore
Component/s: MariaDB Server
Affects Version/s: 1.1.1
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: David Thompson (Inactive) Assignee: Unassigned
Resolution: Won't Fix Votes: 2
Labels: None


 Description   

Generate a large date file and load this to a table loans_table:

mcmsysql test
 
create table loan_test (
loanid varchar(40),
PRODUCTTYPE varchar(120),
Center varchar(120)
) engine=columnstore;
exit;
 
for i in $(seq 0 8981822); do echo "loanid_$i|PRODUCTTYPE_$i|Center"; done > loan_test.tbl
cpimport test loan_test loan_test.tbl

The following create table will be very slow (hours):

create table tempdata_error as select distinct a.center from loan_test a inner join loan_test b on (a.loanid = b.loanid);

Change this to insert select and it will be fast:

create table tempdata_error(Center varchar(120)) engine=MyISAM;
insert into tempdata_error select distinct a.center from loan_test a inner join loan_test b on (a.loanid = b.loanid);

I believe what is happening is that the create table is getting executed as effectively vtable mode 0 so the joins are done as nested loop joins assuming indexes which is very slow. The insert into select is correctly pushed down to columnstore and so executes at a more appropriate speed.



 Comments   
Comment by Thorsten [ 2018-02-01 ]

Hi,
I have similar behavior:
The columstore table is:fact_nvi_kt_de_13680
I likte have only a small extraction form the columstore table into a innoDB result table.

Normaly "create tabla xy as select ..." are very fast but not now.

If I make: create table tmp as select * from (select pzn_id,sum(VAL_36_200) Packungen_MariaDB from nvi_kt_de_13680_201712.fact_nvi_kt_de_13680 f group by f.pzn_id) t where Packungen_MariaDB>0 ;
or I use having then the statement need 8:03 minutes.

if I create the table and use insert

CREATE TABLE `tmp` (`pzn_id` INT(10) UNSIGNED NULL DEFAULT NULL, `Packungen_MariaDB` DECIMAL(32,0) NULL DEFAULT NULL ) COLLATE='utf8_general_ci' ENGINE=InnoDB;
insert into tmp select pzn_id,sum(VAL_36_200) Packungen_MariaDB from fact_nvi_kt_de_13680 f group by f.pzn_id having sum(VAL_36_200)>0;

the statement ist ready after 2 seconds.

I don´t understand this behavior.

And the second thing is:
if I try to use it in a block I get the slow results from the "create table as select" statement.
delimiter $

BEGIN NOT ATOMIC
declare c_crlf varchar(2) default concat(char(13),char(10));

set @tmp_sql ="DROP TABLE IF EXISTS tmp ";
prepare stmt from @tmp_sql ;
select @tmp_sql;
execute stmt;
deallocate prepare stmt;
select now();

set @tmp_sql ="CREATE TABLE `tmp` (`pzn_id` INT(10) UNSIGNED NULL DEFAULT NULL, `Packungen_MariaDB` DECIMAL(32,0) NULL DEFAULT NULL ) COLLATE='utf8_general_ci' ENGINE=InnoDB";
prepare stmt from @tmp_sql ;
select @tmp_sql;
execute stmt;
deallocate prepare stmt;
select now();

set @tmp_sql ="insert into tmp select pzn_id,sum(VAL_36_200) Packungen_MariaDB from fact_nvi_kt_de_13680 f group by f.pzn_id having sum(VAL_36_200)>0";
prepare stmt from @tmp_sql ;
select @tmp_sql;
execute stmt;
deallocate prepare stmt;
select now(); – dauer 8 min

END $

Can anyone explain it?

Thanks for help.

Thorsten
delimiter ;

Comment by Andrew Hutchings (Inactive) [ 2019-07-10 ]

Should go away in 1.4

Generated at Thu Feb 08 02:25:40 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.