[MCOL-3903] Performance regression in INSERT..SELECT Created: 2020-03-26  Updated: 2021-07-08  Resolved: 2021-07-08

Status: Closed
Project: MariaDB ColumnStore
Component/s: MDB Plugin
Affects Version/s: 1.4.3
Fix Version/s: N/A, 1.4.5

Type: Bug Priority: Critical
Reporter: Roman Assignee: Roman
Resolution: Won't Do Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates MCOL-545 Bulk operation produce IO on UM Closed
Relates
relates to MCOL-3929 SELECT INTO..OUTFILE Closed
relates to MDEV-16327 Server doesn't account for engines th... Closed
relates to MDEV-18553 Make LIMIT/OFFSET processing uniform Closed
Sprint: 2020-5, 2020-6, 2020-7

 Description   

*strong text*After integration, INSERT SELECT is no longer using "vtable" like method for executing SELECT part (does it via ROW interface). The fix is to push SELECT into Select Handler (a 1.4 replacement of vtable).

As of 1.4.3 MCS uses table API to query the data running INSERT..SELECT. Mentioned ENT-669 backports 10.5 feature described by CLX-77 into 10.4 Ent. The feature enables Select Handler to execute the query part.

1.4 forces MCS to skip OFFSET processing b/c MDB applies OFFSET on its own.
1.5 contains changes that allows to push OFFSET down to a smart engine.



 Comments   
Comment by Roman [ 2020-03-26 ]

4 QA: INSERT..SELECT part can be tested using working_tpch1/misc/insert_select.sql.

The main expectation will be reduced execution time for INSERT..SELECT queries.

Comment by Roman [ 2020-03-26 ]

Plz review.

Comment by Gagan Goel (Inactive) [ 2020-03-30 ]

For QA: This is a performance improvement patch for insert...select queries, so it should not break any existing functionality.

I would like to mention, the current working_tpch1/misc/insert_select.sql will be failing after this patch. This is however unrelated to this issue, and a separate ticket has been opened for it: https://jira.mariadb.org/browse/MCOL-3912

Comment by Roman [ 2020-04-01 ]

There is a regression in expBetween.sql introduced by the original patch.

Comment by Daniel Lee (Inactive) [ 2020-04-01 ]

Build verified: 1.4.3-5 hot fix, Azure 20200401.6

Execute regression test suite, insert_select.sql

[root@localhost tests]# mysql tpch1 -vvv < /data/qa/shares/insert_select.sql
--------------
drop table if exists myisam_lineitem
--------------

Query OK, 0 rows affected, 1 warning (0.000 sec)

--------------
drop table if exists idb_lineitem
--------------

Query OK, 0 rows affected, 1 warning (0.000 sec)

--------------
create table idb_lineitem (
l_orderkey int,
l_partkey int,
l_suppkey int,
l_linenumber bigint,
l_quantity decimal(12,2),
l_extendedprice decimal(12,2),
l_discount decimal(12,2),
l_tax decimal(12,2),
l_returnflag char (1),
l_linestatus char (1),
l_shipdate date,
l_commitdate date,
l_receiptdate date,
l_shipinstruct char (25),
l_shipmode char (10),
l_comment varchar (44)
) engine=columnstore
--------------

Query OK, 0 rows affected (0.192 sec)

--------------
set autocommit = 0
--------------

Query OK, 0 rows affected (0.000 sec)

--------------
insert into idb_lineitem select * from lineitem limit 1000
--------------

Query OK, 1000 rows affected (0.740 sec)
Records: 1000 Duplicates: 0 Warnings: 0

--------------
select count from idb_lineitem
--------------

----------

count

----------

1000

----------
1 row in set (0.068 sec)

--------------
insert idb_lineitem (l_orderkey, l_partkey) select * from (select l_orderkey, l_partkey from lineitem, orders where l_orderkey=o_orderkey and l_orderkey<1000) sub
--------------

Query OK, 1004 rows affected (0.480 sec)
Records: 1004 Duplicates: 0 Warnings: 0

--------------
select count from idb_lineitem
--------------

----------

count

----------

2004

----------
1 row in set (0.013 sec)

--------------
insert into idb_lineitem (l_shipdate, l_partkey) (select now(), l_partkey from lineitem where l_orderkey <= 1000 order by l_orderkey limit 100)
--------------

Query OK, 100 rows affected, 100 warnings (0.303 sec)
Records: 100 Duplicates: 0 Warnings: 100

--------------
select count from idb_lineitem
--------------

----------

count

----------

2104

----------
1 row in set (0.015 sec)

--------------
rollback
--------------

Query OK, 0 rows affected (0.897 sec)

--------------
create table myisam_lineitem (
l_orderkey int,
l_partkey int,
l_suppkey int,
l_linenumber bigint,
l_quantity decimal(12,2),
l_extendedprice decimal(12,2),
l_discount decimal(12,2),
l_tax decimal(12,2),
l_returnflag char (1),
l_linestatus char (1),
l_shipdate date,
l_commitdate date,
l_receiptdate date,
l_shipinstruct char (25),
l_shipmode char (10),
l_comment varchar (44)
)
--------------

Query OK, 0 rows affected (0.005 sec)

--------------
insert into myisam_lineitem select * from lineitem where l_orderkey <= 500 and l_linenumber = 1
--------------

Query OK, 127 rows affected (0.180 sec)
Records: 127 Duplicates: 0 Warnings: 0

--------------
select count from myisam_lineitem
--------------

----------

count

----------

127

----------
1 row in set (0.000 sec)

--------------
insert into myisam_lineitem (l_orderkey) select count from lineitem group by l_linenumber order by l_linenumber
--------------

Query OK, 7 rows affected (0.356 sec)
Records: 7 Duplicates: 0 Warnings: 0

--------------
select count from myisam_lineitem
--------------

----------

count

----------

134

----------
1 row in set (0.000 sec)

--------------
insert idb_lineitem (l_orderkey, l_shipdate) (select o_orderkey, o_orderdate from orders where o_orderkey <= 400)
--------------

Query OK, 103 rows affected (0.299 sec)
Records: 103 Duplicates: 0 Warnings: 0

--------------
select count from idb_lineitem
--------------

----------

count

----------

103

----------
1 row in set (0.012 sec)

--------------
insert into idb_lineitem (l_orderkey) select o_orderkey from orders, myisam_lineitem where l_orderkey = o_orderkey
--------------

Query OK, 0 rows affected (0.019 sec)
Records: 0 Duplicates: 0 Warnings: 0

--------------
select count from idb_lineitem
--------------

----------

count

----------

103

----------
1 row in set (0.012 sec)

--------------
rollback
--------------

Query OK, 0 rows affected (0.399 sec)

--------------
insert low_priority idb_lineitem (l_orderkey) select l_orderkey from lineitem limit 1000
--------------

Query OK, 1000 rows affected (0.291 sec)
Records: 1000 Duplicates: 0 Warnings: 0

--------------
insert ignore idb_lineitem (l_orderkey) select l_orderkey from lineitem where l_orderkey < 100 order by 1 ON DUPLICATE KEY UPDATE l_orderkey = 5
--------------

Query OK, 105 rows affected (0.323 sec)
Records: 105 Duplicates: 0 Warnings: 0

--------------
select count from idb_lineitem
--------------

----------

count

----------

1105

----------
1 row in set (0.014 sec)

--------------
rollback
--------------

Query OK, 0 rows affected (0.463 sec)

Bye
[root@localhost tests]#

Comment by Roman [ 2020-04-08 ]

Plz review the PR for 1.5

Comment by Gagan Goel (Inactive) [ 2020-04-21 ]

For QA: The patch is also merged in current develop (1.5). Please ensure the same performance gains are also visible in 1.5.

Comment by Daniel Lee (Inactive) [ 2020-05-01 ]

Build tested: 1.4.4-1, 1.5.0-1 source

1.4.4-1

/root/ColumnStore/buildColumnstoreFromGithubSource/server
commit 00abe03ad1da3719e06f7112000a331ee2b6786a
Author: Patrick LeBlanc <43503225+pleblanc1976@users.noreply.github.com>
Date: Wed Apr 29 10:00:54 2020 -0500

/root/ColumnStore/buildColumnstoreFromGithubSource/server/engine
commit 2b67ac7f3537bd4b4d132c8a6c3a53e4cc63f4a1
Merge: beaac49 23d65dc
Author: benthompson15 <ben.thompson.015@gmail.com>
Date: Tue Apr 28 15:40:45 2020 -0500

1.5.0-1

/root/ColumnStore/buildColumnstoreFromGithubSource/server
commit 25eb50d6c002e987e2d240402391549d408c18d9
Author: Alexey Bychko <abychko@gmail.com>
Date: Thu Apr 23 12:36:13 2020 +0700

commit 6ad38ccc28d31a099d052e5de827543808843a3c
Merge: 658abae bb3e76b
Author: benthompson15 <ben.thompson.015@gmail.com>
Date: Fri Apr 24 14:31:09 2020 -0500

1.4.4-1 and 1.5.0-1 has about the same performance.
insert-select timing is about twice of the test results 1.4.3-5 (see above comment). The following is from 1.5.0-1
[centos7:root~]# mysql tpch1 -vvv < /data/qa/shares/insert_select.sql
--------------
drop table if exists myisam_lineitem
--------------

Query OK, 0 rows affected, 1 warning (0.000 sec)

--------------
drop table if exists idb_lineitem
--------------

Query OK, 0 rows affected, 1 warning (0.000 sec)

--------------
create table idb_lineitem (
l_orderkey int,
l_partkey int,
l_suppkey int,
l_linenumber bigint,
l_quantity decimal(12,2),
l_extendedprice decimal(12,2),
l_discount decimal(12,2),
l_tax decimal(12,2),
l_returnflag char (1),
l_linestatus char (1),
l_shipdate date,
l_commitdate date,
l_receiptdate date,
l_shipinstruct char (25),
l_shipmode char (10),
l_comment varchar (44)
) engine=columnstore
--------------

Query OK, 0 rows affected (0.247 sec)

--------------
set autocommit = 0
--------------

Query OK, 0 rows affected (0.000 sec)

--------------
insert into idb_lineitem select * from lineitem limit 1000
--------------

Query OK, 1000 rows affected (1.454 sec)
Records: 1000 Duplicates: 0 Warnings: 0

--------------
select count from idb_lineitem
--------------

----------

count

----------

1000

----------
1 row in set (0.080 sec)

--------------
insert idb_lineitem (l_orderkey, l_partkey) select * from (select l_orderkey, l_partkey from lineitem, orders where l_orderkey=o_orderkey and l_orderkey<1000) sub
--------------

Query OK, 1004 rows affected (0.927 sec)
Records: 1004 Duplicates: 0 Warnings: 0

--------------
select count from idb_lineitem
--------------

----------

count

----------

2004

----------
1 row in set (0.029 sec)

--------------
insert into idb_lineitem (l_shipdate, l_partkey) (select now(), l_partkey from lineitem where l_orderkey <= 1000 order by l_orderkey limit 100)
--------------

Query OK, 100 rows affected, 100 warnings (0.631 sec)
Records: 100 Duplicates: 0 Warnings: 100

--------------
select count from idb_lineitem
--------------

----------

count

----------

2104

----------
1 row in set (0.028 sec)

--------------
rollback
--------------

Query OK, 0 rows affected (1.698 sec)

--------------
create table myisam_lineitem (
l_orderkey int,
l_partkey int,
l_suppkey int,
l_linenumber bigint,
l_quantity decimal(12,2),
l_extendedprice decimal(12,2),
l_discount decimal(12,2),
l_tax decimal(12,2),
l_returnflag char (1),
l_linestatus char (1),
l_shipdate date,
l_commitdate date,
l_receiptdate date,
l_shipinstruct char (25),
l_shipmode char (10),
l_comment varchar (44)
)
--------------

Query OK, 0 rows affected (0.011 sec)

--------------
insert into myisam_lineitem select * from lineitem where l_orderkey <= 500 and l_linenumber = 1
--------------

Query OK, 127 rows affected (0.438 sec)
Records: 127 Duplicates: 0 Warnings: 0

--------------
select count from myisam_lineitem
--------------

----------

count

----------

127

----------
1 row in set (0.003 sec)

--------------
insert into myisam_lineitem (l_orderkey) select count from lineitem group by l_linenumber order by l_linenumber
--------------

Query OK, 7 rows affected (2.006 sec)
Records: 7 Duplicates: 0 Warnings: 0

--------------
select count from myisam_lineitem
--------------

----------

count

----------

134

----------
1 row in set (0.003 sec)

--------------
insert idb_lineitem (l_orderkey, l_shipdate) (select o_orderkey, o_orderdate from orders where o_orderkey <= 400)
--------------

Query OK, 103 rows affected (1.066 sec)
Records: 103 Duplicates: 0 Warnings: 0

--------------
select count from idb_lineitem
--------------

----------

count

----------

103

----------
1 row in set (0.031 sec)

Comment by Roman [ 2020-05-04 ]

Spent most of the day trying this from different angles can't reproduce though.
Here is the diff of 1.4.3-5 and 1.4.4-1

--- /data/3903_1.4.txt	2020-05-04 13:11:05.445557690 +0000
+++ /data/3903_1.4_latest.txt	2020-05-04 14:55:23.687695056 +0000
@@ -31,7 +31,7 @@
 ) engine=columnstore
 --------------
 
-Query OK, 0 rows affected (0.603 sec)
+Query OK, 0 rows affected (0.541 sec)
 
 --------------
 set autocommit = 0
@@ -43,7 +43,7 @@
 insert into idb_lineitem select * from lineitem limit 1000
 --------------
 
-Query OK, 1000 rows affected (0.477 sec)
+Query OK, 1000 rows affected (0.442 sec)
 Records: 1000  Duplicates: 0  Warnings: 0
 
 --------------
@@ -55,13 +55,13 @@
 +----------+
 |     1000 |
 +----------+
-1 row in set (0.032 sec)
+1 row in set (0.020 sec)
 
 --------------
 insert idb_lineitem (l_orderkey, l_partkey) select * from (select l_orderkey, l_partkey from lineitem, orders where l_orderkey=o_orderkey and l_orderkey<1000) sub
 --------------
 
-Query OK, 1004 rows affected (0.321 sec)
+Query OK, 1004 rows affected (0.320 sec)
 Records: 1004  Duplicates: 0  Warnings: 0
 
 --------------
@@ -79,7 +79,7 @@
 insert into idb_lineitem (l_shipdate, l_partkey) (select  now(), l_partkey from lineitem where l_orderkey <= 1000 order by l_orderkey limit 100)
 --------------
 
-Query OK, 100 rows affected, 100 warnings (0.272 sec)
+Query OK, 100 rows affected, 100 warnings (0.263 sec)
 Records: 100  Duplicates: 0  Warnings: 100
 
 --------------
@@ -91,13 +91,13 @@
 +----------+
 |     2104 |
 +----------+
-1 row in set (0.010 sec)
+1 row in set (0.012 sec)
 
 --------------
 rollback
 --------------
 
-Query OK, 0 rows affected (0.507 sec)
+Query OK, 0 rows affected (0.503 sec)
 
 --------------
 create table myisam_lineitem (
@@ -120,13 +120,13 @@
 )
 --------------
 
-Query OK, 0 rows affected (0.011 sec)
+Query OK, 0 rows affected (0.013 sec)
 
 --------------
 insert into myisam_lineitem select * from lineitem where l_orderkey <= 500 and l_linenumber = 1
 --------------
 
-Query OK, 127 rows affected (0.063 sec)
+Query OK, 127 rows affected (0.062 sec)
 Records: 127  Duplicates: 0  Warnings: 0
 
 --------------
@@ -144,7 +144,7 @@
 insert into myisam_lineitem (l_orderkey) select count(*) from lineitem group by l_linenumber order by l_linenumber
 --------------
 
-Query OK, 7 rows affected (0.111 sec)
+Query OK, 7 rows affected (0.121 sec)
 Records: 7  Duplicates: 0  Warnings: 0
 
 --------------
@@ -162,7 +162,7 @@
 insert idb_lineitem (l_orderkey, l_shipdate) (select o_orderkey, o_orderdate from orders where o_orderkey <= 400)
 --------------
 
-Query OK, 103 rows affected (0.250 sec)
+Query OK, 103 rows affected (0.252 sec)
 Records: 103  Duplicates: 0  Warnings: 0
 
 --------------
@@ -174,7 +174,7 @@
 +----------+
 |      103 |
 +----------+
-1 row in set (0.018 sec)
+1 row in set (0.022 sec)
 
 --------------
 insert into idb_lineitem (l_orderkey) select o_orderkey from orders, myisam_lineitem where l_orderkey = o_orderkey
@@ -198,20 +198,20 @@
 rollback
 --------------
 
-Query OK, 0 rows affected (0.195 sec)
+Query OK, 0 rows affected (0.202 sec)
 
 --------------
 insert low_priority idb_lineitem (l_orderkey) select l_orderkey from lineitem limit 1000
 --------------
 
-Query OK, 1000 rows affected (0.255 sec)
+Query OK, 1000 rows affected (0.251 sec)
 Records: 1000  Duplicates: 0  Warnings: 0
 
 --------------
 insert ignore idb_lineitem (l_orderkey) select l_orderkey from lineitem  where l_orderkey < 100 order by 1 ON DUPLICATE KEY UPDATE l_orderkey = 5
 --------------
 
-Query OK, 105 rows affected (0.257 sec)
+Query OK, 105 rows affected (0.252 sec)
 Records: 105  Duplicates: 0  Warnings: 0
 
 --------------
@@ -229,6 +229,6 @@
 rollback
 --------------
 
-Query OK, 0 rows affected (0.270 sec)
+Query OK, 0 rows affected (0.261 sec)
 
 Bye

Here is the diff b/w 1.4.3-5 and current develop

drop table if exists myisam_lineitem
 --------------
 
-Query OK, 0 rows affected, 1 warning (0.000 sec)
+Query OK, 0 rows affected (0.005 sec)
 
 --------------
 drop table if exists idb_lineitem
 --------------
 
-Query OK, 0 rows affected, 1 warning (0.000 sec)
+Query OK, 0 rows affected (0.330 sec)
 
 --------------
 create table idb_lineitem (
@@ -31,7 +31,7 @@
 ) engine=columnstore
 --------------
 
-Query OK, 0 rows affected (0.603 sec)
+Query OK, 0 rows affected (0.522 sec)
 
 --------------
 set autocommit = 0
@@ -43,7 +43,7 @@
 insert into idb_lineitem select * from lineitem limit 1000
 --------------
 
-Query OK, 1000 rows affected (0.477 sec)
+Query OK, 1000 rows affected (0.408 sec)
 Records: 1000  Duplicates: 0  Warnings: 0
 
 --------------
@@ -55,13 +55,13 @@
 +----------+
 |     1000 |
 +----------+
-1 row in set (0.032 sec)
+1 row in set (0.026 sec)
 
 --------------
 insert idb_lineitem (l_orderkey, l_partkey) select * from (select l_orderkey, l_partkey from lineitem, orders where l_orderkey=o_orderkey and l_orderkey<1000) sub
 --------------
 
-Query OK, 1004 rows affected (0.321 sec)
+Query OK, 1004 rows affected (0.309 sec)
 Records: 1004  Duplicates: 0  Warnings: 0
 
 --------------
@@ -73,13 +73,13 @@
 +----------+
 |     2004 |
 +----------+
-1 row in set (0.010 sec)
+1 row in set (0.009 sec)
 
 --------------
 insert into idb_lineitem (l_shipdate, l_partkey) (select  now(), l_partkey from lineitem where l_orderkey <= 1000 order by l_orderkey limit 100)
 --------------
 
-Query OK, 100 rows affected, 100 warnings (0.272 sec)
+Query OK, 100 rows affected, 100 warnings (0.264 sec)
 Records: 100  Duplicates: 0  Warnings: 100
 
 --------------
@@ -91,13 +91,13 @@
 +----------+
 |     2104 |
 +----------+
-1 row in set (0.010 sec)
+1 row in set (0.012 sec)
 
 --------------
 rollback
 --------------
 
-Query OK, 0 rows affected (0.507 sec)
+Query OK, 0 rows affected (0.497 sec)
 
 --------------
 create table myisam_lineitem (
@@ -120,13 +120,13 @@
 )
 --------------
 
-Query OK, 0 rows affected (0.011 sec)
+Query OK, 0 rows affected (0.010 sec)
 
 --------------
 insert into myisam_lineitem select * from lineitem where l_orderkey <= 500 and l_linenumber = 1
 --------------
 
-Query OK, 127 rows affected (0.063 sec)
+Query OK, 127 rows affected (0.068 sec)
 Records: 127  Duplicates: 0  Warnings: 0
 
 --------------
@@ -144,7 +144,7 @@
 insert into myisam_lineitem (l_orderkey) select count(*) from lineitem group by l_linenumber order by l_linenumber
 --------------
 
-Query OK, 7 rows affected (0.111 sec)
+Query OK, 7 rows affected (0.094 sec)
 Records: 7  Duplicates: 0  Warnings: 0
 
 --------------
@@ -162,7 +162,7 @@
 insert idb_lineitem (l_orderkey, l_shipdate) (select o_orderkey, o_orderdate from orders where o_orderkey <= 400)
 --------------
 
-Query OK, 103 rows affected (0.250 sec)
+Query OK, 103 rows affected (0.248 sec)
 Records: 103  Duplicates: 0  Warnings: 0
 
 --------------
@@ -174,13 +174,13 @@
 +----------+
 |      103 |
 +----------+
-1 row in set (0.018 sec)
+1 row in set (0.014 sec)
 
 --------------
 insert into idb_lineitem (l_orderkey) select o_orderkey from orders, myisam_lineitem where l_orderkey = o_orderkey
 --------------
 
-Query OK, 0 rows affected (0.009 sec)
+Query OK, 0 rows affected (0.007 sec)
 Records: 0  Duplicates: 0  Warnings: 0
 
 --------------
@@ -198,20 +198,20 @@
 rollback
 --------------
 
-Query OK, 0 rows affected (0.195 sec)
+Query OK, 0 rows affected (0.193 sec)
 
 --------------
 insert low_priority idb_lineitem (l_orderkey) select l_orderkey from lineitem limit 1000
 --------------
 
-Query OK, 1000 rows affected (0.255 sec)
+Query OK, 1000 rows affected (0.254 sec)
 Records: 1000  Duplicates: 0  Warnings: 0
 
 --------------
 insert ignore idb_lineitem (l_orderkey) select l_orderkey from lineitem  where l_orderkey < 100 order by 1 ON DUPLICATE KEY UPDATE l_orderkey = 5
 --------------
 
-Query OK, 105 rows affected (0.257 sec)
+Query OK, 105 rows affected (0.286 sec)
 Records: 105  Duplicates: 0  Warnings: 0
 
 --------------
@@ -229,6 +229,6 @@
 rollback
 --------------
 
-Query OK, 0 rows affected (0.270 sec)
+Query OK, 0 rows affected (0.264 sec)
 
 Bye

Comment by Roman [ 2020-05-04 ]

I need to confess I've seen almost the same 2x timings running insert_select.sql today and it magically disappears on its own so this must be a cloud.

Comment by Daniel Lee (Inactive) [ 2020-05-05 ]

I tested this a VM on my local server, the same way as I did other tests for this ticket. I made sure the test VM is he only VM exist on the host.

Generated at Thu Feb 08 02:46:19 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.