[MCOL-545] Bulk operation produces IO on UM Created: 2017-02-06  Updated: 2020-04-07  Resolved: 2020-04-02

Status: Closed
Project: MariaDB ColumnStore
Component/s: cpimport
Affects Version/s: 1.0.6
Fix Version/s: N/A

Type: New Feature Priority: Major
Reporter: VAROQUI Stephane Assignee: Andrew Hutchings (Inactive)
Resolution: Duplicate Votes: 0
Labels: performance

Issue Links:
Duplicate
is duplicated by MCOL-3903 Performance regression in INSERT..SELECT Closed

 Description   

An INSERT ... SELECT from a ColumnStore table into a copy of that table is at least 2 times slower than the equivalent mysql -q -e "select * from table " | cpimport

We see heavy I/O usage on the UM that does not occur with the piping method.



 Comments   
Comment by Andrew Hutchings (Inactive) [ 2017-02-06 ]

Internally this does the following:

  1. MariaDB tells ColumnStore to start a bulk write operation (INSERT_SELECT)
  2. ColumnStore Engine spins up an instance of cpimport in piped mode 1.
  3. MariaDB sends the data to the storage engine row by row (blocking calls)
  4. The ColumnStore engine converts this binary format into text CSV
  5. The CSV row is piped into cpimport
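
The steps above can be sketched as a single-threaded loop. This is only an illustration in Python, not the engine's actual code: the names BlockingWriter, write_row, row_to_csv, and the sink object standing in for the pipe to cpimport are all assumptions made for the example.

```python
import csv
import io


def row_to_csv(row):
    """Convert one row (a tuple of values) to a CSV line (step 4)."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(row)
    return buf.getvalue()


class BlockingWriter:
    """Hypothetical stand-in for the engine side of the pipe to cpimport."""

    def __init__(self, sink):
        # Any object with write(); stands in for cpimport's stdin.
        self.sink = sink

    def write_row(self, row):
        # Conversion and the pipe write both happen inside this call, so the
        # caller cannot hand over the next row until this one is finished.
        self.sink.write(row_to_csv(row))


sink = io.StringIO()
writer = BlockingWriter(sink)
for row in [(1, "a"), (2, "b,c")]:  # step 3: rows arrive one at a time
    writer.write_row(row)
```

Every call to write_row here pays the full conversion and write cost before it returns, which is the blocking behaviour described in this comment.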

The problem is that converting each row from binary to CSV blocks the engine from getting the next row from MariaDB, which causes the performance difference. This could be solved with a FIFO buffer: the write_row call stores the binary row data in the buffer, and a separate thread does the CSV conversion and pipes the result into cpimport.
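
A minimal sketch of the FIFO idea, assuming a bounded queue and one worker thread. The class name BufferedWriter, the max_rows parameter, and the sink object standing in for cpimport's stdin are hypothetical; the real engine is not written in Python, and in Python itself the interpreter lock would limit the gain for CPU-bound conversion, so this shows the structure only.

```python
import csv
import io
import queue
import threading

_DONE = object()  # sentinel marking the end of the bulk operation


class BufferedWriter:
    """write_row only enqueues; a worker thread converts and writes."""

    def __init__(self, sink, max_rows=1024):
        self.sink = sink  # stands in for the pipe to cpimport
        self.fifo = queue.Queue(maxsize=max_rows)
        self.worker = threading.Thread(target=self._drain)
        self.worker.start()

    def write_row(self, row):
        # Returns as soon as the row is queued; blocks only if the FIFO is full.
        self.fifo.put(row)

    def _drain(self):
        out = csv.writer(self.sink, lineterminator="\n")
        while True:
            row = self.fifo.get()
            if row is _DONE:
                break
            out.writerow(row)  # CSV conversion happens off the caller's thread

    def close(self):
        # Signal end of input and wait until every queued row is written.
        self.fifo.put(_DONE)
        self.worker.join()


fifo_sink = io.StringIO()
buffered = BufferedWriter(fifo_sink)
for row in [(1, "a"), (2, "b,c")]:
    buffered.write_row(row)
buffered.close()
```

The bounded queue keeps memory use capped: if the worker falls behind, write_row blocks again, but only after max_rows rows have accumulated.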

A change such as this should probably wait until we have a cpimport API so we can clean up this code. Ideally the API would support a binary format as well as CSV so that the double conversion (binary->CSV->binary) isn't required.
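
To illustrate the double conversion, here is a toy sketch in Python. The fixed-width row layout (one 32-bit and one 64-bit integer) and the function names via_csv and direct are assumptions for the example and do not reflect ColumnStore's actual row format or any cpimport API.

```python
import struct

# Hypothetical fixed-width row: int32 id, int64 amount (little-endian).
ROW = struct.Struct("<iq")


def via_csv(raw):
    """binary -> CSV -> binary: the round trip the current path performs."""
    row_id, amount = ROW.unpack(raw)
    line = f"{row_id},{amount}\n"          # engine side: binary to CSV text
    a, b = line.strip().split(",")         # import side: parse the CSV text
    return ROW.pack(int(a), int(b))        # import side: back to binary


def direct(raw):
    """What a binary-capable API would allow: pass the bytes through."""
    return raw


raw = ROW.pack(7, 1234567890123)
same = via_csv(raw) == direct(raw)
```

Both paths yield the same bytes for this row; the CSV path simply spends extra work formatting and re-parsing text to get there.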

In the meantime there is a mode which uses direct bulk insert instead of cpimport. I've not tried this; it might be slightly faster (probably not), but there might be dragons. The system variable to toggle it is infinidb_use_import_for_batchinsert.

Comment by David Hall (Inactive) [ 2017-02-06 ]

Direct bulk insert is significantly slower in almost every case except trivial ones. But no dragons that I know of.

Comment by Todd Stoffel (Inactive) [ 2020-04-02 ]

https://jira.mariadb.org/browse/MCOL-3903
