[MCOL-1183] Incorrectly formatted file can cause cpimport to crash and leave behind locks
Created: 2018-01-26  Updated: 2019-01-18  Resolved: 2019-01-07

Status: Closed
Project: MariaDB ColumnStore
Component/s: cpimport
Affects Version/s: 1.1.2
Fix Version/s: 1.1.7

Type: Bug
Priority: Major
Reporter: Geoff Montee (Inactive)
Assignee: Jens Röwekamp (Inactive)
Resolution: Cannot Reproduce
Votes: 0
Labels: None

Sprint: 2018-05, 2018-06, 2018-07, 2018-08, 2018-09, 2018-10, 2018-11, 2018-12, 2018-13, 2018-14, 2018-15, 2018-16, 2018-17, 2018-18, 2018-19, 2018-20, 2018-21

Description

A user tried to load some incorrectly formatted tab-delimited files using cpimport.

gdb says that the crash was the following:

Program terminated with signal 11, Segmentation fault.
#0  0x00007f0e40940cf4 in WriteEngine::WEFileReadThread::getNextRow (this=this@entry=0x7fff11bb0018, ifs=...,
    pBuf=pBuf@entry=0x7fff11bb03a4 "...redacted..."..., MaxLen=MaxLen@entry=1048575)
    at /home/builder/mariadb-columnstore-server/mariadb-columnstore-engine/writeengine/splitter/we_filereadthread.cpp:495

After cpimport crashed, ColumnStore also did not clear the table locks originally taken by cpimport, so those had to be cleared with cleartablelock.

Core dumps and log files will be provided privately.



Comments
Comment by Andrew Hutchings (Inactive) [ 2018-01-29 ]

Many thanks for the gdb output. Whilst I haven't been able to mock up a way of reproducing this yet, it does look like the buffer is overrun while scanning for the closing enclosure character after an opening one has been found.
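
For illustration only, the kind of bounded scan that would avoid this can be sketched in Python. This is not the code in we_filereadthread.cpp; the function name and parameters are hypothetical, and max_len simply mirrors the MaxLen argument visible in the backtrace.

```python
def find_closing_enclosure(buf: bytes, start: int, max_len: int,
                           enclosure: int = ord('"')) -> int:
    """Return the index of the next enclosure byte at or after `start`.

    Hypothetical helper: the scan never reads past min(len(buf), max_len),
    and returns -1 if no closing enclosure is found inside that window.
    """
    limit = min(len(buf), max_len)
    i = start
    while i < limit:
        if buf[i] == enclosure:
            return i
        i += 1
    # No closing enclosure within the allowed window: report failure
    # instead of continuing past the end of the buffer.
    return -1
```

A caller would treat -1 as a malformed row and reject it rather than keep reading.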

The user is outputting data using:

mysql --defaults-extra-file=<.cnf file> -h <server> --compress --quick --skip-column-names --default-character-set=utf8 -Ns -D $schema --execute="select * from $table" > $table.txt

Then importing with:

cpimport -E '\"' -e 1 -n1 -s '\t' $schema $table $file

The user has been told to use -B (--batch) on the export instead, so that special characters in the output are escaped.

My guess is we need to create a table that will spit out at least 1MB of data once a quote is in the output so that there is enough data to blow the buffer.
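
A minimal sketch of generating such a test file directly, rather than through a table export, is shown below in Python. It is untested against cpimport and may not trigger the crash; the function name, row contents, and two-column layout are assumptions, and MAX_LEN is taken from the MaxLen value in the gdb output.

```python
MAX_LEN = 1048575  # MaxLen value shown in the gdb backtrace


def write_repro_file(path, min_bytes_after_quote=2 * MAX_LEN):
    """Write a tab-delimited file with one stray quote followed by plain rows.

    Hypothetical helper: returns the number of bytes written after the row
    that contains the unmatched quote.
    """
    written = 0
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        # The only quote in the file; it is never closed.
        f.write('1\tvalue with a stray " quote\n')
        row_id = 2
        # Keep adding quote-free rows until more than the requested amount
        # of data follows the stray quote.
        while written < min_bytes_after_quote:
            row = f"{row_id}\tplain value\n"
            f.write(row)
            written += len(row)  # rows are ASCII, so len() equals bytes
            row_id += 1
    return written
```

The resulting file could then be loaded into a matching two-column table with the cpimport command quoted above.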

Generated at Thu Feb 08 02:26:50 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.