[MCOL-4328] New segments files created as the result of bulk insertion with cpimport belongs to root Created: 2020-09-28  Updated: 2021-09-17  Resolved: 2020-10-09

Status: Closed
Project: MariaDB ColumnStore
Component/s: cpimport
Affects Version/s: 1.5.3
Fix Version/s: 5.4.1

Type: Bug Priority: Blocker
Reporter: Roman Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Problem/Incident
causes MCOL-4870 The -U option is missing from cpimpor... Open
is caused by MCOL-4012 Enable ColumnStore to run as a non ro... Closed

 Description   

The files created by bulk insertions with cpimport belongs to root so any read operations fails making tables unusable.



 Comments   
Comment by Roman [ 2020-10-01 ]

Plz review.

Comment by Roman [ 2020-10-03 ]

Plz review.

Comment by Roman [ 2020-10-05 ]

4 QA. One can use 866 build before the nightly will be available.

cpimport/.bin now has an additional option to set segment and dictionary files owner -U. mysql is the owner by default. All auxilary files, e.g. .job and .log files belongs to the user that runs cpimport.

One can run cpimport/.bin using a non-privileged user,e.g. mysql but this operation mode puts an additional requirenment, non-privileged user must have privileges to create files in cwd of the cpimport.

To test this issue one should upload data set with cardinality >= 16,000,000 records couple times to test that created files and directories belongs to mysql.
I would test that cpimport in mode 3 operates fine also.

Comment by Daniel Lee (Inactive) [ 2020-10-05 ]

Build tested: 5.4.1-1 (Drone #868)

cpimport mode 2 also not working

I executed the following cpimport using root root from PM1.

[centos8:root~]# /usr/bin/cpimport -m 2 mytest lineitem -l lineitem.tbl
2020-10-05 19:32:19 (8459) INFO : Running distributed import (mode 2) on all PMs...
2020-10-05 19:32:21 (8459) ERR : Received a Cpimport Failure from PM2
2020-10-05 19:32:21 (8459) INFO : Please verify error log files in PM2
2020-10-05 19:32:21 (8459) INFO : Canceling outstanding cpimports
2020-10-05 19:32:21 (8459) ERR : Received a Cpimport Failure from PM3
2020-10-05 19:32:21 (8459) INFO : Please verify error log files in PM3
2020-10-05 19:32:21 (8459) INFO : Canceling outstanding cpimports
2020-10-05 19:32:21 (8459) ERR : Received a Cpimport Failure from PM1
2020-10-05 19:32:21 (8459) INFO : Please verify error log files in PM1
2020-10-05 19:32:21 (8459) INFO : Table mytest.lineitem: (OID-3017) was NOT successfully loaded.
2020-10-05 19:32:21 (8459) INFO : Bulk load completed, total run time : 2.13427 seconds

err.log entries

this is in the err.log file: Oct 5 19:04:47 centos-8 cpimport.bin[6700]: 47.651458 |0|0|0| E 34 CAL0087: BulkLoad Error: Error gaining r/w access to import path .: Permission denied
Oct 5 19:04:49 centos-8 writeengineserver[4552]: 49.611008 |0|0|0| E 32 CAL0000: 7738 : cpimport exit on failure (signal -1)

Logged as root user, all directories in the following path:
/var/lib/columnstore/data/bulk/data/import/lineitem has "drwxr-xr-x" permission and mysql:mysql ownership.

In /var/lib/columnstore, I did "chmod -R 777 data", cpimport still failed.

I noticed cpimport.bin on remote nodes runs under the mysql user.

Comment by Daniel Lee (Inactive) [ 2020-10-05 ]

Build tested: 5.4.1-1 (Drone #868)

As reported by another engineer, cpimport on a cluster using S3 failed. It failed on both single and multi -node stacks.

[centos8:root~]# /usr/bin/cpimport mytest lineitem /data/qa/source/dbt3/1g/lineitem.tbl
2020-10-05 21:27:47 (6768) INFO : Running distributed import (mode 1) on all PMs...
2020-10-05 21:27:49 (6768) ERR : Received a Cpimport Failure from PM2
2020-10-05 21:27:49 (6768) INFO : Please verify error log files in PM2
2020-10-05 21:27:49 (6768) INFO : Canceling outstanding cpimports
2020-10-05 21:27:49 (6768) ERR : Received a Cpimport Failure from PM3
2020-10-05 21:27:49 (6768) INFO : Please verify error log files in PM3
2020-10-05 21:27:49 (6768) INFO : Canceling outstanding cpimports
2020-10-05 21:27:49 (6768) ERR : Received a Cpimport Failure from PM1
2020-10-05 21:27:49 (6768) INFO : Please verify error log files in PM1
2020-10-05 21:27:49 (6768) INFO : Table mytest.lineitem: (OID-3017) was NOT successfully loaded.
2020-10-05 21:27:49 (6768) INFO : Bulk load completed, total run time : 3.2574 seconds

err.log

[centos8:root~]# cat err.log
Oct 5 21:23:21 centos-8 cpimport.bin[6494]: 21.466442 |0|0|0| E 34 CAL0087: BulkLoad Error: Error calling chown() with uid 991 and gid 988 with the file /var/lib/columnstore/data1/bulkRollback/3017.tmp with errno 2
Oct 5 21:23:21 centos-8 cpimport.bin[6494]: 21.466514 |0|0|0| E 34 CAL0087: BulkLoad Error: Error in pre-processing the job file for table mytest.lineitem
Oct 5 21:23:21 centos-8 writeengineserver[6004]: 21.474489 |0|0|0| E 32 CAL0000: pushing data : PIPE error .........Broken pipe
Oct 5 21:23:23 centos-8 writeengineserver[6004]: 23.380798 |0|0|0| E 32 CAL0000: 6567 : cpimport exit on failure (signal -1)
Oct 5 21:23:23 centos-8 writeenginesplit[6459]: 23.381700 |0|0|0| E 33 CAL0000: #033[0;31mReceived a Cpimport Failure from PM1#033[0m
Oct 5 21:23:23 centos-8 writeenginesplit[6459]: 23.381808 |0|0|0| E 33 CAL0087: BulkLoad Error: #033[0;31mReceived a Cpimport Failure from PM1#033[0m
Oct 5 21:23:23 centos-8 writeenginesplit[6459]: 23.381875 |0|0|0| E 33 CAL0000: #033[0;31mReceived a Cpimport Failure from PM2#033[0m
Oct 5 21:23:23 centos-8 writeenginesplit[6459]: 23.381915 |0|0|0| E 33 CAL0087: BulkLoad Error: #033[0;31mReceived a Cpimport Failure from PM2#033[0m
Oct 5 21:23:23 centos-8 writeenginesplit[6459]: 23.381955 |0|0|0| E 33 CAL0000: #033[0;31mReceived a Cpimport Failure from PM3#033[0m
Oct 5 21:23:23 centos-8 writeenginesplit[6459]: 23.381991 |0|0|0| E 33 CAL0087: BulkLoad Error: #033[0;31mReceived a Cpimport Failure from PM3#033[0m
Oct 5 21:23:35 centos-8 cpimport.bin[6590]: 35.288196 |0|0|0| E 34 CAL0087: BulkLoad Error: Error calling chown() with uid 991 and gid 988 with the file /var/lib/columnstore/data1/bulkRollback/3004.tmp with errno 2
Oct 5 21:23:35 centos-8 cpimport.bin[6590]: 35.288307 |0|0|0| E 34 CAL0087: BulkLoad Error: Error in pre-processing the job file for table mytest.orders
Oct 5 21:23:35 centos-8 writeengineserver[6004]: 35.298894 |0|0|0| E 32 CAL0000: pushing data : PIPE error .........Broken pipe
Oct 5 21:23:35 centos-8 writeengineserver[6004]: 35.602644 |0|0|0| E 32 CAL0000: 6371 : cpimport exit on failure (signal -1)
Oct 5 21:23:35 centos-8 writeenginesplit[6560]: 35.605814 |0|0|0| E 33 CAL0000: #033[0;31mReceived a Cpimport Failure from PM1#033[0m
Oct 5 21:23:35 centos-8 writeenginesplit[6560]: 35.606379 |0|0|0| E 33 CAL0087: BulkLoad Error: #033[0;31mReceived a Cpimport Failure from PM1#033[0m
Oct 5 21:23:35 centos-8 writeenginesplit[6560]: 35.641390 |0|0|0| E 33 CAL0000: #033[0;31mReceived a Cpimport Failure from PM2#033[0m
Oct 5 21:23:35 centos-8 writeenginesplit[6560]: 35.641746 |0|0|0| E 33 CAL0087: BulkLoad Error: #033[0;31mReceived a Cpimport Failure from PM2#033[0m
Oct 5 21:23:35 centos-8 writeenginesplit[6560]: 35.643410 |0|0|0| E 33 CAL0000: #033[0;31mReceived a Cpimport Failure from PM3#033[0m
Oct 5 21:23:35 centos-8 writeenginesplit[6560]: 35.643806 |0|0|0| E 33 CAL0087: BulkLoad Error: #033[0;31mReceived a Cpimport Failure from PM3#033[0m
Oct 5 21:27:47 centos-8 cpimport.bin[6796]: 47.501241 |0|0|0| E 34 CAL0087: BulkLoad Error: Error calling chown() with uid 991 and gid 988 with the file /var/lib/columnstore/data1/bulkRollback/3017.tmp with errno 2
Oct 5 21:27:47 centos-8 cpimport.bin[6796]: 47.501312 |0|0|0| E 34 CAL0087: BulkLoad Error: Error in pre-processing the job file for table mytest.lineitem
Oct 5 21:27:47 centos-8 writeengineserver[6004]: 47.518460 |0|0|0| E 32 CAL0000: pushing data : PIPE error .........Broken pipe
Oct 5 21:27:49 centos-8 writeenginesplit[6768]: 49.893161 |0|0|0| E 33 CAL0000: #033[0;31mReceived a Cpimport Failure from PM2#033[0m
Oct 5 21:27:49 centos-8 writeenginesplit[6768]: 49.893758 |0|0|0| E 33 CAL0087: BulkLoad Error: #033[0;31mReceived a Cpimport Failure from PM2#033[0m
Oct 5 21:27:49 centos-8 writeengineserver[6004]: 49.927031 |0|0|0| E 32 CAL0000: 4508 : cpimport exit on failure (signal -1)
Oct 5 21:27:49 centos-8 writeenginesplit[6768]: 49.927837 |0|0|0| E 33 CAL0000: #033[0;31mReceived a Cpimport Failure from PM3#033[0m
Oct 5 21:27:49 centos-8 writeenginesplit[6768]: 49.929069 |0|0|0| E 33 CAL0087: BulkLoad Error: #033[0;31mReceived a Cpimport Failure from PM3#033[0m
Oct 5 21:27:49 centos-8 writeenginesplit[6768]: 49.931473 |0|0|0| E 33 CAL0000: #033[0;31mReceived a Cpimport Failure from PM1#033[0m
Oct 5 21:27:49 centos-8 writeenginesplit[6768]: 49.932019 |0|0|0| E 33 CAL0087: BulkLoad Error: #033[0;31mReceived a Cpimport Failure from PM1#033[0m
[centos8:root~]#

Comment by Roman [ 2020-10-06 ]

4QA Mode 2 works for me. It is important to explicitly set a source directory that is owned by mysql user on all remote nodes. Here is the example if cpimport command and /data belongs to mysql user. /data/output1.txt is the source file. This must be documented for sure.:
cpimport -d3 -m2 test cs1 -f /data/ -l output1.txt

root@ip-172-31-4-84:/data# cpimport -d3 -m2 test cs1 -f /data/ -l output1.txt
 
Debug level set to 3
ModuleType pm
InstallType 2
PM Count 2
LogName : /var/lib/columnstore/data/bulk/log/cpimport_1006120230_25798.log
ErrLogName : /var/lib/columnstore/data/bulk/log/cpimport_1006120230_25798.err
Adding PmId - 1
Adding PmId - 2
Table OID = 3006
aWaitPeriod = 10
Table Lock = 24
PmId = 1
PmId = 2
setting connection to moduleid 1
WEServer : pm1_WriteEngineServer 172.31.4.84
setting connection to moduleid 2
WEServer : pm2_WriteEngineServer 172.31.14.224
2020-10-06 12:02:30 (25798) INFO : Running distributed import (mode 2) on all PMs...
Inside WESplitterApp::processMessages() Mode = 2
CPImport cmd line - cpimport.bin -f /data/ -d 3 -R /tmp/columnstore_tmp_files/BrmRpt090612023025798.rpt -m 2 -P pm1-25798 -T SYSTEM -u6860fd33-d170-4d4a-9d4b-8bacbbd2cedc test cs1 output1.txt
BrmReport FileName - /tmp/columnstore_tmp_files/BrmRpt090612023025798.rpt
Received a Keep Alive from 1
Received a Keep Alive from 2
No of DBRoots in PM2 = 1
On Start CPI response arrived 2
No of DBRoots in PM1 = 1
On Start CPI response arrived 1
Received a Keep Alive from 1
Received a Keep Alive from 2
Received a Keep Alive from 1
Received a Keep Alive from 2
Received a Keep Alive from 1
Received a Keep Alive from 2
Received a Keep Alive from 1
Received a Keep Alive from 2
Received a Keep Alive from 1
Received a Keep Alive from 2
Received a Keep Alive from 1
Received a BRM-Report from 1
BRM Report length : 456
BRM Report value : CP: 381952 2147483594 229 -1 9 1
BRM Report value : CP: 349184 2147483516 626 -1 9 0
BRM Report value : CP: 447488 2147482359 277 -1 9 1
BRM Report value : CP: 398336 2147482757 547 -1 9 1
BRM Report value : CP: 357376 -9223372036854775808 -9223372036854775808 -1 24 0
BRM Report value : HWM: 3007 0 1 16383
BRM Report value : HWM: 3007 0 2 16383
BRM Report value : HWM: 3007 1 0 8191
BRM Report value : HWM: 3007 1 2 5916
BRM Report value : HWM: 3008 0 1 16383
BRM Report value : HWM: 3008 0 2 16383
BRM Report value : HWM: 3008 1 0 8191
BRM Report value : HWM: 3008 1 2 5916
BRM Report value : ROWS: 24000000 24000000
BRM Report value : DATA: 0 9 3007 0
BRM Report value : DATA: 1 24 3008 0
BRMReport from 2 still not received
Still Brm Reports to come in!!
Received a Cpimport Pass from PM1
Received a Keep Alive from 2
Received a Keep Alive from 1
Received a Keep Alive from 2
Received a BRM-Report from 2
BRM Report length : 456
BRM Report value : CP: 365568 2147483594 229 -1 9 1
BRM Report value : CP: 332800 2147483516 626 -1 9 0
BRM Report value : CP: 472064 2147482359 277 -1 9 1
BRM Report value : CP: 422912 2147482757 547 -1 9 1
BRM Report value : CP: 340992 -9223372036854775808 -9223372036854775808 -1 24 0
BRM Report value : HWM: 3007 0 0 16383
BRM Report value : HWM: 3007 0 3 16383
BRM Report value : HWM: 3007 1 1 8191
BRM Report value : HWM: 3007 1 3 5916
BRM Report value : HWM: 3008 0 0 16383
BRM Report value : HWM: 3008 0 3 16383
BRM Report value : HWM: 3008 1 1 8191
BRM Report value : HWM: 3008 1 3 5916
BRM Report value : ROWS: 24000000 24000000
BRM Report value : DATA: 0 9 3007 0
BRM Report value : DATA: 1 24 3008 0
Inside updateCPAndHWMInBRM()
Finished prepareCasualPartitionInfo()!!
prepareHighWaterMarkInfo() finished
Updating High Water Mark Successful!!
 
BRM updated successfully 
	Successfully changed TableLock State
A cleanup is called!!
Received a Cpimport Pass from PM2
Cleanup rslt arrived PmId = 2 Rslt = 1
Cleanup rslt arrived PmId = 1 Rslt = 1
Released Table Lock
Cleanup succeed on all PMs
2020-10-06 12:02:38 (25798) INFO : For table test.cs1: 48000000 rows processed and 48000000 rows inserted.
2020-10-06 12:02:38 (25798) INFO : Bulk load completed, total run time : 8.03274 seconds
cleaning up Client threads 
Bulk load Finished Successfully on PM 1
Received a Keep Alive from 1
Received a Keep Alive from 2
	PMid      	1
	Tx Rows   	0
	Tx Bytes  	0
	Rcv Bytes 	486
	Inserted/Read Rows   24000000/24000000
	Col Id	ColName			out-of-range count
	0	test.cs1.i		0
	1	test.cs1.t		0
	(9sec)
	
Bulk load Finished Successfully on PM 2
	PMid      	2
	Tx Rows   	0
	Tx Bytes  	0
	Rcv Bytes 	486
	Inserted/Read Rows   24000000/24000000
	Col Id	ColName			out-of-range count
	0	test.cs1.i		0
	1	test.cs1.t		0
	(9sec)
	
Calling WESplitterApp Destructor
 
root@ip-172-31-4-84:/data# 

Comment by Daniel Lee (Inactive) [ 2020-10-06 ]

Your test case is a bit different from mine. I did not specify the -f option so the default directory (/var/lib/columnstore/data/bulk/data/import) was used. The mysql does not have access to /var/lib. Therefore, even I granted permission from ./columnstore and down, cpimport still would fail.

I did verify your test case and it worked.

If -f is required, we will need to document that. Also change the Example4 in the help text to include the -f option.

Comment by Daniel Lee (Inactive) [ 2020-10-06 ]

Please see may comments

Also it has the S3 issue.

Comment by Daniel Lee (Inactive) [ 2020-10-08 ]

Build tested: 5.4.1-1 (Drone #495)

Single node
Performed cpimport of a 10g lineitem source data. PASSED

Single node with S3
Performance cpimport of 1g lineitem 4 times. PASSED

Multi-node (3PM), cpimport mode 2
If source data in a directory that the mysql user has read/write access, such as /tmp, PASSED

Multi-node (3PM), cpimport mode 2, -f option
-f option is not being enforced for mode 2. Without it, cpimport FAILED

This is has been reported in Drone build #868 and it still exist in this build.

[centos7:root~]# /usr/bin/cpimport -m2 mytest lineitem -l lineitem.tbl
2020-10-08 21:27:38 (15140) INFO : Running distributed import (mode 2) on all PMs...
2020-10-08 21:27:40 (15140) ERR : Received a Cpimport Failure from PM1
2020-10-08 21:27:40 (15140) INFO : Please verify error log files in PM1
2020-10-08 21:27:40 (15140) INFO : Canceling outstanding cpimports
2020-10-08 21:27:40 (15140) ERR : Received a Cpimport Failure from PM3
2020-10-08 21:27:40 (15140) INFO : Please verify error log files in PM3
2020-10-08 21:27:40 (15140) INFO : Canceling outstanding cpimports
2020-10-08 21:27:40 (15140) ERR : Received a Cpimport Failure from PM2
2020-10-08 21:27:40 (15140) INFO : Please verify error log files in PM2
2020-10-08 21:27:40 (15140) INFO : Table mytest.lineitem: (OID-3017) was NOT successfully loaded.
2020-10-08 21:27:40 (15140) INFO : Bulk load completed, total run time : 2.13983 seconds

err..og

Oct 8 21:27:38 localhost cpimport.bin[15166]: 38.851141 |0|0|0| E 34 CAL0087: BulkLoad Error: Error gaining r/w access to import path .: Permission denied
Oct 8 21:27:40 localhost writeengineserver[14542]: 40.858234 |0|0|0| E 32 CAL0000: 7196 : cpimport exit on failure (signal -1)
Oct 8 21:27:40 localhost writeenginesplit[15140]: 40.858560 |0|0|0| E 33 CAL0000: #033[0;31mReceived a Cpimport Failure from PM1#033[0m
Oct 8 21:27:40 localhost writeenginesplit[15140]: 40.858687 |0|0|0| E 33 CAL0087: BulkLoad Error: #033[0;31mReceived a Cpimport Failure from PM1#033[0m
Oct 8 21:27:40 localhost writeenginesplit[15140]: 40.859317 |0|0|0| E 33 CAL0000: #033[0;31mReceived a Cpimport Failure from PM3#033[0m
Oct 8 21:27:40 localhost writeenginesplit[15140]: 40.859394 |0|0|0| E 33 CAL0087: BulkLoad Error: #033[0;31mReceived a Cpimport Failure from PM3#033[0m
Oct 8 21:27:40 localhost writeenginesplit[15140]: 40.860370 |0|0|0| E 33 CAL0000: #033[0;31mReceived a Cpimport Failure from PM2#033[0m
Oct 8 21:27:40 localhost writeenginesplit[15140]: 40.860455 |0|0|0| E 33 CAL0087: BulkLoad Error: #033[0;31mReceived a Cpimport Failure from PM2#033[0m

Comment by Roman [ 2020-10-09 ]

4QA Now mode 2 calls for an explicit -f option.

Comment by Daniel Lee (Inactive) [ 2020-10-09 ]

Build verified: 5.4.1-1 ( Drone #904 )

Verified on a 3PM cluster on Centos 8

[centos8:root~]# /usr/bin/cpimport -m2 mytest lineitem -l lineitem.tbl
-f option is mandatory with mode 2.
Try 'cpimport -h' for more information.

[centos8:root~]# /usr/bin/cpimport -m2 mytest lineitem -f /tmp -l lineitem.tbl
2020-10-09 16:30:26 (6172) INFO : Running distributed import (mode 2) on all PMs...
2020-10-09 16:30:46 (6172) INFO : For table mytest.lineitem: 18003645 rows processed and 18003645 rows inserted.
2020-10-09 16:30:46 (6172) INFO : Bulk load completed, total run time : 20.2081 seconds

Also tested on modes 1 and 3, as well as no mode specified, defaulted to mode 1

Generated at Thu Feb 08 02:49:31 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.