[MCOL-912] After adding two PMs, cpimport failed on newly added PMs Created: 2017-09-13  Updated: 2018-04-17  Resolved: 2018-04-17

Status: Closed
Project: MariaDB ColumnStore
Component/s: cpimport
Affects Version/s: 1.0.11, 1.1.0
Fix Version/s: 1.1.4

Type: Bug Priority: Major
Reporter: Daniel Lee (Inactive) Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 1
Labels: relnote

Sprint: 2017-18, 2017-19, 2017-20, 2017-21, 2018-04, 2018-05, 2018-06, 2018-07, 2018-08

 Description   

Build tested: 1.1.0-1 beta

Started with a 1um2pm gluster stack, added two more modules, started the system, created the lineitem table, then ran cpimport on a 10g lineitem file.

[root@localhost bin]# ./cpimport mytest lineitem /data/qa/source/dbt3/10g/lineitem.tbl
2017-09-13 15:08:45 (10100) INFO : Running distributed import (mode 1) on all PMs...
2017-09-13 15:08:52 (10100) ERR : Received a Cpimport Failure from PM4
2017-09-13 15:08:52 (10100) INFO : Please verify error log files in PM4
2017-09-13 15:08:52 (10100) INFO : Canceling outstanding cpimports
2017-09-13 15:08:52 (10100) ERR : PM4 : Bulkload Parse (thread 1) Failed for Table mytest.lineitem during parsing. Terminating this job.
2017-09-13 15:08:52 (10100) ERR : PM4 : Bulkload Parse (thread 0) Failed for Table mytest.lineitem during parsing. Terminating this job.
2017-09-13 15:08:52 (10100) ERR : PM4 : Bulkload Parse (thread 2) Failed for Table mytest.lineitem during parsing. Terminating this job.
2017-09-13 15:08:52 (10100) ERR : Received a Cpimport Failure from PM3
2017-09-13 15:08:52 (10100) INFO : Please verify error log files in PM3
2017-09-13 15:08:52 (10100) INFO : Canceling outstanding cpimports
2017-09-13 15:08:52 (10100) ERR : PM3 : Bulkload Parse (thread 1) Failed for Table mytest.lineitem during parsing. Terminating this job.
2017-09-13 15:08:52 (10100) ERR : PM3 : Bulkload Parse (thread 2) Failed for Table mytest.lineitem during parsing. Terminating this job.
2017-09-13 15:08:52 (10100) ERR : PM3 : Bulkload Parse (thread 0) Failed for Table mytest.lineitem during parsing. Terminating this job.
2017-09-13 15:08:59 (10100) INFO : Table mytest.lineitem: (OID-3017) was NOT successfully loaded.
2017-09-13 15:08:59 (10100) INFO : Bulk load completed, total run time : 13.9637 seconds
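
For triage, the failing PMs and the unloaded table can be pulled out of output like the above with a small script. This is only a sketch: the regular expressions are derived from the message text shown in this ticket, and the function name summarize_cpimport_output is made up for illustration. Other cpimport versions may word these messages differently.

```python
import re

# Sample lines copied from the cpimport output in this ticket.
SAMPLE_LOG = """\
2017-09-13 15:08:45 (10100) INFO : Running distributed import (mode 1) on all PMs...
2017-09-13 15:08:52 (10100) ERR : Received a Cpimport Failure from PM4
2017-09-13 15:08:52 (10100) ERR : PM4 : Bulkload Parse (thread 1) Failed for Table mytest.lineitem during parsing. Terminating this job.
2017-09-13 15:08:52 (10100) ERR : Received a Cpimport Failure from PM3
2017-09-13 15:08:59 (10100) INFO : Table mytest.lineitem: (OID-3017) was NOT successfully loaded.
"""

# Patterns are assumptions based only on the wording seen in this ticket.
FAILURE_RE = re.compile(r"ERR : Received a Cpimport Failure from (PM\d+)")
NOT_LOADED_RE = re.compile(r"Table (\S+): \(OID-(\d+)\) was NOT successfully loaded")


def summarize_cpimport_output(text):
    """Return the PMs that reported a failure and the tables that did not load."""
    # De-duplicate and sort numerically by module number (PM3 before PM4).
    failed_pms = sorted(set(FAILURE_RE.findall(text)), key=lambda pm: int(pm[2:]))
    # Each entry is a (table name, OID) tuple.
    unloaded_tables = NOT_LOADED_RE.findall(text)
    return {"failed_pms": failed_pms, "unloaded_tables": unloaded_tables}


if __name__ == "__main__":
    print(summarize_cpimport_output(SAMPLE_LOG))
```

On the sample above this reports PM3 and PM4 as failed, which matches the observation that only the newly added modules were affected.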

cpimport failed on the newly added PM3 and PM4.

According to the err.log on pm3, it could not find the extent map entries for some OIDs (I forgot to save the log file).

After I shut down the system and started it again, cpimport was successful.

Additional info:

After creating the lineitem table, I checked the extent map. The initial extent, which was the only extent, was on dbroot3. This is because I had created a couple of other tables for my default initial test. I mention this in case other people cannot reproduce the issue.



 Comments   
Comment by David Thompson (Inactive) [ 2017-09-13 ]

Workaround: a restartSystem is required after adding new modules. Online add for gluster is not supported until this is fixed.

Comment by David Hill (Inactive) [ 2017-11-02 ]

The original thought process was to leave the system up so that users can run addmodule and then bring the new module into the system without much downtime. Now I'm starting to think it would be best and safer to enforce that the system be stopped during the addmodule/adddbroot phase; startsystem would then bring everything up with the new configuration. That means more downtime, but it would be a lot safer. I don't think too many customers would complain about this change.
My thoughts.

Comment by Ben Thompson (Inactive) [ 2017-11-02 ]

This appears to be a non-DR issue as well. Following the instructions in the KB here:
https://mariadb.com/kb/en/library/managing-columnstore-module-configurations/#example-command-to-add-a-performance-module-and-dbroot-to-a-active-system

# mcsadmin addModule pm 1 hostnamePm3 'password'
# mcsadmin addDBroot 1
# mcsadmin alterSystem-EnableModule pm3
# mcsadmin assignDbrootPmConfig 3 pm3
# mcsadmin startSystem

leads to the same failure.

Comment by Ben Thompson (Inactive) [ 2017-11-02 ]

Fails in 1.0.11 also

Comment by David Hill (Inactive) [ 2018-04-06 ]

From earlier testing, it looks like these need to be restarted:

dbrm processes
DMLProc

Comment by David Hill (Inactive) [ 2018-04-09 ]

https://github.com/mariadb-corporation/mariadb-columnstore-engine/pull/437

Comment by David Hill (Inactive) [ 2018-04-09 ]

As part of this fix, I updated the addmodule/dbroot document with this statement:

All Database Updates commands like DDL/DML and cpimports are suspended until the module or DBRoot is successfully added

https://mariadb.com/kb/en/library/managing-columnstore-module-configurations/

Comment by David Hill (Inactive) [ 2018-04-09 ]

To test:

1. Start with a system with 1 or more PMs.
2. Perform a cpimport (I tested by importing 1 GB of TPC-H data).
3. Add a new PM and a dbroot assigned to that module.
4. Enable the module and start the system.
5. Tail the debug logs on the UM and the new PM.
6. Run cpimport again and make sure the logs show success on the UM and the newly added PM.
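
The test steps above can be sketched as a dry-run command plan. This is a rough outline under assumptions, not a verified procedure: the mcsadmin and cpimport invocations are copied from the command lines quoted earlier in this ticket (their argument semantics are not confirmed here), while the function names, hostname, password, and data file path are placeholders. Step 5 is left as a comment because the ticket does not state the debug log location. The script only prints the commands and does not execute anything.

```python
import shlex


def add_pm_commands(pm_number, hostname, password, dbroot_id):
    """Command sequence for adding one PM, mirroring the KB example quoted in this ticket."""
    pm_name = f"pm{pm_number}"
    return [
        # Argument order copied from the ticket comment; meaning of "pm 1" is not confirmed here.
        ["mcsadmin", "addModule", "pm", "1", hostname, password],
        ["mcsadmin", "addDBroot", "1"],
        ["mcsadmin", "alterSystem-EnableModule", pm_name],
        ["mcsadmin", "assignDbrootPmConfig", str(dbroot_id), pm_name],
        ["mcsadmin", "startSystem"],
    ]


def build_test_plan(schema, table, data_file, pm_number, hostname, password, dbroot_id):
    """Steps 2 to 6 of the test: import, add a PM and dbroot, then import again."""
    cpimport = ["cpimport", schema, table, data_file]
    plan = [cpimport]  # step 2: baseline import before the change
    plan += add_pm_commands(pm_number, hostname, password, dbroot_id)  # steps 3 and 4
    # Step 5 (tail the debug logs on the UM and new PM) is manual; log paths are not given in the ticket.
    plan.append(list(cpimport))  # step 6: import again and check the logs
    return plan


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    for cmd in build_test_plan("mytest", "lineitem", "/path/to/lineitem.tbl",
                               3, "hostnamePm3", "password", 3):
        print(shlex.join(cmd))
```

Keeping the plan as data rather than running it directly makes it easy to review the exact sequence before trying it on a live system.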

Comment by Ben Thompson (Inactive) [ 2018-04-16 ]

Reviewed / Merged

Comment by Daniel Lee (Inactive) [ 2018-04-17 ]

Build verified: 1.1.4-1 source

/root/columnstore/mariadb-columnstore-server
commit 6b8a6745bd84b0230875fb94b526d9426ba999f7
Merge: 5199dd1 2089aad
Author: benthompson15 <ben.thompson@mariadb.com>
Date: Mon Apr 16 18:50:35 2018 -0500

Merge pull request #110 from mariadb-corporation/MCOL-1293

MCOL-1293

/root/columnstore/mariadb-columnstore-server/mariadb-columnstore-engine
commit 3e8fed91704effff5e442148916d1c3611dd38c4
Merge: 1586394 f2d748c
Author: benthompson15 <ben.thompson@mariadb.com>
Date: Mon Apr 16 18:50:05 2018 -0500

Merge pull request #447 from mariadb-corporation/MCOL-1293

MCOL-1293

Started with a 1um2pm stack, added and enabled two PMs, assigned dbroots, and started the system. The 10g cpimport was successful.

Generated at Thu Feb 08 02:24:44 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.