[MCOL-3319] Force system shutdown is done without rollback and system fails after restarting Created: 2019-05-21  Updated: 2023-03-06  Resolved: 2023-03-06

Status: Closed
Project: MariaDB ColumnStore
Component/s: cpimport
Affects Version/s: 1.2.4
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Zdravelina Sokolovska (Inactive) Assignee: Unassigned
Resolution: Won't Do Votes: 0
Labels: None
Environment:

1 UM, 2 PMs; mcs running on CentOS 7


Attachments: cpimport_logs.txt, logs_PM_1.txt, logs_PM_2.txt, logs_UM1.txt

 Description   

A forced system shutdown is performed without rollback, and the system fails after restarting.

Expected: the system restarts successfully.

How to repeat:
Initiate simultaneous concurrent loads with cpimport in mode 3 (-m3) on both PM1 and PM2.
Issue a shutdown with the Force option: table locks are detected, but the system goes down without
performing a rollback.
Start the system: a rollback is attempted on startup, but after the first attempt DMLProc remains in the FAILED state.

Note: After stopping and starting the system several times (usable as a workaround, though not an entirely suitable one), the rollback completed and the mcs system reached the ACTIVE state.

mcsadmin shut
shutdownsystem   Tue May 21 11:37:12 2019
 
This command stops the processing of applications on all Modules within the MariaDB ColumnStore System
 
   Checking for active transactions
The following tables are locked:
LockID Name                       Process            PID   Session  CreationTime           State     DBRoots
79     tpcds_1000.catalog_returns cpimport.bin (pm1) 1884  BulkLoad 2019-05-21 11:13:50 AM Abandoned 1
80     tpcds_1000.catalog_returns cpimport.bin (pm2) 32606 BulkLoad 2019-05-21 11:13:53 AM LOADING   2
81     tpcds_1000.catalog_sales   cpimport.bin (pm2) 32635 BulkLoad 2019-05-21 11:13:59 AM LOADING   2
82     tpcds_1000.catalog_sales   cpimport.bin (pm1) 2258  BulkLoad 2019-05-21 11:14:05 AM Abandoned 1
105    tpcds_1000.store_returns   cpimport.bin (pm1) 4625  BulkLoad 2019-05-21 11:15:08 AM Abandoned 1
106    tpcds_1000.store_returns   cpimport.bin (pm2) 638   BulkLoad 2019-05-21 11:15:11 AM LOADING   2
107    tpcds_1000.store_sales     cpimport.bin (pm1) 4821  BulkLoad 2019-05-21 11:15:14 AM Abandoned 1
108    tpcds_1000.store_sales     cpimport.bin (pm2) 685   BulkLoad 2019-05-21 11:15:18 AM LOADING   2
119    tpcds_1000.web_sales       cpimport.bin (pm1) 6037  BulkLoad 2019-05-21 11:15:50 AM Abandoned 1
120    tpcds_1000.web_sales       cpimport.bin (pm2) 899   BulkLoad 2019-05-21 11:15:53 AM LOADING   2
Your options are:
    Cancel    -- Cancel the shutdown request
    Wait      -- Wait for write operations to end and then shutdown
    Force     -- Force a shutdown
What would you like to do: [Cancel]: Force
 
   Stopping System...
   Successful stop of System
 
   Shutting Down System...
   Successful shutdown of System
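
When triaging a shutdown like the one above, it can help to see at a glance which table locks were still LOADING and which were already Abandoned. The following is a minimal Python sketch for that. The row layout is inferred only from the output in this report, not from ColumnStore documentation. The names `LOCK_ROW`, `parse_locks` and `locks_by_state` are hypothetical helpers. The pattern tolerates a missing space between the State and DBRoots columns.

```python
import re

# One row of the table-lock listing printed by the shutdown prompt.
# The column layout is an assumption inferred from this report's output.
LOCK_ROW = re.compile(
    r"^(?P<lock_id>\d+)\s+"
    r"(?P<table>\S+)\s+"
    r"(?P<process>\S+)\s+\((?P<module>\w+)\)\s+"
    r"(?P<pid>\d+)\s+"
    r"(?P<session>\S+)\s+"
    r"(?P<created>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [AP]M)\s+"
    r"(?P<state>[A-Za-z]+)\s*"   # State may run into DBRoots without a space
    r"(?P<dbroots>[\d,]+)$"
)


def parse_locks(text):
    """Return one dict per lock row; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        match = LOCK_ROW.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


def locks_by_state(rows):
    """Group (table, module) pairs by lock state."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["state"], []).append((row["table"], row["module"]))
    return grouped


# Two rows copied from the listing in this report.
sample = """\
79     tpcds_1000.catalog_returns cpimport.bin (pm1) 1884  BulkLoad 2019-05-21 11:13:50 AM Abandoned1
80     tpcds_1000.catalog_returns cpimport.bin (pm2) 32606 BulkLoad 2019-05-21 11:13:53 AM LOADING  2
"""

if __name__ == "__main__":
    for state, tables in locks_by_state(parse_locks(sample)).items():
        print(state, tables)
```

In this report, every lock held from pm1 is Abandoned and every lock held from pm2 is still LOADING at the time Force is chosen.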

Logs from cpimport (the complete logs are in the attached files):

 Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
Immediate system stop has been ordered. No rollback
DBRM::send_recv: controller node closed the connection
DBRM::send_recv: controller node closed the connection
DBRM::send_recv: controller node closed the connection
DBRM::send_recv: controller node closed the connection
DBRM::send_recv: controller node closed the connection
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
 
~~~
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
2019-05-21 11:39:47 (685) CRIT : extendColumnNewExtent: error creating BRM extent after column OID-4695; DBRoot-2; part-2; seg-0; newDBRoot-2; newpart-0;  a BRM Allocate extent error. [BRM error status: network error]; Error allocating extent stripe for table 4677; DBRoot: 2 [1503]
2019-05-21 11:39:47 (685) ERR  : writeToFileExtentCheck: extend column failed:  a BRM Allocate extent error. [1503]
2019-05-21 11:39:47 (685) CRIT : Bulkload Parse (thread 1) Failed for Table tpcds_1000.store_sales during parsing.  Terminating this job. [1503]
2019-05-21 11:39:47 (685) CRIT : extendColumnNewExtent: error creating BRM extent after column OID-4692; DBRoot-2; part-2; seg-0; newDBRoot-2; newpart-0;  a BRM Allocate extent error.; Previous error allocating extent stripe for table 4677; DBRoot: 2 [1503]
2019-05-21 11:39:47 (685) ERR  : writeToFileExtentCheck: extend column failed:  a BRM Allocate extent error. [1503]
2019-05-21 11:39:47 (685) CRIT : Bulkload Parse (thread 2) Failed for Table tpcds_1000.store_sales during parsing.  Terminating this job. [1503]
2019-05-21 11:39:47 (685) INFO : Bulkload Read (thread 0) Stopped reading Table tpcds_1000.store_sales.  TableInfo::readTableData(1) responding to job termination
2019-05-21 11:39:47 (685) CRIT : extendColumnNewExtent: error creating BRM extent after column OID-4689; DBRoot-2; part-2; seg-0; newDBRoot-2; newpart-0;  a BRM Allocate extent error.; Previous error allocating extent stripe for table 4677; DBRoot: 2 [1503]
2019-05-21 11:39:47 (685) ERR  : writeToFileExtentCheck: extend column failed:  a BRM Allocate extent error. [1503]
2019-05-21 11:39:47 (685) CRIT : Bulkload Parse (thread 0) Failed for Table tpcds_1000.store_sales during parsing.  Terminating this job. [1503]
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
2019-05-21 11:39:48 (685) INFO : Table tpcds_1000.store_sales (OID-4677) was not successfully loaded.  Rolling back.
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
2019-05-21 11:39:48 (685) ERR  : Error rolling back table tpcds_1000.store_sales; Bulk rollback for table tpcds_1000.store_sales (OID-4677) not performed; BRM error getting read-write state. [1523]
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
2019-05-21 11:39:48 (685) INFO : Bulk load completed, total run time : 1470.46 seconds
 
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
 
Error in loading job data
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 51 inet: 172.20.3.26 port: 8616
2019-05-21 11:40:25 (32635) CRIT : extendColumnNewExtent: error creating BRM extent after column OID-4675; DBRoot-2; part-1; seg-0; newDBRoot-2; newpart-0;  a BRM Allocate extent error. [BRM error status: network error]; Error allocating extent stripe for table 4642; DBRoot: 2 [1503]
2019-05-21 11:40:25 (32635) INFO : Bulkload Read (thread 0) Stopped reading Table tpcds_1000.catalog_sales.  TableInfo::readTableData(2) responding to job termination
2019-05-21 11:40:25 (32635) CRIT : extendColumnNewExtent: error creating BRM extent after column OID-4670; DBRoot-2; part-1; seg-0; newDBRoot-2; newpart-0;  a BRM Allocate extent error.; Previous error allocating extent stripe for table 4642; DBRoot: 2 [1503]
2019-05-21 11:40:25 (32635) CRIT : extendColumnNewExtent: error creating BRM extent after column OID-4661; DBRoot-2; part-1; seg-0; newDBRoot-2; newpart-0;  a BRM Allocate extent error.; Previous error allocating extent stripe for table 4642; DBRoot: 2 [1503]
2019-05-21 11:40:25 (32635) ERR  : writeToFileExtentCheck: extend column failed:  a BRM Allocate extent error. [1503]
2019-05-21 11:40:25 (32635) CRIT : Bulkload Parse (thread 1) Failed for Table tpcds_1000.catalog_sales during parsing.  Terminating this job. [1503]
2019-05-21 11:40:25 (32635) ERR  : writeToFileExtentCheck: extend column failed:  a BRM Allocate extent error. [1503]
2019-05-21 11:40:25 (32635) CRIT : Bulkload Parse (thread 0) Failed for Table tpcds_1000.catalog_sales during parsing.  Terminating this job. [1503]
2019-05-21 11:40:25 (32635) ERR  : writeToFileExtentCheck: extend column failed:  a BRM Allocate extent error. [1503]
2019-05-21 11:40:25 (32635) CRIT : Bulkload Parse (thread 2) Failed for Table tpcds_1000.catalog_sales during parsing.  Terminating this job. [1503]
2019-05-21 11:40:26 (32635) INFO : Table tpcds_1000.catalog_sales (OID-4642) was not successfully loaded.  Rolling back.
2019-05-21 11:40:26 (32635) ERR  : Error rolling back table tpcds_1000.catalog_sales; Bulk rollback for table tpcds_1000.catalog_sales (OID-4642) not performed; BRM is in read-only state. [1522]
2019-05-21 11:40:26 (32635) INFO : Bulk load completed, total run time : 1586.55 seconds
 
 
Error in loading job data
2019-05-21 11:40:28 (899) CRIT : extendColumnNewExtent: error creating BRM extent after column OID-4635; DBRoot-2; part-1; seg-2; newDBRoot-2; newpart-0;  a BRM Allocate extent error. [BRM error status: DBRM is in READ-ONLY mode]; Error allocating extent stripe for table 4607; DBRoot: 2 [1503]
2019-05-21 11:40:28 (899) ERR  : writeToFileExtentCheck: extend column failed:  a BRM Allocate extent error. [1503]
2019-05-21 11:40:28 (899) CRIT : Bulkload Parse (thread 2) Failed for Table tpcds_1000.web_sales during parsing.  Terminating this job. [1503]
2019-05-21 11:40:28 (899) CRIT : extendColumnNewExtent: error creating BRM extent after column OID-4632; DBRoot-2; part-1; seg-2; newDBRoot-2; newpart-0;  a BRM Allocate extent error.; Previous error allocating extent stripe for table 4607; DBRoot: 2 [1503]
2019-05-21 11:40:28 (899) ERR  : writeToFileExtentCheck: extend column failed:  a BRM Allocate extent error. [1503]
2019-05-21 11:40:28 (899) CRIT : Bulkload Parse (thread 1) Failed for Table tpcds_1000.web_sales during parsing.  Terminating this job. [1503]
2019-05-21 11:40:28 (899) CRIT : extendColumnNewExtent: error creating BRM extent after column OID-4631; DBRoot-2; part-1; seg-2; newDBRoot-2; newpart-0;  a BRM Allocate extent error.; Previous error allocating extent stripe for table 4607; DBRoot: 2 [1503]
2019-05-21 11:40:28 (899) ERR  : writeToFileExtentCheck: extend column failed:  a BRM Allocate extent error. [1503]
2019-05-21 11:40:28 (899) CRIT : Bulkload Parse (thread 0) Failed for Table tpcds_1000.web_sales during parsing.  Terminating this job. [1503]
2019-05-21 11:40:28 (899) INFO : Bulkload Read (thread 0) Stopped reading Table tpcds_1000.web_sales.  TableInfo::readTableData(1) responding to job termination
2019-05-21 11:40:28 (899) INFO : Table tpcds_1000.web_sales (OID-4607) was not successfully loaded.  Rolling back.
2019-05-21 11:40:28 (899) ERR  : Error rolling back table tpcds_1000.web_sales; Bulk rollback for table tpcds_1000.web_sales (OID-4607) not performed; BRM is in read-only state. [1522]
2019-05-21 11:40:28 (899) INFO : Bulk load completed, total run time : 1475.01 seconds
 
 
Error in loading job data
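
The cpimport output above mixes timestamped entries with many repeated DBRM connection messages, which makes the failed rollbacks hard to spot. The sketch below is one way to summarize such a log in Python. The line format is assumed from the entries in this report only. `LOG_LINE`, `summarize` and `failed_rollbacks` are hypothetical names, not part of any ColumnStore tool.

```python
import re
from collections import Counter

# Timestamped cpimport entries as they appear in this report, for example:
#   2019-05-21 11:39:47 (685) ERR  : ... [1503]
# The trailing bracketed number is treated as the error code (an assumption).
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((?P<pid>\d+)\) "
    r"(?P<level>[A-Z]+)\s*: "
    r"(?P<msg>.*?)"
    r"(?:\s*\[(?P<code>\d+)\])?$"
)


def summarize(text):
    """Count timestamped entries per (level, code); other lines are ignored."""
    counts = Counter()
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            counts[(match["level"], match["code"])] += 1
    return counts


def failed_rollbacks(text):
    """Return (pid, code) for each entry reporting a rollback 'not performed'."""
    found = []
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and "not performed" in match["msg"]:
            found.append((int(match["pid"]), match["code"]))
    return found


# Lines copied from the log excerpt in this report.
sample = """\
2019-05-21 11:39:47 (685) ERR  : writeToFileExtentCheck: extend column failed:  a BRM Allocate extent error. [1503]
DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 9 inet: 172.20.3.26 port: 8616
2019-05-21 11:39:48 (685) ERR  : Error rolling back table tpcds_1000.store_sales; Bulk rollback for table tpcds_1000.store_sales (OID-4677) not performed; BRM error getting read-write state. [1523]
2019-05-21 11:39:48 (685) INFO : Bulk load completed, total run time : 1470.46 seconds
"""

if __name__ == "__main__":
    print(summarize(sample))
    print(failed_rollbacks(sample))
```

In the excerpt above, the rollback failures carry code 1523 ("BRM error getting read-write state") for PID 685 and code 1522 ("BRM is in read-only state") for PIDs 32635 and 899.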

 mcsadmin start
startsystem   Tue May 21 11:39:30 2019
 
startSystem command, 'columnstore' service is down, sending command to
start the 'columnstore' service on all modules
 
 
   System being started, please wait.........
 
TIMEOUT: ProcMon not responding to getSystemStatus
**** startSystem Failed : check log files
[root@pm1_1 ~]# mcsadmin getsystemi
getsysteminfo   Tue May 21 11:43:08 2019
 
System columnstore-1
 
System and Module statuses
 
Component     Status                       Last Status Change
------------  --------------------------   ------------------------
System        FAILED                       Tue May 21 11:40:43 2019
 
Module um1    FAILED                       Tue May 21 11:40:43 2019
Module pm1    ACTIVE                       Tue May 21 11:40:24 2019
Module pm2    ACTIVE                       Tue May 21 11:40:31 2019
 
Active Parent OAM Performance Module is 'pm1'
MariaDB ColumnStore Replication Feature is enabled
MariaDB ColumnStore set for Distributed Install
 
 
MariaDB ColumnStore Process statuses
 
Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
ProcessMonitor      um1       ACTIVE            Tue May 21 11:40:03 2019       26835
ServerMonitor       um1       ACTIVE            Tue May 21 11:40:22 2019       27312
DBRMWorkerNode      um1       ACTIVE            Tue May 21 11:40:23 2019       27381
ExeMgr              um1       ACTIVE            Tue May 21 11:40:34 2019       28854
DDLProc             um1       ACTIVE            Tue May 21 11:40:38 2019       28905
DMLProc             um1       FAILED            Tue May 21 11:40:59 2019       28919
mysqld              um1       ACTIVE            Tue May 21 11:40:36 2019       27265
 
ProcessMonitor      pm1       ACTIVE            Tue May 21 11:39:46 2019        2398
ProcessManager      pm1       ACTIVE            Tue May 21 11:39:53 2019        2513
DBRMControllerNode  pm1       ACTIVE            Tue May 21 11:40:17 2019        3162
ServerMonitor       pm1       ACTIVE            Tue May 21 11:40:20 2019        3181
DBRMWorkerNode      pm1       ACTIVE            Tue May 21 11:40:20 2019        3234
PrimProc            pm1       ACTIVE            Tue May 21 11:40:24 2019        3379
WriteEngineServer   pm1       ACTIVE            Tue May 21 11:40:25 2019        3436
 
ProcessMonitor      pm2       ACTIVE            Tue May 21 11:40:11 2019        2119
ProcessManager      pm2       HOT_STANDBY       Tue May 21 11:40:13 2019        2168
DBRMControllerNode  pm2       COLD_STANDBY      Tue May 21 11:40:23 2019
ServerMonitor       pm2       ACTIVE            Tue May 21 11:40:26 2019        2202
DBRMWorkerNode      pm2       ACTIVE            Tue May 21 11:40:27 2019        2237
PrimProc            pm2       ACTIVE            Tue May 21 11:40:31 2019        2254
WriteEngineServer   pm2       ACTIVE            Tue May 21 11:40:32 2019        2264
 
Active Alarm Counts: Critical = 6, Major = 1, Minor = 0, Warning = 0, Info = 0

After stopping and starting the system several times (a possible workaround, though not an entirely suitable one), the rollback completed and the mcs system reached the Active state:
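
Whether another stop/start cycle is needed comes down to reading the process table in the `getsysteminfo` output. The following is a minimal sketch, assuming the table layout shown in this ticket (a `Process  Module  Status` header row followed by one row per process). The function name `unhealthy_processes` is hypothetical, and treating `HOT_STANDBY` and `COLD_STANDBY` as healthy is an assumption of the sketch rather than something this ticket confirms.

```python
# States treated as healthy here; this set is an assumption of the sketch.
HEALTHY = {"ACTIVE", "HOT_STANDBY", "COLD_STANDBY"}


def unhealthy_processes(report, healthy=HEALTHY):
    """Return (process, module, status) for each table row not in `healthy`."""
    rows = []
    in_table = False
    for line in report.splitlines():
        tokens = line.split()
        if tokens[:3] == ["Process", "Module", "Status"]:
            in_table = True  # header row of the process table
            continue
        if not in_table or not tokens:
            continue  # text before the table, or a blank separator line
        if tokens[0] in ("Active", "****"):
            break  # "Active Alarm Counts" or an API failure line ends the table
        if len(tokens) < 3 or tokens[0].startswith("-"):
            continue  # dashed rule under the header, or a malformed row
        process, module, status = tokens[:3]
        if status not in healthy:
            rows.append((process, module, status))
    return rows


# Excerpt of the first getsysteminfo output in this ticket.
REPORT = """\
Module um1    FAILED                       Tue May 21 11:40:43 2019

Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
DDLProc             um1       ACTIVE            Tue May 21 11:40:38 2019       28905
DMLProc             um1       FAILED            Tue May 21 11:40:59 2019       28919

ProcessManager      pm2       HOT_STANDBY       Tue May 21 11:40:13 2019        2168
DBRMControllerNode  pm2       COLD_STANDBY      Tue May 21 11:40:23 2019

Active Alarm Counts: Critical = 6, Major = 1, Minor = 0, Warning = 0, Info = 0
"""

print(unhealthy_processes(REPORT))
```

The function only reads text that has already been captured, so it makes no assumptions about how `mcsadmin` is invoked.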

 mcsadmin start
startsystem   Tue May 21 12:52:07 2019
 
   System being started, please wait...
 
TIMEOUT: ProcMon not responding to getSystemStatus
**** startSystem Failed : check log files
[root@pm1_1 ~]#  mcsadmin getsystemi
getsysteminfo   Tue May 21 12:55:52 2019
 
System columnstore-1
 
System and Module statuses
 
Component     Status                       Last Status Change
------------  --------------------------   ------------------------
System        FAILED                       Tue May 21 12:52:24 2019
 
Module um1    FAILED                       Tue May 21 12:52:14 2019
Module pm1    ACTIVE                       Tue May 21 12:52:20 2019
Module pm2    MAN_OFFLINE                  Tue May 21 12:52:19 2019
 
Active Parent OAM Performance Module is 'pm1'
MariaDB ColumnStore Replication Feature is enabled
MariaDB ColumnStore set for Distributed Install
 
 
MariaDB ColumnStore Process statuses
 
Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
ProcessMonitor      um1       INITIAL
ServerMonitor       um1       INITIAL
DBRMWorkerNode      um1       INITIAL
ExeMgr              um1       INITIAL
DDLProc             um1       INITIAL
DMLProc             um1       INITIAL
mysqld              um1       INITIAL
 
ProcessMonitor      pm1       ACTIVE            Tue May 21 12:49:33 2019        1186
ProcessManager      pm1       ACTIVE            Tue May 21 12:49:39 2019        1298
DBRMControllerNode  pm1       ACTIVE            Tue May 21 12:52:14 2019        3253
ServerMonitor       pm1       ACTIVE            Tue May 21 12:52:16 2019        3281
DBRMWorkerNode      pm1       ACTIVE            Tue May 21 12:52:16 2019        3336
PrimProc            pm1       ACTIVE            Tue May 21 12:52:20 2019        3391
WriteEngineServer   pm1       ACTIVE            Tue May 21 12:52:21 2019        3415
 
ProcessMonitor      pm2       ACTIVE            Tue May 21 12:49:58 2019        1183
ProcessManager      pm2       HOT_STANDBY       Tue May 21 12:49:59 2019        1257
DBRMControllerNode  pm2       INITIAL
ServerMonitor       pm2       INITIAL
DBRMWorkerNode      pm2       INITIAL
PrimProc            pm2       INITIAL
WriteEngineServer   pm2       INITIAL
 
Active Alarm Counts: Critical = 5, Major = 2, Minor = 10, Warning = 0, Info = 0
[root@pm1_1 ~]# /usr/local/mariadb/columnstore/bin/viewtablelock
 There are 10 table locks
 
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
Could not connect to PMS2: Connection refused
^C
[root@pm1_1 ~]#  mcsadmin getsystemi
getsysteminfo   Tue May 21 12:59:12 2019
 
System columnstore-1
 
System and Module statuses
 
Component     Status                       Last Status Change
------------  --------------------------   ------------------------
System        MAN_OFFLINE
 
 
MariaDB ColumnStore Replication Feature is enabled
MariaDB ColumnStore set for Distributed Install
 
 
MariaDB ColumnStore Process statuses
 
Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
 
**** printProcessStatus Failed =  API Failure return in getProcessStatus API
[root@pm1_1 ~]#  mcsadmin getsystemi
getsysteminfo   Tue May 21 12:59:27 2019
 
System columnstore-1
 
System and Module statuses
 
Component     Status                       Last Status Change
------------  --------------------------   ------------------------
System        MAN_OFFLINE
 
 
MariaDB ColumnStore Replication Feature is enabled
MariaDB ColumnStore set for Distributed Install
 
 
MariaDB ColumnStore Process statuses
 
Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
 
**** printProcessStatus Failed =  API Failure return in getProcessStatus API
[root@pm1_1 ~]#  mcsadmin getsystemi
getsysteminfo   Tue May 21 13:00:08 2019
 
System columnstore-1
 
System and Module statuses
 
Component     Status                       Last Status Change
------------  --------------------------   ------------------------
System        MAN_OFFLINE
 
 
MariaDB ColumnStore Replication Feature is enabled
MariaDB ColumnStore set for Distributed Install
 
 
MariaDB ColumnStore Process statuses
 
Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
 
**** printProcessStatus Failed =  API Failure return in getProcessStatus API
[root@pm1_1 ~]#  mcsadmin getsystemi
getsysteminfo   Tue May 21 13:00:17 2019
 
System columnstore-1
 
System and Module statuses
 
Component     Status                       Last Status Change
------------  --------------------------   ------------------------
System        MAN_INIT
 
 
MariaDB ColumnStore Replication Feature is enabled
MariaDB ColumnStore set for Distributed Install
 
 
MariaDB ColumnStore Process statuses
 
Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
 
**** printProcessStatus Failed =  API Failure return in getProcessStatus API
[root@pm1_1 ~]#  mcsadmin start
startsystem   Tue May 21 13:00:24 2019
 
startSystem command, 'columnstore' service is down, sending command to
start the 'columnstore' service on all modules
 
 
   System being started, please wait..........
 
   System Not Ready, DMLProc is checking/processing rollback of abandoned transactions. Processing could take some time, please wait......
   Successful start of System




 Comments   
Comment by Todd Stoffel (Inactive) [ 2023-03-06 ]

This ticket was created prior to convergence with the server and may be obsolete. If you find this issue still exists in a modern version, please open a new ticket.

Generated at Thu Feb 08 02:41:50 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.