Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
1.0.6
-
None
-
None
-
AWS system
-
2016-24
Description
DecomSvr fails to start the initially after a instance is stop/started
After a number of retries, it did go ACTIVE
Issue found when MCOL-422 was fixed
Component Status Last Status Change
------------ -------------------------- ------------------------
System ACTIVE Sun Dec 4 21:15:59 2016
Module pm1 ACTIVE Sun Dec 4 21:15:56 2016
MariaDB Columnstore Process statuses
Process Module Status Last Status Change Process ID
------------------ ------ --------------- ------------------------ ----------
ProcessMonitor pm1 ACTIVE Sun Dec 4 21:15:00 2016 1245
ProcessManager pm1 ACTIVE Sun Dec 4 21:15:06 2016 4005
DBRMControllerNode pm1 ACTIVE Sun Dec 4 21:15:14 2016 5479
ServerMonitor pm1 ACTIVE Sun Dec 4 21:15:16 2016 5870
DBRMWorkerNode pm1 ACTIVE Sun Dec 4 21:15:16 2016 6142
DecomSvr pm1 AUTO_INIT Sun Dec 4 21:17:07 2016 16905
PrimProc pm1 ACTIVE Sun Dec 4 21:15:25 2016 9215
ExeMgr pm1 ACTIVE Sun Dec 4 21:15:29 2016 9438
WriteEngineServer pm1 ACTIVE Sun Dec 4 21:15:33 2016 9518
DDLProc pm1 ACTIVE Sun Dec 4 21:15:53 2016 15795
DMLProc pm1 ACTIVE Sun Dec 4 21:15:58 2016 15903
mysqld pm1 ACTIVE Sun Dec 4 21:15:58 2016 4712
STARTING Process: DecomSvr
Dec 4 21:17:37 ip-172-30-0-34 ProcessMonitor[1245]: 37.384443 |0|0|0| D 18 CAL0000: Process location: //home/mariadb-user/mariadb/columnstore/bin/DecomSvr
Dec 4 21:17:37 ip-172-30-0-34 ProcessMonitor[1245]: 37.386610 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 25 on device DecomSvr
Dec 4 21:17:37 ip-172-30-0-34 ProcessMonitor[1245]: 37.390396 |0|0|0| D 18 CAL0000: Pkill Process just to make sure: DecomSvr*
Dec 4 21:17:37 ip-172-30-0-34 ProcessMonitor[1245]: 37.390447 |0|0|0| D 18 CAL0000: StatusUpdate of Process DecomSvr State = 3 PID = 0
Dec 4 21:17:37 ip-172-30-0-34 ProcessMonitor[1245]: 37.393405 |0|0|0| D 18 CAL0000: Send SET Alarm ID 13 on device DecomSvr
Dec 4 21:17:37 ip-172-30-0-34 ProcessMonitor[1245]: 37.400608 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 27 on device DBRM
Dec 4 21:17:39 ip-172-30-0-34 ProcessMonitor[1245]: 39.409193 |0|0|0| D 18 CAL0000: StatusUpdate of Process DecomSvr State = 4 PID = 17400
Dec 4 21:17:39 ip-172-30-0-34 ProcessMonitor[1245]: 39.416406 |0|0|0| E 18 CAL0000: EXCEPTION ERROR on setProcessStatus: Caught unknown exception!
Dec 4 21:17:39 ip-172-30-0-34 ProcessMonitor[1245]: 39.416470 |0|0|0| D 18 CAL0000: StatusUpdate of Process DecomSvr State = 21 PID = 17400
Dec 4 21:17:39 ip-172-30-0-34 ProcessMonitor[1245]: 39.422558 |0|0|0| E 18 CAL0000: EXCEPTION ERROR on setProcessStatus: Caught unknown exception!
Dec 4 21:17:39 ip-172-30-0-34 ProcessMonitor[1245]: 39.422605 |0|0|0| D 18 CAL0000: DecomSvr PID is 17400
Dec 4 21:17:39 ip-172-30-0-34 ProcessMonitor[1245]: 39.422716 |0|0|0| D 18 CAL0000: Inform Process Mgr that process was restarted: DecomSvr
Dec 4 21:17:39 ip-172-30-0-34 ProcessMonitor[1245]: 39.422736 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 21 on device DecomSvr
Dec 4 21:17:39 ip-172-30-0-34 ProcessMonitor[1245]: 39.423338 |0|0|0| I 18 CAL0000: Calpont Process DecomSvr restarted successfully!!
Dec 4 21:17:39 ip-172-30-0-34 ProcessMonitor[1245]: 39.428955 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 13 on device DecomSvr
Dec 4 21:17:39 ip-172-30-0-34 ProcessMonitor[1245]: 39.437494 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 25 on device DecomSvr
Dec 4 21:17:39 ip-172-30-0-34 ProcessManager[4005]: 39.482441 |0|0|0| I 17 CAL0000: MSG RECEIVED: Process Restarted on pm1/DecomSvr
Dec 4 21:18:04 ip-172-30-0-34 ProcessMonitor[1245]: 04.531056 |0|0|0| C 18 CAL0000: DecomSvr/17400 failed to init in 20 seconds, force killing it so it can restart
Dec 4 21:18:04 ip-172-30-0-34 ProcessMonitor[1245]: 04.531098 |0|0|0| D 18 CAL0000: Send SET Alarm ID 25 on device DecomSvr
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.562675 |0|0|0| C 18 CAL0000: *****Calpont Process Restarting: DecomSvr, old PID = 17400
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.562797 |0|0|0| D 18 CAL0000: STOPPING Process: DecomSvr
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.562851 |0|0|0| D 18 CAL0000: StatusUpdate of Process DecomSvr State = 1 PID = 0
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.562905 |0|0|0| D 18 CAL0000: Send SET Alarm ID 13 on device DecomSvr
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.569512 |0|0|0| D 18 CAL0000: Pkill Process just to make sure: DecomSvr*
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.569562 |0|0|0| D 18 CAL0000: STARTING Process: DecomSvr
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.569586 |0|0|0| D 18 CAL0000: Process location: //home/mariadb-user/mariadb/columnstore/bin/DecomSvr
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.571560 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 25 on device DecomSvr
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.575237 |0|0|0| D 18 CAL0000: Pkill Process just to make sure: DecomSvr*
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.575289 |0|0|0| D 18 CAL0000: StatusUpdate of Process DecomSvr State = 3 PID = 0
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.577098 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/DecomSvr State = AUTO_OFFLINE
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.577138 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/DecomSvr State = AUTO_OFFLINE PID = 0
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.587201 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/DecomSvr State = AUTO_INIT
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.587243 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/DecomSvr State = AUTO_INIT PID = 0
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.590043 |0|0|0| D 18 CAL0000: Send SET Alarm ID 13 on device DecomSvr
Dec 4 21:18:09 ip-172-30-0-34 ProcessMonitor[1245]: 09.596244 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 27 on device DBRM
Dec 4 21:18:11 ip-172-30-0-34 ProcessMonitor[1245]: 11.594404 |0|0|0| D 18 CAL0000: StatusUpdate of Process DecomSvr State = 4 PID = 18021
Dec 4 21:18:11 ip-172-30-0-34 ProcessMonitor[1245]: 11.601588 |0|0|0| E 18 CAL0000: EXCEPTION ERROR on setProcessStatus: Caught unknown exception!
Dec 4 21:18:11 ip-172-30-0-34 ProcessMonitor[1245]: 11.601659 |0|0|0| D 18 CAL0000: StatusUpdate of Process DecomSvr State = 21 PID = 18021
Dec 4 21:18:11 ip-172-30-0-34 ProcessMonitor[1245]: 11.607957 |0|0|0| E 18 CAL0000: EXCEPTION ERROR on setProcessStatus: Caught unknown exception!
Dec 4 21:18:11 ip-172-30-0-34 ProcessMonitor[1245]: 11.608004 |0|0|0| D 18 CAL0000: DecomSvr PID is 18021
Dec 4 21:18:11 ip-172-30-0-34 ProcessMonitor[1245]: 11.608091 |0|0|0| D 18 CAL0000: Inform Process Mgr that process was restarted: DecomSvr
Dec 4 21:18:11 ip-172-30-0-34 ProcessMonitor[1245]: 11.608119 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 13 on device DecomSvr
Dec 4 21:18:11 ip-172-30-0-34 ProcessMonitor[1245]: 11.608767 |0|0|0| I 18 CAL0000: Calpont Process DecomSvr restarted successfully!!
Dec 4 21:18:11 ip-172-30-0-34 ProcessMonitor[1245]: 11.614615 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 25 on device DecomSvr
Dec 4 21:18:11 ip-172-30-0-34 ProcessMonitor[1245]: 11.621288 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 21 on device DecomSvr
Dec 4 21:18:11 ip-172-30-0-34 ProcessManager[4005]: 11.668342 |0|0|0| I 17 CAL0000: MSG RECEIVED: Process Restarted on pm1/DecomSvr
Dec 4 21:18:36 ip-172-30-0-34 ProcessMonitor[1245]: 36.716680 |0|0|0| C 18 CAL0000: DecomSvr/18021 failed to init in 20 seconds, force killing it so it can restart
Dec 4 21:18:36 ip-172-30-0-34 ProcessMonitor[1245]: 36.716722 |0|0|0| D 18 CAL0000: Send SET Alarm ID 25 on device DecomSvr
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.747118 |0|0|0| C 18 CAL0000: *****Calpont Process Restarting: DecomSvr, old PID = 18021
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.747227 |0|0|0| D 18 CAL0000: STOPPING Process: DecomSvr
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.747272 |0|0|0| D 18 CAL0000: StatusUpdate of Process DecomSvr State = 1 PID = 0
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.747315 |0|0|0| D 18 CAL0000: Send SET Alarm ID 13 on device DecomSvr
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.754309 |0|0|0| D 18 CAL0000: Pkill Process just to make sure: DecomSvr*
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.754361 |0|0|0| D 18 CAL0000: STARTING Process: DecomSvr
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.754400 |0|0|0| D 18 CAL0000: Process location: //home/mariadb-user/mariadb/columnstore/bin/DecomSvr
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.756584 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 25 on device DecomSvr
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.760057 |0|0|0| D 18 CAL0000: Pkill Process just to make sure: DecomSvr*
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.760108 |0|0|0| D 18 CAL0000: StatusUpdate of Process DecomSvr State = 3 PID = 0
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.762082 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/DecomSvr State = AUTO_OFFLINE
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.762127 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/DecomSvr State = AUTO_OFFLINE PID = 0
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.763604 |0|0|0| D 18 CAL0000: Send SET Alarm ID 13 on device DecomSvr
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.771972 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/DecomSvr State = AUTO_INIT
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.772017 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/DecomSvr State = AUTO_INIT PID = 0
Dec 4 21:18:41 ip-172-30-0-34 ProcessMonitor[1245]: 41.780036 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 27 on device DBRM
Dec 4 21:18:43 ip-172-30-0-34 ProcessMonitor[1245]: 43.779175 |0|0|0| D 18 CAL0000: StatusUpdate of Process DecomSvr State = 4 PID = 18691
Dec 4 21:18:43 ip-172-30-0-34 ProcessMonitor[1245]: 43.792351 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/DecomSvr State = ACTIVE
Dec 4 21:18:43 ip-172-30-0-34 ProcessMonitor[1245]: 43.792429 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/DecomSvr State = ACTIVE PID = 18691
Dec 4 21:18:43 ip-172-30-0-34 ProcessMonitor[1245]: 43.792446 |0|0|0| D 18 CAL0000: StatusUpdate of Process DecomSvr State = 21 PID = 18691
Dec 4 21:18:43 ip-172-30-0-34 ProcessMonitor[1245]: 43.804486 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/DecomSvr State = PID_UPDATE
Dec 4 21:18:43 ip-172-30-0-34 ProcessMonitor[1245]: 43.804553 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/DecomSvr State = PID_UPDATE PID = 18691
Dec 4 21:18:43 ip-172-30-0-34 ProcessMonitor[1245]: 43.804569 |0|0|0| D 18 CAL0000: DecomSvr PID is 18691
Dec 4 21:18:43 ip-172-30-0-34 ProcessMonitor[1245]: 43.804665 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 21 on device DecomSvr
Dec 4 21:18:43 ip-172-30-0-34 ProcessMonitor[1245]: 43.804668 |0|0|0| D 18 CAL0000: Inform Process Mgr that process was restarted: DecomSvr
Dec 4 21:18:43 ip-172-30-0-34 ProcessMonitor[1245]: 43.805249 |0|0|0| I 18 CAL0000: Calpont Process DecomSvr restarted successfully!!
Dec 4 21:18:43 ip-172-30-0-34 ProcessMonitor[1245]: 43.811381 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 13 on device DecomSvr
Dec 4 21:18:43 ip-172-30-0-34 ProcessMonitor[1245]: 43.820111 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 25 on device DecomSvr
Dec 4 21:18:43 ip-172-30-0-34 ProcessManager[4005]: 43.862092 |0|0|0| I 17 CAL0000: MSG RECEIVED: Process Restarted on pm1/DecomSvr
[root@ip-172-30-0-34 columnstore]#