Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Cannot Reproduce
-
1.0.7
-
None
-
CentOS 6.5 mariadb-columnstore-1.0.7-1-x86_64
Description
After server relocation,We can't restart system,and get error log as below:
mcsadmin> restartsystem
restartsystem Fri Dec 22 09:22:14 2017
Do you want to proceed: (y or n) [n]: y
System being restarted now ...
-
-
-
- restartSystem Failed : API Failure return in restartSystem API
-
-
----------------
[root@HLServer110 columnstore]# cat err.log
Dec 22 09:22:23 HLServer110 ProcessManager[3268]: 23.450253 |0|0|0| E 17 CAL0000: line: 6255 Error running DBRM save_brm
----------------
[root@HLServer110 columnstore]# cat warning.log
Dec 22 09:22:19 HLServer110 ProcessManager[3268]: 19.033411 |0|0|0| W 17 CAL0000: pm2 module failed to stop!!
----------------
[root@HLServer110 columnstore]# cat info.log
Dec 22 09:22:18 HLServer110 ProcessManager[3268]: 18.015530 |0|0|0| I 17 CAL0000: MSG RECEIVED: Restart System request...
Dec 22 09:22:18 HLServer110 ProcessMonitor[3177]: 18.023963 |0|0|0| I 18 CAL0000: MSG RECEIVED: Stop All process request...
Dec 22 09:22:18 HLServer110 ProcessMonitor[3177]: 18.024402 |0|0|0| I 18 CAL0000: STOPALL: ACK back to ProcMgr, STATUS_UPDATE only performed
Dec 22 09:22:18 HLServer110 ProcessManager[3268]: 18.034577 |0|0|0| I 17 CAL0000: RESTARTSYSTEM: Request Stop Module on pm2
Dec 22 09:22:19 HLServer110 ProcessManager[3268]: 19.033531 |0|0|0| I 17 CAL0000: RESTARTSYSTEM: ACK received from Process-Monitor, return status = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.034908 |0|0|0| I 18 CAL0000: MSG RECEIVED: Stop All process request...
Dec 22 09:22:21 HLServer110 ProcessMonitor[3177]: 21.361166 |0|0|0| I 18 CAL0000: STOPALL: ACK back to ProcMgr, return status = 0
Dec 22 09:22:24 HLServer110 ProcessManager[3268]: 24.343899 |0|0|0| I 17 CAL0000: RESTARTSYSTEM: ACK received from Process-Monitor for stopModule requests, return status = 1
Dec 22 09:22:24 HLServer110 ProcessManager[3268]: 24.344078 |0|0|0| I 17 CAL0000: RESTARTSYSTEM: ACK back to sender
Dec 22 09:22:24 HLServer110 ProcessManager[3268]: 24.344171 |0|0|0| I 17 CAL0000: Restart System Completed, status = 1
----------------
[root@HLServer110 columnstore]# tail -n 100 debug.log
Dec 22 09:22:18 HLServer110 ProcessManager[3268]: 18.016406 |0|0|0| D 17 CAL0000: Set System State = MAN_OFFLINE
Dec 22 09:22:18 HLServer110 ProcessMonitor[3177]: 18.017023 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set System State = MAN_OFFLINE
Dec 22 09:22:18 HLServer110 ProcessManager[3268]: 18.022752 |0|0|0| D 17 CAL0000: pm1 module is stopped by request.
Dec 22 09:22:18 HLServer110 ProcessManager[3268]: 18.022852 |0|0|0| D 17 CAL0000: Set Module pm1 State = 2
Dec 22 09:22:18 HLServer110 ProcessMonitor[3177]: 18.023380 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Module pm1 State = MAN_INIT
Dec 22 09:22:18 HLServer110 ProcessManager[3268]: 18.023511 |0|0|0| D 17 CAL0000: sendMsgProcMon: Process module pm1
Dec 22 09:22:18 HLServer110 ProcessManager[3268]: 18.026061 |0|0|0| D 17 CAL0000: pm2 module is stopped by request.
Dec 22 09:22:18 HLServer110 ProcessManager[3268]: 18.026153 |0|0|0| D 17 CAL0000: Set Module pm2 State = 2
Dec 22 09:22:18 HLServer110 ProcessMonitor[3177]: 18.026692 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Module pm2 State = MAN_INIT
Dec 22 09:22:18 HLServer110 ProcessManager[3268]: 18.032138 |0|0|0| D 17 CAL0000: sendMsgProcMon: Process module pm2
Dec 22 09:22:18 HLServer110 ProcessManager[3268]: 18.034787 |0|0|0| D 17 CAL0000: pm2 module is stopped by request.
Dec 22 09:22:18 HLServer110 ProcessManager[3268]: 18.034865 |0|0|0| D 17 CAL0000: Set Module pm2 State = 2
Dec 22 09:22:18 HLServer110 ProcessMonitor[3177]: 18.035424 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Module pm2 State = MAN_INIT
Dec 22 09:22:18 HLServer110 ProcessManager[3268]: 18.040640 |0|0|0| D 17 CAL0000: sendMsgProcMon: Process module pm2
Dec 22 09:22:19 HLServer110 ProcessManager[3268]: 19.033708 |0|0|0| D 17 CAL0000: pm1 module is stopped by request.
Dec 22 09:22:19 HLServer110 ProcessManager[3268]: 19.033782 |0|0|0| D 17 CAL0000: Set Module pm1 State = 2
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.034357 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Module pm1 State = MAN_INIT
Dec 22 09:22:19 HLServer110 ProcessManager[3268]: 19.034482 |0|0|0| D 17 CAL0000: sendMsgProcMon: Process module pm1
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.036065 |0|0|0| D 18 CAL0000: STOPPING Process: DMLProc
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.036214 |0|0|0| D 18 CAL0000: StatusUpdate of Process DMLProc State = 0 PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.036310 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 25 on device DMLProc
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.038223 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/DMLProc State = MAN_OFFLINE
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.038370 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/DMLProc State = MAN_OFFLINE PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.041100 |0|0|0| D 18 CAL0000: Send SET Alarm ID 21 on device DMLProc
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.057360 |0|0|0| D 18 CAL0000: Pkill Process just to make sure: DMLProc*
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.057490 |0|0|0| D 18 CAL0000: STOPPING Process: DDLProc
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.057618 |0|0|0| D 18 CAL0000: StatusUpdate of Process DDLProc State = 0 PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.057712 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 25 on device DDLProc
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.059941 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/DDLProc State = MAN_OFFLINE
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.060084 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/DDLProc State = MAN_OFFLINE PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.061109 |0|0|0| D 18 CAL0000: Send SET Alarm ID 21 on device DDLProc
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.079658 |0|0|0| D 18 CAL0000: Pkill Process just to make sure: DDLProc*
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.079785 |0|0|0| D 18 CAL0000: STOPPING Process: WriteEngineServer
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.079902 |0|0|0| D 18 CAL0000: StatusUpdate of Process WriteEngineServer State = 0 PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.079960 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 25 on device WriteEngineServer
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.082031 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/WriteEngineServer State = MAN_OFFLINE
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.082157 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/WriteEngineServer State = MAN_OFFLINE PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.082897 |0|0|0| D 18 CAL0000: Send SET Alarm ID 21 on device WriteEngineServer
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.100912 |0|0|0| D 18 CAL0000: Pkill Process just to make sure: WriteEngineServ*
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.101037 |0|0|0| D 18 CAL0000: STOPPING Process: ExeMgr
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.101147 |0|0|0| D 18 CAL0000: StatusUpdate of Process ExeMgr State = 0 PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.101246 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 25 on device ExeMgr
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.102941 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/ExeMgr State = MAN_OFFLINE
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.103065 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/ExeMgr State = MAN_OFFLINE PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.104149 |0|0|0| D 18 CAL0000: Send SET Alarm ID 21 on device ExeMgr
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.122374 |0|0|0| D 18 CAL0000: Pkill Process just to make sure: ExeMgr*
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.122497 |0|0|0| D 18 CAL0000: STOPPING Process: PrimProc
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.122606 |0|0|0| D 18 CAL0000: StatusUpdate of Process PrimProc State = 0 PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.122690 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 25 on device PrimProc
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.124526 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/PrimProc State = MAN_OFFLINE
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.124654 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/PrimProc State = MAN_OFFLINE PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.125791 |0|0|0| D 18 CAL0000: Send SET Alarm ID 21 on device PrimProc
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.143851 |0|0|0| D 18 CAL0000: Pkill Process just to make sure: PrimProc*
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.143976 |0|0|0| D 18 CAL0000: STOPPING Process: DecomSvr
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.144092 |0|0|0| D 18 CAL0000: StatusUpdate of Process DecomSvr State = 0 PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.144154 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 25 on device DecomSvr
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.145723 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/DecomSvr State = MAN_OFFLINE
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.145879 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/DecomSvr State = MAN_OFFLINE PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.147011 |0|0|0| D 18 CAL0000: Send SET Alarm ID 21 on device DecomSvr
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.165526 |0|0|0| D 18 CAL0000: Pkill Process just to make sure: DecomSvr*
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.165645 |0|0|0| D 18 CAL0000: STOPPING Process: DBRMWorkerNode
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.165756 |0|0|0| D 18 CAL0000: StatusUpdate of Process DBRMWorkerNode State = 0 PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.165849 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 25 on device DBRMWorkerNode
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.167865 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/DBRMWorkerNode State = MAN_OFFLINE
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.167979 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/DBRMWorkerNode State = MAN_OFFLINE PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.169042 |0|0|0| D 18 CAL0000: Send SET Alarm ID 21 on device DBRMWorkerNode
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.187243 |0|0|0| D 18 CAL0000: Pkill Process just to make sure: workernode*
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.187363 |0|0|0| D 18 CAL0000: STOPPING Process: ServerMonitor
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.187475 |0|0|0| D 18 CAL0000: StatusUpdate of Process ServerMonitor State = 0 PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.187559 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 25 on device ServerMonitor
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.189784 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/ServerMonitor State = MAN_OFFLINE
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.189941 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/ServerMonitor State = MAN_OFFLINE PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.190853 |0|0|0| D 18 CAL0000: Send SET Alarm ID 21 on device ServerMonitor
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.208791 |0|0|0| D 18 CAL0000: Pkill Process just to make sure: ServerMonitor*
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.208913 |0|0|0| D 18 CAL0000: STOPPING Process: DBRMControllerNode
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.209031 |0|0|0| D 18 CAL0000: StatusUpdate of Process DBRMControllerNode State = 0 PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.209116 |0|0|0| D 18 CAL0000: Send CLEAR Alarm ID 25 on device DBRMControllerNode
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.211139 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/DBRMControllerNode State = MAN_OFFLINE
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.211254 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/DBRMControllerNode State = MAN_OFFLINE PID = 0
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.212667 |0|0|0| D 18 CAL0000: Send SET Alarm ID 21 on device DBRMControllerNode
Dec 22 09:22:19 HLServer110 ProcessMonitor[3177]: 19.230747 |0|0|0| D 18 CAL0000: Pkill Process just to make sure: controllernode*
Dec 22 09:22:21 HLServer110 ProcessMonitor[3177]: 21.268807 |0|0|0| D 18 CAL0000: BRM reset_locks script run
Dec 22 09:22:21 HLServer110 ProcessMonitor[3177]: 21.269607 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/mysqld State = MAN_OFFLINE
Dec 22 09:22:21 HLServer110 ProcessMonitor[3177]: 21.269735 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/mysqld State = MAN_OFFLINE PID = 0
Dec 22 09:22:21 HLServer110 ProcessMonitor[3177]: 21.359965 |0|0|0| D 18 CAL0000: Stop MySQL Process
Dec 22 09:22:21 HLServer110 ProcessMonitor[3177]: 21.360778 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Module pm1 State = MAN_OFFLINE
Dec 22 09:22:21 HLServer110 ProcessMonitor[3177]: 21.360897 |0|0|0| D 18 CAL0000: unmountExtraDBroots called
Dec 22 09:22:21 HLServer110 ProcessManager[3268]: 21.361308 |0|0|0| D 17 CAL0000: pm1 module is successfully stopped.
Dec 22 09:22:21 HLServer110 ProcessManager[3268]: 21.365243 |0|0|0| D 17 CAL0000: Running reset_locks
Dec 22 09:22:23 HLServer110 ProcessManager[3268]: 23.401485 |0|0|0| D 17 CAL0000: Ran reset_locks
Dec 22 09:22:23 HLServer110 ProcessManager[3268]: 23.401677 |0|0|0| D 17 CAL0000: Running DBRM save_brm
Dec 22 09:22:24 HLServer110 ProcessManager[3268]: 24.343707 |0|0|0| D 17 CAL0000: Successfully ran DBRM clearShm
How can i recovery this question?