[MCOL-988] 1um/2pm DataRep system starting up without a ProcessManager in HOT_STANDBY state
Created: 2017-10-26  Updated: 2017-10-31  Resolved: 2017-10-31

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: None
Fix Version/s: 1.1.1

Type: Bug
Priority: Major
Reporter: David Hill (Inactive)
Assignee: Daniel Lee (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Environment:

Amazon EC2, 1um/2pm non-root install with Data Replication selected


Sprint: 2017-21, 2017-22

 Description   

Did a build on 10/27 of 1.1.1 and performed an upgrade install. After startup there was no HOT_STANDBY ProcessManager: the ProcessManager on pm2 stayed in COLD_STANDBY, and switchparentoammodule failed.

mcsadmin> getsystemi
getsysteminfo Thu Oct 26 19:19:43 2017

System columnstore-1

System and Module statuses

Component Status Last Status Change
------------ -------------------------- ------------------------
System ACTIVE Thu Oct 26 19:19:09 2017

Module um1 ACTIVE Thu Oct 26 19:19:04 2017
Module pm1 ACTIVE Thu Oct 26 19:18:45 2017
Module pm2 ACTIVE Thu Oct 26 19:18:54 2017

Active Parent OAM Performance Module is 'pm1'
MariaDB ColumnStore Replication Feature is enabled

MariaDB ColumnStore Process statuses

Process Module Status Last Status Change Process ID
------------------ ------ --------------- ------------------------ ----------
ProcessMonitor um1 ACTIVE Thu Oct 26 19:18:33 2017 16207
ServerMonitor um1 ACTIVE Thu Oct 26 19:18:45 2017 16476
DBRMWorkerNode um1 ACTIVE Thu Oct 26 19:18:46 2017 16488
ExeMgr um1 ACTIVE Thu Oct 26 19:18:58 2017 17957
DDLProc um1 ACTIVE Thu Oct 26 19:19:02 2017 17970
DMLProc um1 ACTIVE Thu Oct 26 19:19:07 2017 17980
mysqld um1 ACTIVE Thu Oct 26 19:19:26 2017

ProcessMonitor pm1 ACTIVE Thu Oct 26 19:17:58 2017 29695
ProcessManager pm1 ACTIVE Thu Oct 26 19:18:04 2017 29798
DBRMControllerNode pm1 ACTIVE Thu Oct 26 19:18:37 2017 30532
ServerMonitor pm1 ACTIVE Thu Oct 26 19:18:39 2017 30551
DBRMWorkerNode pm1 ACTIVE Thu Oct 26 19:18:39 2017 30579
DecomSvr pm1 ACTIVE Thu Oct 26 19:18:43 2017 30724
PrimProc pm1 ACTIVE Thu Oct 26 19:18:45 2017 30801
WriteEngineServer pm1 ACTIVE Thu Oct 26 19:18:46 2017 30863

ProcessMonitor pm2 ACTIVE Thu Oct 26 19:18:30 2017 20996
ProcessManager pm2 COLD_STANDBY Thu Oct 26 19:18:45 2017
DBRMControllerNode pm2 COLD_STANDBY Thu Oct 26 19:18:45 2017
ServerMonitor pm2 ACTIVE Thu Oct 26 19:18:48 2017 21147
DBRMWorkerNode pm2 ACTIVE Thu Oct 26 19:18:49 2017 21159
DecomSvr pm2 ACTIVE Thu Oct 26 19:18:52 2017 21173
PrimProc pm2 ACTIVE Thu Oct 26 19:18:55 2017 21181
WriteEngineServer pm2 ACTIVE Thu Oct 26 19:18:56 2017 21189

Active Alarm Counts: Critical = 0, Major = 0, Minor = 0, Warning = 0, Info = 0
mcsadmin> switch
switchparentoammodule Thu Oct 26 19:20:13 2017

switchParentOAMModule Failed : There's no hot standby defined
enter a Performance Module
mcsadmin> exit
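
The check done by eye on the getsysteminfo output above can be sketched in Python. This is a minimal illustration, not part of ColumnStore: the function name `find_hot_standby` is hypothetical, and it assumes process rows are whitespace-separated as `<process> <module> <status> <timestamp> [pid]`, as they appear in this ticket.

```python
def find_hot_standby(status_text):
    """Return the modules whose ProcessManager row reports HOT_STANDBY.

    Hypothetical helper: assumes each process row starts with
    '<process> <module> <status>', as in the output pasted in this ticket.
    """
    modules = []
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "ProcessManager" and fields[2] == "HOT_STANDBY":
            modules.append(fields[1])
    return modules


# Rows copied from the failing system in the description.
sample = """\
ProcessManager pm1 ACTIVE Thu Oct 26 19:18:04 2017 29798
ProcessManager pm2 COLD_STANDBY Thu Oct 26 19:18:45 2017
"""
print(find_hot_standby(sample))  # prints []
```

An empty list corresponds to the failure reported here: with no ProcessManager in HOT_STANDBY, switchparentoammodule has no target to switch to.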


 Comments   
Comment by David Hill (Inactive) [ 2017-10-26 ]

Looks like some DataRedundancy parameters were lost during my upgrade install:

grep DataRed Columnstore.xml.rpmsave
<DBRootStorageType>DataRedundancy</DBRootStorageType>
<DataRedundancyConfig>y</DataRedundancyConfig>
<DataRedundancyCopies>2</DataRedundancyCopies>
<DataRedundancyStorageType>unassigned</DataRedundancyStorageType>
<DataRedundancyNetworkType>1</DataRedundancyNetworkType>
<DataRedundancyConfig>
</DataRedundancyConfig>
[mariadb-user@ip-172-30-0-161 etc]$ grep DataRed Columnstore.xml
<DBRootStorageType>DataRedundancy</DBRootStorageType>
<DataRedundancyConfig>y</DataRedundancyConfig>
<DataRedundancyCopies>2</DataRedundancyCopies>
<DataRedundancyStorageType>unassigned</DataRedundancyStorageType>

So autoConfigure needs to be updated to carry that info over.
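
The comparison of the two grep outputs above can be sketched in Python. This is a minimal illustration under assumptions: `lost_lines` is a hypothetical helper, it compares matching lines as plain text (like grep) rather than parsing the XML, and it says nothing about how autoConfigure itself works.

```python
from collections import Counter


def lost_lines(old_text, new_text, pattern="DataRed"):
    """Return lines containing `pattern` that are in old_text but missing from new_text.

    Hypothetical helper: a line-based comparison, so repeated lines are
    matched one-for-one and the original order is kept.
    """
    old = [line.strip() for line in old_text.splitlines() if pattern in line]
    remaining = Counter(line.strip() for line in new_text.splitlines() if pattern in line)
    lost = []
    for line in old:
        if remaining[line] > 0:
            remaining[line] -= 1  # this old line still exists in the new config
        else:
            lost.append(line)
    return lost


# Contents copied from the two grep outputs in this comment.
rpmsave = """\
<DBRootStorageType>DataRedundancy</DBRootStorageType>
<DataRedundancyConfig>y</DataRedundancyConfig>
<DataRedundancyCopies>2</DataRedundancyCopies>
<DataRedundancyStorageType>unassigned</DataRedundancyStorageType>
<DataRedundancyNetworkType>1</DataRedundancyNetworkType>
<DataRedundancyConfig>
</DataRedundancyConfig>
"""
upgraded = """\
<DBRootStorageType>DataRedundancy</DBRootStorageType>
<DataRedundancyConfig>y</DataRedundancyConfig>
<DataRedundancyCopies>2</DataRedundancyCopies>
<DataRedundancyStorageType>unassigned</DataRedundancyStorageType>
"""
for line in lost_lines(rpmsave, upgraded):
    print(line)
```

On the two outputs above this reports the DataRedundancyNetworkType entry and the opening and closing tags of the DataRedundancyConfig section as missing from the upgraded Columnstore.xml.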

Comment by David Hill (Inactive) [ 2017-10-26 ]

This is an upgrade issue only; a fresh install works and pm2 comes up with a HOT_STANDBY ProcessManager:

System and Module statuses

Component Status Last Status Change
------------ -------------------------- ------------------------
System ACTIVE Thu Oct 26 20:22:08 2017

Module um1 ACTIVE Thu Oct 26 20:22:04 2017
Module pm1 ACTIVE Thu Oct 26 20:21:45 2017
Module pm2 ACTIVE Thu Oct 26 20:21:54 2017

Active Parent OAM Performance Module is 'pm1'
MariaDB ColumnStore Replication Feature is enabled

MariaDB ColumnStore Process statuses

Process Module Status Last Status Change Process ID
------------------ ------ --------------- ------------------------ ----------
ProcessMonitor um1 ACTIVE Thu Oct 26 20:21:30 2017 20027
ServerMonitor um1 ACTIVE Thu Oct 26 20:21:45 2017 20296
DBRMWorkerNode um1 ACTIVE Thu Oct 26 20:21:46 2017 20308
ExeMgr um1 ACTIVE Thu Oct 26 20:21:58 2017 21777
DDLProc um1 ACTIVE Thu Oct 26 20:22:02 2017 21790
DMLProc um1 ACTIVE Thu Oct 26 20:22:07 2017 21800
mysqld um1 ACTIVE Thu Oct 26 20:22:23 2017

ProcessMonitor pm1 ACTIVE Thu Oct 26 20:20:54 2017 16886
ProcessManager pm1 ACTIVE Thu Oct 26 20:21:01 2017 17037
DBRMControllerNode pm1 ACTIVE Thu Oct 26 20:21:37 2017 17856
ServerMonitor pm1 ACTIVE Thu Oct 26 20:21:39 2017 17881
DBRMWorkerNode pm1 ACTIVE Thu Oct 26 20:21:39 2017 17922
DecomSvr pm1 ACTIVE Thu Oct 26 20:21:43 2017 18049
PrimProc pm1 ACTIVE Thu Oct 26 20:21:45 2017 18127
WriteEngineServer pm1 ACTIVE Thu Oct 26 20:21:46 2017 18198

ProcessMonitor pm2 ACTIVE Thu Oct 26 20:21:26 2017 23311
ProcessManager pm2 HOT_STANDBY Thu Oct 26 20:22:08 2017 23620
DBRMControllerNode pm2 COLD_STANDBY Thu Oct 26 20:21:45 2017
ServerMonitor pm2 ACTIVE Thu Oct 26 20:21:48 2017 23531
DBRMWorkerNode pm2 ACTIVE Thu Oct 26 20:21:49 2017 23545
DecomSvr pm2 ACTIVE Thu Oct 26 20:21:52 2017 23558
PrimProc pm2 ACTIVE Thu Oct 26 20:21:55 2017 23568
WriteEngineServer pm2 ACTIVE Thu Oct 26 20:21:56 2017 23576

Active Alarm Counts: Critical = 0, Major = 0, Minor = 0, Warning = 0, Info = 0
[mariadb-user@ip-172-30-0-161 bin]$

Comment by Daniel Lee (Inactive) [ 2017-10-31 ]

Build verified: 1.1.1-1 rpm package released to QA today

Verified with a 1um4pm glusterfs stack for both root and non-root user.
