[MCOL-1140] postConfigure fails to run on a system without a pm1 Created: 2018-01-05  Updated: 2023-10-26  Resolved: 2020-04-15

Status: Closed
Project: MariaDB ColumnStore
Component/s: ?
Affects Version/s: 1.1.2
Fix Version/s: N/A

Type: Bug Priority: Minor
Reporter: David Hill (Inactive) Assignee: Unassigned
Resolution: Won't Fix Votes: 0
Labels: community
Environment: non-root Amazon AMI, 3 PM combo system with EBS

 Description   

While testing failover based on a user scenario, I ran into a situation where the system became unusable.

First I hit the problem in MCOL-1139, where a defective pm1 was removed from the system. This left the storage in a bad state.

Then, from pm3, I did a shutdown and planned to rerun postConfigure to fix it, but got the error message shown at the end of the output below.

So we either need to document that a pm1 server must be added back into the system, or change the logic that requires postConfigure to run on pm1.
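
To illustrate the second option, here is a minimal sketch of a relaxed check. It is not the actual postConfigure code: the function name and its parameters are hypothetical, and the rule it implements (allow any remaining performance module only when pm1 is no longer configured) is an assumption about what a fix could look like.

```python
import re
from typing import Iterable


def may_run_postconfigure(current_module: str, configured_modules: Iterable[str]) -> bool:
    """Hypothetical relaxed guard, not the real postConfigure logic.

    Running on pm1 is always allowed. Running on another performance
    module is allowed only when pm1 is no longer part of the system.
    """
    configured = set(configured_modules)
    if current_module == "pm1":
        return True
    # Assumption: performance modules are named pm<N>, as in this report.
    is_pm = re.fullmatch(r"pm\d+", current_module) is not None
    return is_pm and current_module in configured and "pm1" not in configured
```

With the module layout from this report (pm1 removed, pm2 and pm3 remaining), the check would let pm3 proceed, while still refusing pm3 on a system where pm1 exists.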

mcsadmin> getst
getstorageconfig Fri Jan 5 16:40:54 2018

System Storage Configuration

Performance Module (DBRoot) Storage Type = external
User Module Storage Type = internal
System Assigned DBRoot Count = 3
DBRoot IDs assigned to 'pm2' = 2

DBRoot IDs unassigned = 1, 3

Amazon EC2 Volume Name/Device Name/Amazon Device Name for DBRoot2: vol-0f2fa799bd8450525, /dev/sdh, /dev/xvdh
Amazon EC2 Volume Name/Device Name/Amazon Device Name for DBRoot1: vol-072d7d6b55ee398e9, /dev/sdg, /dev/xvdg
Amazon EC2 Volume Name/Device Name/Amazon Device Name for DBRoot3: vol-0a1492a16c1d4404e, /dev/sdi, /dev/xvdi
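
The bad storage state shows up in the output above as DBRoots 1 and 3 being unassigned. A minimal sketch for pulling that out of saved getstorageconfig text follows. The function name is hypothetical, and the line formats are assumed to match exactly what is printed in this report.

```python
import re


def parse_dbroot_assignments(text):
    """Return (assigned, unassigned) parsed from getstorageconfig text.

    assigned maps a module name to its list of DBRoot IDs;
    unassigned is the list of DBRoot IDs with no owning module.
    """
    assigned = {}
    unassigned = []
    for line in text.splitlines():
        m = re.match(r"\s*DBRoot IDs assigned to '(\w+)'\s*=\s*(.*)", line)
        if m:
            assigned[m.group(1)] = [int(x) for x in m.group(2).split(",") if x.strip()]
            continue
        m = re.match(r"\s*DBRoot IDs unassigned\s*=\s*(.*)", line)
        if m:
            unassigned = [int(x) for x in m.group(1).split(",") if x.strip()]
    return assigned, unassigned
```

A non-empty unassigned list after a module removal is the condition that led to the postConfigure rerun in this report.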

mcsadmin> shutd y
shutdownsystem Fri Jan 5 16:42:36 2018

This command stops the processing of applications on all Modules within the MariaDB ColumnStore System

Checking for active transactions

Stopping System...
Successful stop of System

Shutting Down System...
Successful shutdown of System

mcsadmin> exit
exit Fri Jan 5 16:43:23 2018
Exiting the MariaDB ColumnStore Admin Console
[mariadb-user@ip-172-30-0-129 ~]$ module
pm3
[mariadb-user@ip-172-30-0-129 ~]$
[mariadb-user@ip-172-30-0-129 ~]$
[mariadb-user@ip-172-30-0-129 ~]$ /home/mariadb-user/mariadb/columnstore/bin/postConfigure -i /home/mariadb-user/mariadb/columnstore

This is the MariaDB ColumnStore System Configuration and Installation tool.
It will Configure the MariaDB ColumnStore System and will perform a Package
Installation of all of the Servers within the System that is being configured.

IMPORTANT: This tool should only be run on the Parent OAM Module
which is a Performance Module, preferred Module #1

Prompting instructions:

Press 'enter' to accept a value in (), if available or
Enter one of the options within [], if available, or
Enter a new value

A copy of the MariaDB ColumnStore Configuration file has been saved during Package install.
It's Configured for a Multi-Server Install.
You have an option of utilizing the configuration data from that file or starting
with the MariaDB ColumnStore Configuration File that comes with the MariaDB ColumnStore Package.
You will only want to utilize the old configuration data when performing the same
type of install, i.e. Single or Multi-Server

Do you want to utilize the configuration data from the saved copy? [y,n] > n

===== Setup System Server Type Configuration =====

There are 2 options when configuring the System Server Type: single and multi

'single' - Single-Server install is used when there will only be 1 server configured
on the system. It can also be used for production systems, if the plan is
to stay single-server.

'multi' - Multi-Server install is used when you want to configure multiple servers now or
in the future. With Multi-Server install, you can still configure just 1 server
now and add on addition servers/modules in the future.

Select the type of System Server install [1=single, 2=multi] (2) >

===== Setup System Module Type Configuration =====

There are 2 options when configuring the System Module Type: separate and combined

'separate' - User and Performance functionality on separate servers.

'combined' - User and Performance functionality on the same server

Select the type of System Module Install [1=separate, 2=combined] (2) >

Combined Server Installation will be performed.
The Server will be configured as a Performance Module.
All MariaDB ColumnStore Processes will run on the Performance Modules.

NOTE: The MariaDB ColumnStore Schema Sync feature will replicate all of the
schemas and InnoDB tables across the User Module nodes. This feature can be enabled
or disabled, for example, if you wish to configure your own replication post installation.

MariaDB ColumnStore Schema Sync feature is Enabled, do you want to leave enabled? [y,n] >

NOTE: Configured to have ColumnStore use the Amazon AWS CLI Tools

NOTE: MariaDB ColumnStore Replication Feature is enabled

Enter System Name (1.1.2) >

ERROR: exiting, postConfigure can only run executed on pm1, current module is pm3
[mariadb-user@ip-172-30-0-129 ~]$



 Comments   
Comment by David Hill (Inactive) [ 2018-01-09 ]

Reproduced the issue once I got a working 6.9 build machine.

Started postConfigure and it hung waiting for the system to start up; procmgr wasn't getting launched.

mcsadmin> getsystemi
getsysteminfo Tue Jan 9 20:34:29 2018

System columnstore-1

System and Module statuses

Component Status Last Status Change
------------ -------------------------- ------------------------
System DOWN

Module pm1 INITIAL

MariaDB ColumnStore Process statuses

Process Module Status Last Status Change Process ID
------------------ ------ --------------- ------------------------ ----------
ProcessMonitor pm1 ACTIVE Tue Jan 9 20:33:17 2018 15766
ProcessManager pm1 INITIAL
DBRMControllerNode pm1 INITIAL
ServerMonitor pm1 INITIAL
DBRMWorkerNode pm1 INITIAL
DecomSvr pm1 INITIAL
PrimProc pm1 INITIAL
ExeMgr pm1 INITIAL
WriteEngineServer pm1 INITIAL
DDLProc pm1 INITIAL
DMLProc pm1 INITIAL
mysqld pm1 INITIAL

Active Alarm Counts: Critical = 0, Major = 0, Minor = 0, Warning = 0, Info = 0
mcsadmin>
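
The hang is visible above as every process except ProcessMonitor sitting in INITIAL. A minimal sketch for listing the processes that are not ACTIVE from saved getsysteminfo text follows. The function name is hypothetical, and the parsing assumes the column layout printed in this report (process name, module, status).

```python
import re


def processes_not_active(text):
    """Return (process, module, status) tuples for non-ACTIVE processes.

    Only rows after the 'Process statuses' heading are considered, so the
    'Module pm1 INITIAL' row in the module section is not picked up.
    """
    rows = []
    in_process_section = False
    for line in text.splitlines():
        if "Process statuses" in line:
            in_process_section = True
            continue
        if not in_process_section:
            continue
        m = re.match(r"^(\w+)\s+((?:pm|um)\d+)\s+([A-Z_]+)\b", line)
        if m and m.group(3) != "ACTIVE":
            rows.append(m.groups())
    return rows
```

For the output in this comment, the first entry returned would be ProcessManager on pm1 in INITIAL, which matches the observation that procmgr never started.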

Jan 9 20:33:00 ip-172-30-0-167 ProcessMonitor[15766]: 00.397406 |0|0|0| D 18 CAL0000:
Jan 9 20:33:00 ip-172-30-0-167 ProcessMonitor[15766]: 00.397421 |0|0|0| D 18 CAL0000: *********Process Monitor Started*********
Jan 9 20:33:00 ip-172-30-0-167 ProcessMonitor[15766]: 00.399774 |0|0|0| D 18 CAL0000: ProcMon: Starting as ACTIVE Parent
Jan 9 20:33:00 ip-172-30-0-167 ProcessMonitor[15766]: 00.399886 |0|0|0| D 18 CAL0000: createDataDirs called
Jan 9 20:33:00 ip-172-30-0-167 ProcessMonitor[15766]: 00.400007 |0|0|0| D 18 CAL0000: Message Thread started ..
Jan 9 20:33:00 ip-172-30-0-167 ProcessMonitor[15766]: 00.412808 |0|0|0| D 18 CAL0000: checkDataMount called
Jan 9 20:33:00 ip-172-30-0-167 ProcessMonitor[15766]: 00.417275 |0|0|0| D 18 CAL0000: statusControlThread Thread started ..
Jan 9 20:33:06 ip-172-30-0-167 ProcessMonitor[15766]: 06.417956 |0|0|0| D 18 CAL0000: StatusUpdate of Process ProcessMonitor State = 1 PID = 15766
Jan 9 20:33:06 ip-172-30-0-167 ProcessMonitor[15766]: 06.420441 |0|0|0| D 18 CAL0000: mysqld Monitoring Thread started ..
Jan 9 20:33:10 ip-172-30-0-167 ProcessMonitor[15766]: 10.757227 |0|0|0| D 18 CAL0000: Process Status shared Memory allocated and Initialized
Jan 9 20:33:10 ip-172-30-0-167 ProcessMonitor[15766]: 10.757444 |0|0|0| D 18 CAL0000: System/Module Status shared Memory allociated and Initialized
Jan 9 20:33:10 ip-172-30-0-167 ProcessMonitor[15766]: 10.757807 |0|0|0| D 18 CAL0000: NIC Status shared Memory allociated and Initialized
Jan 9 20:33:10 ip-172-30-0-167 ProcessMonitor[15766]: 10.758994 |0|0|0| D 18 CAL0000: Ext Device Status shared Memory allociated and Initialized
Jan 9 20:33:10 ip-172-30-0-167 ProcessMonitor[15766]: 10.760420 |0|0|0| D 18 CAL0000: Dbroot Status shared Memory allociated and Initialized
Jan 9 20:33:10 ip-172-30-0-167 ProcessMonitor[15766]: 10.776149 |0|0|0| D 18 CAL0000: statusControlThread Thread reading ProcStatusControl port
Jan 9 20:33:17 ip-172-30-0-167 ProcessMonitor[15766]: 17.260118 |0|0|0| D 18 CAL0000: Child Process Monitoring Thread started ..
Jan 9 20:33:17 ip-172-30-0-167 ProcessMonitor[15766]: 17.261141 |0|0|0| D 18 CAL0000: statusControl: REQUEST RECEIVED: Set Process pm1/ProcessMonitor State = ACTIVE
Jan 9 20:33:17 ip-172-30-0-167 ProcessMonitor[15766]: 17.261244 |0|0|0| D 18 CAL0000: statusControl: Set Process pm1/ProcessMonitor State = ACTIVE PID = 15766
Jan 9 20:33:17 ip-172-30-0-167 ProcessMonitor[15766]: 17.261610 |0|0|0| D 18 CAL0000: processInitComplete Successfully Called

Comment by David Hill (Inactive) [ 2018-01-25 ]

A number of issues were found in the OAM code when a pm1 didn't exist, so this needs to be pushed to a later release.

Comment by David Hill (Inactive) [ 2018-01-29 ]

Look at addressing this issue by changing a working pm to pm1 when pm1 goes bad.
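
A minimal sketch of how a replacement could be chosen is below. This is only an illustration of the idea, not OAM code: the function name is hypothetical, and picking the lowest-numbered working performance module is an assumed policy that the ticket does not specify.

```python
import re


def pick_replacement_pm1(working_modules):
    """Return the working performance module to promote to pm1, or None.

    Assumed policy: choose the lowest-numbered module named pm<N>.
    """
    pms = [m for m in working_modules if re.fullmatch(r"pm\d+", m)]
    if not pms:
        return None
    return min(pms, key=lambda m: int(m[2:]))
```

With pm2 and pm3 still working, as in the original scenario, this policy would select pm2.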

Comment by Todd Stoffel (Inactive) [ 2020-04-15 ]

OAM is being deprecated and replaced by an enhanced API and the MaxScale orchestration project.
