Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
1.4.2
-
None
-
None
-
centos 7 3pm with gluster
-
2020-4, 2020-5, 2020-6, 2020-7
Description
Reported by a customer and reproduced by support (dh).
Successfully installed a 3-PM combo system on CentOS 7 with gluster, configured with 3 copies per DBRoot.
I reproduced the 2 problems the customer reported:
1. PM3 failover testing: when pm3 recovered, the automatic dbroot reassignment failed to move dbroot 3 back to pm3.
2. The manual move was tried and it also failed.
First I took pm3 down to check that failover works.
After about 3-4 minutes, the system reached a good state:
Component    Status                     Last Status Change
------------ -------------------------- ------------------------
System       ACTIVE                     Thu Feb 27 22:47:37 2020
Module pm1   ACTIVE                     Thu Feb 27 22:47:34 2020
Module pm2   ACTIVE                     Thu Feb 27 22:47:35 2020
Module pm3   AUTO_DISABLED/DEGRADED     Thu Feb 27 22:44:46 2020
Active Parent OAM Performance Module is 'pm1'
Primary Front-End MariaDB ColumnStore Module is 'pm2'
MariaDB ColumnStore Replication Feature is enabled
MariaDB ColumnStore Process statuses
Process            Module Status          Last Status Change       Process ID
------------------ ------ --------------- ------------------------ ----------
ProcessMonitor     pm1    ACTIVE          Thu Feb 27 22:39:22 2020 16519
ProcessManager     pm1    ACTIVE          Thu Feb 27 22:39:28 2020 16860
DBRMControllerNode pm1    ACTIVE          Thu Feb 27 22:46:45 2020 29783
ServerMonitor      pm1    ACTIVE          Thu Feb 27 22:40:51 2020 19012
DBRMWorkerNode     pm1    ACTIVE          Thu Feb 27 22:46:47 2020 29858
PrimProc           pm1    ACTIVE          Thu Feb 27 22:46:55 2020 29986
ExeMgr             pm1    ACTIVE          Thu Feb 27 22:47:20 2020 30313
WriteEngineServer  pm1    ACTIVE          Thu Feb 27 22:47:08 2020 30148
DDLProc            pm1    ACTIVE          Thu Feb 27 22:47:28 2020 30476
DMLProc            pm1    ACTIVE          Thu Feb 27 22:47:37 2020 30561
mysqld             pm1    ACTIVE          Thu Feb 27 22:47:34 2020 30758
ProcessMonitor     pm2    ACTIVE          Thu Feb 27 22:40:28 2020 16217
ProcessManager     pm2    HOT_STANDBY     Thu Feb 27 22:42:00 2020 17466
DBRMControllerNode pm2    COLD_STANDBY    Thu Feb 27 22:46:44 2020
ServerMonitor      pm2    ACTIVE          Thu Feb 27 22:40:55 2020 16854
DBRMWorkerNode     pm2    ACTIVE          Thu Feb 27 22:46:51 2020 19436
PrimProc           pm2    ACTIVE          Thu Feb 27 22:46:59 2020 19471
ExeMgr             pm2    ACTIVE          Thu Feb 27 22:47:24 2020 19634
WriteEngineServer  pm2    ACTIVE          Thu Feb 27 22:47:12 2020 19574
DDLProc            pm2    COLD_STANDBY    Thu Feb 27 22:47:30 2020
DMLProc            pm2    COLD_STANDBY    Thu Feb 27 22:47:32 2020
mysqld             pm2    ACTIVE          Thu Feb 27 22:47:35 2020 19849
ProcessMonitor     pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
ProcessManager     pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
DBRMControllerNode pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
ServerMonitor      pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
DBRMWorkerNode     pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
PrimProc           pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
ExeMgr             pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
WriteEngineServer  pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
DDLProc            pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
DMLProc            pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
mysqld             pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
Active Alarm Counts: Critical = 7, Major = 1, Minor = 0, Warning = 0, Info = 0
mcsadmin> getst
getstorageconfig Thu Feb 27 22:48:40 2020
System Storage Configuration
Performance Module (DBRoot) Storage Type = DataRedundancy
System Assigned DBRoot Count = 3
DBRoot IDs assigned to 'pm1' = 1, 3
DBRoot IDs assigned to 'pm2' = 2
DBRoot IDs assigned to 'pm3' =
Data Redundant Configuration
Copies Per DBroot = 3
DBRoot #1 has copies on PMs = 1 2 3
DBRoot #2 has copies on PMs = 1 2 3
DBRoot #3 has copies on PMs = 1 2 3
Brought pm3 back up, but the system failed to bring pm3 back into service:
mcsadmin> getsystemi
getsysteminfo Thu Feb 27 22:54:59 2020
System columnstore-1
System and Module statuses
Component    Status                     Last Status Change
------------ -------------------------- ------------------------
System       DEGRADED                   Thu Feb 27 22:54:40 2020
Module pm1   ACTIVE                     Thu Feb 27 22:54:32 2020
Module pm2   ACTIVE                     Thu Feb 27 22:54:33 2020
Module pm3   MAN_DISABLED               Thu Feb 27 22:53:26 2020
Active Parent OAM Performance Module is 'pm1'
Primary Front-End MariaDB ColumnStore Module is 'pm2'
MariaDB ColumnStore Replication Feature is enabled
MariaDB ColumnStore Process statuses
Process            Module Status          Last Status Change       Process ID
------------------ ------ --------------- ------------------------ ----------
ProcessMonitor     pm1    ACTIVE          Thu Feb 27 22:39:22 2020 16519
ProcessManager     pm1    ACTIVE          Thu Feb 27 22:39:28 2020 16860
DBRMControllerNode pm1    ACTIVE          Thu Feb 27 22:53:40 2020 3918
ServerMonitor      pm1    ACTIVE          Thu Feb 27 22:40:51 2020 19012
DBRMWorkerNode     pm1    ACTIVE          Thu Feb 27 22:53:42 2020 3971
PrimProc           pm1    ACTIVE          Thu Feb 27 22:53:50 2020 4117
ExeMgr             pm1    ACTIVE          Thu Feb 27 22:54:15 2020 4464
WriteEngineServer  pm1    ACTIVE          Thu Feb 27 22:54:04 2020 4302
DDLProc            pm1    ACTIVE          Thu Feb 27 22:54:23 2020 4623
DMLProc            pm1    ACTIVE          Thu Feb 27 22:54:29 2020 4721
mysqld             pm1    ACTIVE          Thu Feb 27 22:54:29 2020 4942
ProcessMonitor     pm2    ACTIVE          Thu Feb 27 22:40:28 2020 16217
ProcessManager     pm2    HOT_STANDBY     Thu Feb 27 22:42:00 2020 17466
DBRMControllerNode pm2    COLD_STANDBY    Thu Feb 27 22:53:39 2020
ServerMonitor      pm2    ACTIVE          Thu Feb 27 22:40:55 2020 16854
DBRMWorkerNode     pm2    ACTIVE          Thu Feb 27 22:53:46 2020 21398
PrimProc           pm2    ACTIVE          Thu Feb 27 22:53:54 2020 21451
ExeMgr             pm2    ACTIVE          Thu Feb 27 22:54:19 2020 21588
WriteEngineServer  pm2    ACTIVE          Thu Feb 27 22:54:08 2020 21529
DDLProc            pm2    COLD_STANDBY    Thu Feb 27 22:54:25 2020
DMLProc            pm2    COLD_STANDBY    Thu Feb 27 22:54:27 2020
mysqld             pm2    ACTIVE          Thu Feb 27 22:54:30 2020 21817
ProcessMonitor     pm3    ACTIVE          Thu Feb 27 22:51:38 2020 4492
ProcessManager     pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
DBRMControllerNode pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
ServerMonitor      pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
DBRMWorkerNode     pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
PrimProc           pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
ExeMgr             pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
WriteEngineServer  pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
DDLProc            pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
DMLProc            pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
mysqld             pm3    AUTO_OFFLINE    Thu Feb 27 22:44:46 2020
Active Alarm Counts: Critical = 7, Major = 0, Minor = 0, Warning = 0, Info = 0
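The listing above shows ProcessMonitor on pm3 as ACTIVE while every other pm3 process is still AUTO_OFFLINE. For triage, that view can be pulled out of the text automatically. The sketch below is illustrative only: `statuses_by_module` is a hypothetical helper, not part of mcsadmin, and it assumes the getsysteminfo output has been captured as text with whitespace-separated columns as shown here.

```python
from collections import defaultdict


def statuses_by_module(text):
    """Group process names by (module, status) from getsysteminfo-style rows.

    Assumption: a process row is "<name> <module> <status>" followed by the
    five date tokens and an optional PID, and module names start with "pm".
    """
    grouped = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Skip the "Module pmN <status> ..." summary rows and any other text.
        if len(fields) >= 8 and fields[0] != "Module" and fields[1].startswith("pm"):
            name, module, status = fields[:3]
            grouped[(module, status)].append(name)
    return dict(grouped)


sample = """\
Module pm3 MAN_DISABLED Thu Feb 27 22:53:26 2020
ProcessMonitor pm3 ACTIVE Thu Feb 27 22:51:38 2020 4492
ProcessManager pm3 AUTO_OFFLINE Thu Feb 27 22:44:46 2020
mysqld pm3 AUTO_OFFLINE Thu Feb 27 22:44:46 2020
"""

for (module, status), names in sorted(statuses_by_module(sample).items()):
    print(module, status, ", ".join(names))
```

Run against the full listing, the same function would group the pm1 and pm2 processes as well, which makes a stuck module easy to spot.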
mcsadmin> getst
getstorageconfig Thu Feb 27 22:55:07 2020
System Storage Configuration
Performance Module (DBRoot) Storage Type = DataRedundancy
System Assigned DBRoot Count = 3
DBRoot IDs assigned to 'pm1' = 1, 3
DBRoot IDs assigned to 'pm2' = 2
DBRoot IDs assigned to 'pm3' =
Data Redundant Configuration
Copies Per DBroot = 3
DBRoot #1 has copies on PMs = 1 2 3
DBRoot #2 has copies on PMs = 1 2 3
DBRoot #3 has copies on PMs = 1 2 3
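The symptom in the storage configuration above is the empty DBRoot list for pm3. A minimal sketch for detecting that condition from captured getstorageconfig text follows; the function names `dbroot_assignments` and `unassigned_pms` are hypothetical and the line format is assumed to match the output shown in this ticket.

```python
import re

# Matches lines such as: DBRoot IDs assigned to 'pm1' = 1, 3
ASSIGNED = re.compile(r"DBRoot IDs assigned to '(\w+)'\s*=\s*(.*)")


def dbroot_assignments(text):
    """Map each PM name to the list of DBRoot IDs assigned to it."""
    assignments = {}
    for line in text.splitlines():
        match = ASSIGNED.search(line)
        if match:
            module, ids = match.groups()
            assignments[module] = [int(i) for i in re.findall(r"\d+", ids)]
    return assignments


def unassigned_pms(text):
    """Return the PMs that currently own no DBRoot."""
    return sorted(pm for pm, ids in dbroot_assignments(text).items() if not ids)


sample = """\
DBRoot IDs assigned to 'pm1' = 1, 3
DBRoot IDs assigned to 'pm2' = 2
DBRoot IDs assigned to 'pm3' =
DBRoot #3 has copies on PMs = 1 2 3
"""

print(dbroot_assignments(sample))
print(unassigned_pms(sample))
```

A PM reported by `unassigned_pms` after it has rejoined the system would indicate that the dbroot was not moved back.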
Then I tried enabling the module and moving the dbroot manually, and got the same error you did:
mcsadmin> stopsystem
stopsystem Thu Feb 27 22:56:09 2020
This command stops the processing of applications on all Modules within the MariaDB ColumnStore System
Checking for active transactions
Do you want to proceed: (y or n) [n]: y
System being stopped now...
Successful stop of System
NOTE: These module(s) are DISABLED: pm3
mcsadmin> altersystem-enablemodule pm3
altersystem-enablemodule Thu Feb 27 22:56:40 2020
This command starts the processing of applications on a Module within the MariaDB ColumnStore System
Do you want to proceed: (y or n) [n]: y
Enabling Modules
Successful enable of Modules
Performance Module(s) Enabled, run movePmDbrootConfig or assignDbrootPmConfig to assign dbroots, if needed
mcsadmin> movePmDbrootConfig pm1 1 pm3
movepmdbrootconfig Thu Feb 27 22:57:02 2020
- movePmDbrootConfig Failed : Can't move dbroot #1
mcsadmin> movePmDbrootConfig pm1 3 pm3
movepmdbrootconfig Thu Feb 27 22:57:06 2020
DBRoot IDs currently assigned to 'pm1' = 1, 3
DBRoot IDs currently assigned to 'pm3' =
DBroot IDs being moved, please wait...
- glusterctl API exception: API Failure return in GLUSTER_ASSIGN API
FAILURE: Error assigning gluster dbroot# 3 to pm3
- glusterctl API exception: API Failure return in GLUSTER_ASSIGN API
- glusterctl API exception: API Failure return in GLUSTER_ASSIGN API
FAILURE: Error reassigning gluster dbroot# 3 to pm1
- glusterctl API exception: API Failure return in GLUSTER_ASSIGN API
- manualMovePmDbroot Failed : API Failure
mcsadmin>
- manualMovePmDbroot Failed : API Failure