Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 1.1.5
Fix Version/s: None
Sprint: 2018-15, 2018-16, 2018-17, 2018-18, 2018-19, 2018-20, 2018-21, 2019-01
Description
Example scenario, similar to the issue observed by a customer:
4 PM / 1 UM system with data redundancy.
PM2 fails and dbroot2 is moved to PM3:
mcsadmin getstorageconfig
getstorageconfig Fri Jul 27 09:31:33 2018
System Storage Configuration
Performance Module (DBRoot) Storage Type = DataRedundancy
System Assigned DBRoot Count = 4
DBRoot IDs assigned to 'pm1' = 1
DBRoot IDs assigned to 'pm2' =
DBRoot IDs assigned to 'pm3' = 2, 3
DBRoot IDs assigned to 'pm4' = 4
PM2 then reconnects to PM1, and during recovery, when dbroot2 is moved back from PM3 to PM2, the mount of the gluster dbroot2 volume onto data2 on PM2 fails in some way. This leaves dbroot2 mounted on neither PM3 nor PM2 when DBRM attempts a reload/resume, while the system still expects it to be mounted on PM3.
The fix needs to change the recovery procedure so that, whenever a step fails, the dbroot is always left mounted somewhere; see the sketch below.
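A minimal sketch of that ordering, assuming glusterfs volumes named dbrootN and the default /usr/local/mariadb/columnstore layout; the hostnames, paths, and volume names are illustrative assumptions, not the actual ProcMgr implementation:

# unmount the failed-over dbroot2 volume on PM3
ssh pm3 umount /usr/local/mariadb/columnstore/data2
# try to mount it back on the recovered PM2
if ! ssh pm2 mount -t glusterfs pm2:/dbroot2 /usr/local/mariadb/columnstore/data2
then
    # roll back: remount on PM3 so dbroot2 is never left unmounted
    # when DBRM reloads/resumes
    ssh pm3 mount -t glusterfs pm3:/dbroot2 /usr/local/mariadb/columnstore/data2
fi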
A simple way to reproduce the behavior is to disconnect PM2 and break the file permissions on the glusterfs mount for data2 so that the mount fails during recovery:
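A sketch of those steps, assuming console access to PM2 and the default install path (both are assumptions):

# on PM2's console, after disconnecting it from the network so that
# dbroot2 fails over to PM3:
chmod 000 /usr/local/mariadb/columnstore/data2   # break the data2 mount point
# reconnect PM2: the failback mount of dbroot2 on PM2 now fails and,
# before the fix, dbroot2 is left unmounted on both PM2 and PM3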