  1. MariaDB ColumnStore
  2. MCOL-1138

pm1 failover testing - didn't leave a HOT_STANDBY ProcMgr on remaining node

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 1.1.2
    • 1.1.3
    • ?
    • None
    • non-root amazon ami with EBS 3pm combo system
    • 2018-02

    Description

      Started with pm1 as the active/master node after install. Stopped the pm1 instance; pm3 took over as master, but the pm2 ProcMgr didn't go to HOT_STANDBY (it stayed in COLD_STANDBY).

      [mariadb-user@ip-172-30-0-204 ~]$ ma getsystemi
      getsysteminfo Fri Jan 5 15:47:17 2018

      System 1.1.2

      System and Module statuses

      Component Status Last Status Change
      ------------ -------------------------- ------------------------
      System ACTIVE Fri Jan 5 15:43:52 2018

      Module pm1 ACTIVE Fri Jan 5 15:43:48 2018
      Module pm2 ACTIVE Fri Jan 5 15:43:44 2018
      Module pm3 ACTIVE Fri Jan 5 15:43:43 2018

      Active Parent OAM Performance Module is 'pm1'
      Primary Front-End MariaDB ColumnStore Module is 'pm1'
      MariaDB ColumnStore Replication Feature is enabled

      MariaDB ColumnStore Process statuses

      Process Module Status Last Status Change Process ID
      ------------------ ------ --------------- ------------------------ ----------
      ProcessMonitor pm1 ACTIVE Fri Jan 5 15:42:23 2018 1283
      ProcessManager pm1 ACTIVE Fri Jan 5 15:42:29 2018 1440
      DBRMControllerNode pm1 ACTIVE Fri Jan 5 15:43:18 2018 2897
      ServerMonitor pm1 ACTIVE Fri Jan 5 15:43:20 2018 2956
      DBRMWorkerNode pm1 ACTIVE Fri Jan 5 15:43:20 2018 2996
      DecomSvr pm1 ACTIVE Fri Jan 5 15:43:24 2018 3159
      PrimProc pm1 ACTIVE Fri Jan 5 15:43:27 2018 3262
      ExeMgr pm1 ACTIVE Fri Jan 5 15:43:37 2018 5003
      WriteEngineServer pm1 ACTIVE Fri Jan 5 15:43:41 2018 5143
      DDLProc pm1 ACTIVE Fri Jan 5 15:43:45 2018 5333
      DMLProc pm1 ACTIVE Fri Jan 5 15:43:49 2018 5494
      mysqld pm1 ACTIVE Fri Jan 5 15:43:41 2018 2696

      ProcessMonitor pm2 ACTIVE Fri Jan 5 15:43:07 2018 15334
      ProcessManager pm2 COLD_STANDBY Fri Jan 5 15:43:36 2018
      DBRMControllerNode pm2 COLD_STANDBY Fri Jan 5 15:43:36 2018
      ServerMonitor pm2 ACTIVE Fri Jan 5 15:43:22 2018 15820
      DBRMWorkerNode pm2 ACTIVE Fri Jan 5 15:43:23 2018 15846
      DecomSvr pm2 ACTIVE Fri Jan 5 15:43:26 2018 15877
      PrimProc pm2 ACTIVE Fri Jan 5 15:43:30 2018 15885
      ExeMgr pm2 ACTIVE Fri Jan 5 15:43:39 2018 16794
      WriteEngineServer pm2 ACTIVE Fri Jan 5 15:43:43 2018 16815
      DDLProc pm2 COLD_STANDBY Fri Jan 5 15:43:44 2018
      DMLProc pm2 COLD_STANDBY Fri Jan 5 15:43:44 2018
      mysqld pm2 ACTIVE Fri Jan 5 15:43:45 2018 15694

      ProcessMonitor pm3 ACTIVE Fri Jan 5 15:43:08 2018 14322
      ProcessManager pm3 HOT_STANDBY Fri Jan 5 15:43:12 2018 14457
      DBRMControllerNode pm3 COLD_STANDBY Fri Jan 5 15:43:24 2018
      ServerMonitor pm3 ACTIVE Fri Jan 5 15:43:27 2018 14823
      DBRMWorkerNode pm3 ACTIVE Fri Jan 5 15:43:28 2018 14868
      DecomSvr pm3 ACTIVE Fri Jan 5 15:43:31 2018 14882
      PrimProc pm3 ACTIVE Fri Jan 5 15:43:34 2018 14890
      ExeMgr pm3 ACTIVE Fri Jan 5 15:43:39 2018 14969
      WriteEngineServer pm3 ACTIVE Fri Jan 5 15:43:43 2018 14990
      DDLProc pm3 COLD_STANDBY Fri Jan 5 15:43:43 2018
      DMLProc pm3 COLD_STANDBY Fri Jan 5 15:43:43 2018
      mysqld pm3 ACTIVE Fri Jan 5 15:43:26 2018 14698

      Active Alarm Counts: Critical = 0, Major = 0, Minor = 0, Warning = 0, Info = 0
      [mariadb-user@ip-172-30-0-204 ~]$

      STOP PM1

      System 1.1.2

      System and Module statuses

      Component Status Last Status Change
      ------------ -------------------------- ------------------------
      System ACTIVE Thu Jan 4 21:36:54 2018

      Module pm1 AUTO_DISABLED/DEGRADED Thu Jan 4 21:35:01 2018
      Module pm2 ACTIVE Thu Jan 4 21:36:12 2018
      Module pm3 ACTIVE Thu Jan 4 21:35:38 2018

      Active Parent OAM Performance Module is 'pm3'
      Primary Front-End MariaDB ColumnStore Module is 'pm3'
      MariaDB ColumnStore Replication Feature is enabled

      MariaDB ColumnStore Process statuses

      Process Module Status Last Status Change Process ID
      ------------------ ------ --------------- ------------------------ ----------
      ProcessMonitor pm1 AUTO_OFFLINE Thu Jan 4 21:35:51 2018
      ProcessManager pm1 AUTO_OFFLINE Thu Jan 4 21:35:51 2018
      DBRMControllerNode pm1 AUTO_OFFLINE Thu Jan 4 21:35:51 2018
      ServerMonitor pm1 AUTO_OFFLINE Thu Jan 4 21:35:51 2018
      DBRMWorkerNode pm1 AUTO_OFFLINE Thu Jan 4 21:35:51 2018
      DecomSvr pm1 AUTO_OFFLINE Thu Jan 4 21:35:51 2018
      PrimProc pm1 AUTO_OFFLINE Thu Jan 4 21:35:51 2018
      ExeMgr pm1 AUTO_OFFLINE Thu Jan 4 21:35:51 2018
      WriteEngineServer pm1 AUTO_OFFLINE Thu Jan 4 21:35:51 2018
      DDLProc pm1 AUTO_OFFLINE Thu Jan 4 21:35:51 2018
      DMLProc pm1 AUTO_OFFLINE Thu Jan 4 21:35:51 2018
      mysqld pm1 AUTO_OFFLINE Thu Jan 4 21:35:51 2018

      ProcessMonitor pm2 ACTIVE Thu Jan 4 21:19:18 2018 3458
      ProcessManager pm2 COLD_STANDBY Thu Jan 4 21:36:12 2018
      DBRMControllerNode pm2 COLD_STANDBY Thu Jan 4 21:36:12 2018
      ServerMonitor pm2 ACTIVE Thu Jan 4 21:19:33 2018 3951
      DBRMWorkerNode pm2 ACTIVE Thu Jan 4 21:19:34 2018 3963
      DecomSvr pm2 ACTIVE Thu Jan 4 21:19:37 2018 3995
      PrimProc pm2 ACTIVE Thu Jan 4 21:19:40 2018 4003
      ExeMgr pm2 ACTIVE Thu Jan 4 21:19:49 2018 4914
      WriteEngineServer pm2 ACTIVE Thu Jan 4 21:19:53 2018 4935
      DDLProc pm2 COLD_STANDBY Thu Jan 4 21:36:12 2018
      DMLProc pm2 COLD_STANDBY Thu Jan 4 21:36:12 2018
      mysqld pm2 ACTIVE Thu Jan 4 21:36:14 2018 3825

      ProcessMonitor pm3 ACTIVE Thu Jan 4 21:19:19 2018 3457
      ProcessManager pm3 ACTIVE Thu Jan 4 21:36:38 2018 3599
      DBRMControllerNode pm3 ACTIVE Thu Jan 4 21:35:15 2018 7013
      ServerMonitor pm3 ACTIVE Thu Jan 4 21:35:17 2018 7029
      DBRMWorkerNode pm3 ACTIVE Thu Jan 4 21:35:17 2018 7050
      DecomSvr pm3 ACTIVE Thu Jan 4 21:35:21 2018 7088
      PrimProc pm3 ACTIVE Thu Jan 4 21:35:23 2018 7106
      ExeMgr pm3 ACTIVE Thu Jan 4 21:35:27 2018 7177
      WriteEngineServer pm3 ACTIVE Thu Jan 4 21:35:31 2018 7209
      DDLProc pm3 ACTIVE Thu Jan 4 21:35:35 2018 7257
      DMLProc pm3 ACTIVE Thu Jan 4 21:36:54 2018 7320
      mysqld pm3 ACTIVE Thu Jan 4 21:36:26 2018 6868

      Active Alarm Counts: Critical = 3, Major = 1, Minor = 0, Warning = 0, Info = 0
      mcsadmin> getstorage
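
      A check like the one described above (is any ProcessManager left in HOT_STANDBY after failover?) could be scripted against saved getsysteminfo output. The following is only a minimal sketch: the helper names `procmgr_states` and `has_hot_standby` are hypothetical, and the row layout is assumed to match the process status lines pasted in this issue.

```python
import re

# Matches process status rows such as
# "ProcessManager pm3 HOT_STANDBY Fri Jan 5 15:43:12 2018 14457".
# Assumption: module names look like pm<N> and statuses are upper-case words.
ROW = re.compile(r"^\s*(?P<process>\w+)\s+(?P<module>pm\d+)\s+(?P<status>[A-Z_]+)\s")


def procmgr_states(text):
    """Return {module: status} for every ProcessManager row in the text."""
    states = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m and m.group("process") == "ProcessManager":
            states[m.group("module")] = m.group("status")
    return states


def has_hot_standby(text):
    """True if at least one ProcessManager row reports HOT_STANDBY."""
    return "HOT_STANDBY" in procmgr_states(text).values()


# ProcessManager rows copied from the post-failover output in this issue.
AFTER_FAILOVER = """\
ProcessManager pm1 AUTO_OFFLINE Thu Jan 4 21:35:51 2018
ProcessManager pm2 COLD_STANDBY Thu Jan 4 21:36:12 2018
ProcessManager pm3 ACTIVE Thu Jan 4 21:36:38 2018 3599
"""

print(procmgr_states(AFTER_FAILOVER))
print(has_hot_standby(AFTER_FAILOVER))
```

      Run against the post-failover rows, the sketch reports no HOT_STANDBY ProcessManager, which is the condition this issue describes.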

          People

            dleeyh Daniel Lee (Inactive)
            hill David Hill (Inactive)