  1. MariaDB ColumnStore
  2. MCOL-1466 handling multi server columnstore failover
  3. MCOL-1572

About our ParentOAM failure handling issue with AmazonAMI

Details

    • Type: Sub-Task
    • Status: Closed
    • Priority: Critical
    • Resolution: Won't Do

    Description

      Hi David,
      As per your suggestion, we have launched all our Amazon instances from the "MariaDB-ColumnStore-1.1.5 - ami-a0c09edf" AMI. We have also added a separate ext2 volume for each PM module. We run a multi-server ColumnStore system (1 UM, 3 PM). When we first configured the system, everything seemed fine. We then tested what happens when PM1, which is our parentOAM, fails (instance stopped).
      We found that the system moved the parentOAM role to another PM and PM1 became disabled, but its dbroot was not moved. We also noticed that the database became read-only: SELECT still works, but CREATE TABLE, UPDATE, INSERT and DELETE have stopped working. Why?
      Please also help us with the questions below.
      1 > What is the expected system behaviour when the parentOAM fails?
      2 > Could you check the attached "post-configure-steps-followed.txt" to confirm that we followed the proper steps to configure the system?
      3 > Which EBS volume type (gp2, io1, sc1, st1, standard) is best suited to large amounts of data? Some of our tables have more than 50 million records.
      We have attached a columnstoreSupport report to this issue.
      The current system status is as follows.
      Component     Status                      Last Status Change
      ------------  --------------------------  ------------------------
      System        ACTIVE                      Wed Jul 11 10:08:23 2018
      Module um1    ACTIVE                      Wed Jul 11 09:45:53 2018
      Module pm1    AUTO_DISABLED/DEGRADED      Wed Jul 11 09:52:24 2018
      Module pm2    DEGRADED                    Wed Jul 11 09:58:53 2018
      Module pm3    ACTIVE                      Wed Jul 11 09:45:43 2018

      Active Parent OAM Performance Module is 'pm2'
      MariaDB ColumnStore Replication Feature is enabled

      MariaDB ColumnStore Process statuses

      Process             Module  Status           Last Status Change        Process ID
      ------------------  ------  ---------------  ------------------------  ----------
      ProcessMonitor      um1     ACTIVE           Wed Jul 11 09:45:09 2018  15729
      ServerMonitor       um1     ACTIVE           Wed Jul 11 09:45:28 2018  16138
      DBRMWorkerNode      um1     MAN_OFFLINE      Wed Jul 11 09:53:18 2018
      ExeMgr              um1     ACTIVE           Wed Jul 11 09:54:11 2018  20972
      DDLProc             um1     MAN_OFFLINE      Wed Jul 11 09:54:30 2018
      DMLProc             um1     MAN_OFFLINE      Wed Jul 11 09:54:42 2018
      mysqld              um1     ACTIVE           Wed Jul 11 09:54:18 2018  21247

      ProcessMonitor      pm1     AUTO_OFFLINE     Wed Jul 11 09:52:34 2018
      ProcessManager      pm1     AUTO_OFFLINE     Wed Jul 11 09:52:34 2018
      DBRMControllerNode  pm1     AUTO_OFFLINE     Wed Jul 11 09:52:34 2018
      ServerMonitor       pm1     AUTO_OFFLINE     Wed Jul 11 09:52:34 2018
      DBRMWorkerNode      pm1     AUTO_OFFLINE     Wed Jul 11 09:52:34 2018
      DecomSvr            pm1     AUTO_OFFLINE     Wed Jul 11 09:52:34 2018
      PrimProc            pm1     AUTO_OFFLINE     Wed Jul 11 09:52:34 2018
      WriteEngineServer   pm1     AUTO_OFFLINE     Wed Jul 11 09:52:34 2018

      ProcessMonitor      pm2     ACTIVE           Wed Jul 11 09:45:11 2018  15035
      ProcessManager      pm2     ACTIVE           Wed Jul 11 09:52:58 2018  15173
      DBRMControllerNode  pm2     AUTO_OFFLINE     Wed Jul 11 09:58:53 2018
      ServerMonitor       pm2     ACTIVE           Wed Jul 11 09:52:44 2018  16531
      DBRMWorkerNode      pm2     ACTIVE           Wed Jul 11 09:53:27 2018  17060
      DecomSvr            pm2     ACTIVE           Wed Jul 11 09:52:48 2018  16616
      PrimProc            pm2     ACTIVE           Wed Jul 11 09:54:06 2018  17493
      WriteEngineServer   pm2     ACTIVE           Wed Jul 11 09:54:20 2018  17726

      ProcessMonitor      pm3     ACTIVE           Wed Jul 11 09:45:12 2018  29271
      ProcessManager      pm3     HOT_STANDBY      Wed Jul 11 09:54:47 2018  30244
      DBRMControllerNode  pm3     COLD_STANDBY     Wed Jul 11 09:52:50 2018
      ServerMonitor       pm3     ACTIVE           Wed Jul 11 09:45:37 2018  29528
      DBRMWorkerNode      pm3     MAN_OFFLINE      Wed Jul 11 09:53:57 2018
      DecomSvr            pm3     ACTIVE           Wed Jul 11 09:45:41 2018  29570
      PrimProc            pm3     ACTIVE           Wed Jul 11 09:54:07 2018  30144
      WriteEngineServer   pm3     ACTIVE           Wed Jul 11 09:54:21 2018  30211

      Active Alarm Counts: Critical = 2, Major = 3, Minor = 4, Warning = 0, Info = 0
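
The process listing above is long, so it may help to filter it down to the processes that are not running. The following is a minimal Python sketch that does this. It is not part of ColumnStore: the function names `parse_process_rows` and `offline_by_module` are hypothetical, the sample rows are copied from the listing above, and treating `MAN_OFFLINE` and `AUTO_OFFLINE` as "not running" is an assumption. The parser expects process rows only, not the "Module ..." rows of the system status table.

```python
from collections import defaultdict

# A few process rows copied from the status listing in this report.
SAMPLE = """\
ExeMgr um1 ACTIVE Wed Jul 11 09:54:11 2018 20972
DDLProc um1 MAN_OFFLINE Wed Jul 11 09:54:30 2018
DMLProc um1 MAN_OFFLINE Wed Jul 11 09:54:42 2018
DBRMControllerNode pm2 AUTO_OFFLINE Wed Jul 11 09:58:53 2018
ProcessManager pm3 HOT_STANDBY Wed Jul 11 09:54:47 2018 30244
PrimProc pm3 ACTIVE Wed Jul 11 09:54:07 2018 30144
"""

# Assumption: these two states mean the process is not running.
OFFLINE_STATES = {"MAN_OFFLINE", "AUTO_OFFLINE"}


def parse_process_rows(text):
    """Parse rows of the form: name module status <5-token timestamp> [pid]."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # A row needs name, module, status and a timestamp ending in a year.
        # This also skips the column header and the dashed separator line.
        if len(parts) < 8 or not parts[7].isdigit():
            continue
        rows.append({
            "process": parts[0],
            "module": parts[1],
            "status": parts[2],
            "changed": " ".join(parts[3:8]),
            # The process ID column is empty for processes that are down.
            "pid": int(parts[8]) if len(parts) > 8 else None,
        })
    return rows


def offline_by_module(rows):
    """Group the names of offline processes by the module they belong to."""
    result = defaultdict(list)
    for row in rows:
        if row["status"] in OFFLINE_STATES:
            result[row["module"]].append(row["process"])
    return dict(result)


if __name__ == "__main__":
    offline = offline_by_module(parse_process_rows(SAMPLE))
    for module in sorted(offline):
        print(module, ", ".join(offline[module]))
```

On the sample rows this picks out DDLProc and DMLProc on um1 and DBRMControllerNode on pm2. DDLProc and DMLProc being offline would be consistent with write statements failing while SELECT still works, although this sketch does not establish the cause.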

            People

              Assignee: Unassigned
              Reporter: shashank9898 Developer