  MariaDB ColumnStore / MCOL-3660

DBAAS: ColumnStore SingleNode system is out of service (not read/write capable), but its mcs system status remains ACTIVE and Pod recovery is not initiated


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Won't Do
    • Affects Version/s: 1.4.1
    • Fix Version/s: Icebox
    • Component/s: ExeMgr, PrimProc
    • GKE
      columnstore.image=mariadb/enterprise-columnstore:1.4.1-1

    Description

      DBAAS: ColumnStore SingleNode system is out of service (not read/write capable), but its mcs system status remains ACTIVE and Pod recovery is not initiated

      How to repeat:
      1. Spin up a Kubernetes SingleNode ColumnStore topology.
      2. Continuously kill PrimProc from inside the PM's pod container.
      ColumnStore goes out of service, but the MCS system status remains ACTIVE and PM Pod recovery is not triggered:

      MariaDB [(none)]> select count(*) from  tpcds_100.web_site ;
      ERROR 1815 (HY000): Internal error: IDB-2004: Cannot connect to ExeMgr.
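
Step 2 of "How to repeat" can be scripted. The following is only a minimal sketch, not the command the reporter used: it assumes `pkill` (from procps) is available inside the PM container, and the function name `kill_repeatedly`, the round count, and the interval are hypothetical choices made for illustration.

```python
import subprocess
import time


def kill_repeatedly(rounds, interval_s=1.0, run=subprocess.run):
    """Send SIGKILL to every process named exactly 'PrimProc', `rounds` times.

    Returns the number of rounds in which pkill matched at least one
    process (pkill exits with status 0 when something was signalled).
    `run` is injectable so the loop can be exercised without killing anything.
    """
    hits = 0
    for i in range(rounds):
        # Assumption: pkill exists in the container; -x requires an exact name match.
        result = run(["pkill", "-KILL", "-x", "PrimProc"])
        if result.returncode == 0:
            hits += 1
        if i < rounds - 1:
            time.sleep(interval_s)
    return hits


if __name__ == "__main__":
    # Hypothetical values: 30 rounds, one second apart.
    print(kill_repeatedly(30, 1.0))
```

Killing the process again each time it is restarted is what eventually leaves PrimProc in the AUTO_OFFLINE state described in this report.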
      

      1. Spin up a Kubernetes SingleNode ColumnStore topology and check that mcs is operational:

      [root@expmcsrcc001-mdb-cs-single-0 /]# mcsadmin getsystemi
      getsysteminfo   Tue Dec 10 15:42:38 2019
       
      System columnstore-1
       
      System and Module statuses
       
      Component     Status                       Last Status Change
      ------------  --------------------------   ------------------------
      System        ACTIVE                       Tue Dec 10 14:20:35 2019
       
      Module pm1    ACTIVE                       Tue Dec 10 14:20:33 2019
       
       
      MariaDB ColumnStore Process statuses
       
      Process             Module    Status            Last Status Change        Process ID
      ------------------  ------    ---------------   ------------------------  ----------
      ProcessMonitor      pm1       ACTIVE            Tue Dec 10 14:19:41 2019          94
      ProcessManager      pm1       ACTIVE            Tue Dec 10 14:19:48 2019         214
      StorageManager      pm1       ACTIVE            Tue Dec 10 14:19:54 2019         720
      DBRMControllerNode  pm1       ACTIVE            Tue Dec 10 14:20:13 2019         850
      ServerMonitor       pm1       ACTIVE            Tue Dec 10 14:20:14 2019         870
      DBRMWorkerNode      pm1       ACTIVE            Tue Dec 10 14:20:15 2019         890
      PrimProc            pm1       ACTIVE            Tue Dec 10 14:20:19 2019         974
      ExeMgr              pm1       ACTIVE            Tue Dec 10 14:20:23 2019        1093
      WriteEngineServer   pm1       ACTIVE            Tue Dec 10 14:20:27 2019        1182
      DDLProc             pm1       ACTIVE            Tue Dec 10 14:20:31 2019        1230
      DMLProc             pm1       ACTIVE            Tue Dec 10 14:20:35 2019        1285
      mysqld              pm1       ACTIVE            Tue Dec 10 14:19:53 2019         523
       
      Active Alarm Counts: Critical = 0, Major = 0, Minor = 0, Warning = 0, Info = 0
      

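      2. After PrimProc has been killed repeatedly, check the system status again. PrimProc is AUTO_OFFLINE and ExeMgr is MAN_OFFLINE, yet System and Module pm1 still report ACTIVE: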

       mcsadmin getsystemi
      getsysteminfo   Tue Dec 10 15:46:41 2019
       
      System columnstore-1
       
      System and Module statuses
       
      Component     Status                       Last Status Change
      ------------  --------------------------   ------------------------
      System        ACTIVE                       Tue Dec 10 15:46:30 2019
       
      Module pm1    ACTIVE                       Tue Dec 10 15:46:07 2019
       
       
      MariaDB ColumnStore Process statuses
       
      Process             Module    Status            Last Status Change        Process ID
      ------------------  ------    ---------------   ------------------------  ----------
      ProcessMonitor      pm1       ACTIVE            Tue Dec 10 14:19:41 2019          94
      ProcessManager      pm1       ACTIVE            Tue Dec 10 14:19:48 2019         214
      StorageManager      pm1       ACTIVE            Tue Dec 10 14:19:54 2019         720
      DBRMControllerNode  pm1       ACTIVE            Tue Dec 10 14:20:13 2019         850
      ServerMonitor       pm1       ACTIVE            Tue Dec 10 14:20:14 2019         870
      DBRMWorkerNode      pm1       ACTIVE            Tue Dec 10 14:20:15 2019         890
      PrimProc            pm1       AUTO_OFFLINE      Tue Dec 10 15:46:04 2019
      ExeMgr              pm1       MAN_OFFLINE       Tue Dec 10 15:46:27 2019
      WriteEngineServer   pm1       ACTIVE            Tue Dec 10 14:20:27 2019        1182
      DDLProc             pm1       ACTIVE            Tue Dec 10 14:20:31 2019        1230
      DMLProc             pm1       ACTIVE            Tue Dec 10 14:20:35 2019        1285
      mysqld              pm1       ACTIVE            Tue Dec 10 14:19:53 2019         523
      
      

      The ColumnStore system is out of service:

      MariaDB [(none)]> select count(*) from  tpcds_100.web_site ;
      ERROR 1815 (HY000): Internal error: IDB-2004: Cannot connect to ExeMgr.
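
The mismatch reported here (System ACTIVE while PrimProc and ExeMgr are offline) can be detected by looking at the per-process rows rather than the System row. The sketch below is an illustration only, not a fix shipped with ColumnStore: it assumes the `mcsadmin getsysteminfo` text layout shown in this report, and the names `parse_status` and `offline_processes` are hypothetical.

```python
import re

# Matches process rows such as "PrimProc   pm1   AUTO_OFFLINE   Tue Dec 10 ...".
ROW = re.compile(r"^\s*(?P<process>\w+)\s+(?P<module>[a-z]+\d+)\s+(?P<status>[A-Z_]+)\b")
# Matches the system row "System   ACTIVE   ...", but not "System columnstore-1".
SYSTEM = re.compile(r"^\s*System\s+([A-Z_]+)\s")

# Excerpt of the second status listing in this report.
SAMPLE = """\
System columnstore-1

System and Module statuses

Component     Status                       Last Status Change
System        ACTIVE                       Tue Dec 10 15:46:30 2019
Module pm1    ACTIVE                       Tue Dec 10 15:46:07 2019

MariaDB ColumnStore Process statuses

Process             Module    Status            Last Status Change        Process ID
ProcessMonitor      pm1       ACTIVE            Tue Dec 10 14:19:41 2019          94
PrimProc            pm1       AUTO_OFFLINE      Tue Dec 10 15:46:04 2019
ExeMgr              pm1       MAN_OFFLINE       Tue Dec 10 15:46:27 2019
DMLProc             pm1       ACTIVE            Tue Dec 10 14:20:35 2019        1285
"""


def parse_status(text):
    """Return (system_status, {process_name: status}) from getsysteminfo text."""
    system_status = None
    processes = {}
    in_process_table = False
    for line in text.splitlines():
        m = SYSTEM.match(line)
        if m:
            system_status = m.group(1)
        if "Process statuses" in line:
            in_process_table = True  # rows before this heading are not processes
            continue
        if in_process_table:
            row = ROW.match(line)
            if row:
                processes[row.group("process")] = row.group("status")
    return system_status, processes


def offline_processes(processes):
    """Names of all processes whose status is anything other than ACTIVE."""
    return sorted(name for name, status in processes.items() if status != "ACTIVE")


if __name__ == "__main__":
    system, procs = parse_status(SAMPLE)
    print(system, offline_processes(procs))
```

A check of this kind could back a Kubernetes liveness probe, so that the Pod is restarted when a required process stays offline even though the System row still says ACTIVE; whether that is appropriate for a given deployment is outside what this report establishes.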
      

          People

            Assignee: Unassigned
            Reporter: winstone Zdravelina Sokolovska (Inactive)