MariaDB ColumnStore / MCOL-3660

DBAAS: ColumnStore SingleNode system is out of service (not read/write capable), but its mcs system status remains ACTIVE and Pod recovery is not initiated

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Won't Do
    • 1.4.1
    • Icebox
    • ExeMgr, PrimProc
    • GKE
      columnstore.image=mariadb/enterprise-columnstore:1.4.1-1

    Description

      DBAAS: ColumnStore SingleNode system is out of service (not read/write capable), but its mcs system status remains ACTIVE and Pod recovery is not initiated

      How to repeat:
      1. Spin up a Kubernetes SingleNode ColumnStore topology.
      2. Start continuously killing PrimProc from inside the PM's pod container.
      ColumnStore goes out of service, but the MCS system status remains ACTIVE and PM Pod recovery is not triggered:

      MariaDB [(none)]> select count(*) from  tpcds_100.web_site ;
      ERROR 1815 (HY000): Internal error: IDB-2004: Cannot connect to ExeMgr.
      

      1. Spin up a Kubernetes SingleNode ColumnStore topology and check that mcs is operational:

      [root@expmcsrcc001-mdb-cs-single-0 /]# mcsadmin getsystemi
      getsysteminfo   Tue Dec 10 15:42:38 2019
       
      System columnstore-1
       
      System and Module statuses
       
      Component     Status                       Last Status Change
      ------------  --------------------------   ------------------------
      System        ACTIVE                       Tue Dec 10 14:20:35 2019
       
      Module pm1    ACTIVE                       Tue Dec 10 14:20:33 2019
       
       
      MariaDB ColumnStore Process statuses
       
      Process             Module    Status            Last Status Change        Process ID
      ------------------  ------    ---------------   ------------------------  ----------
      ProcessMonitor      pm1       ACTIVE            Tue Dec 10 14:19:41 2019          94
      ProcessManager      pm1       ACTIVE            Tue Dec 10 14:19:48 2019         214
      StorageManager      pm1       ACTIVE            Tue Dec 10 14:19:54 2019         720
      DBRMControllerNode  pm1       ACTIVE            Tue Dec 10 14:20:13 2019         850
      ServerMonitor       pm1       ACTIVE            Tue Dec 10 14:20:14 2019         870
      DBRMWorkerNode      pm1       ACTIVE            Tue Dec 10 14:20:15 2019         890
      PrimProc            pm1       ACTIVE            Tue Dec 10 14:20:19 2019         974
      ExeMgr              pm1       ACTIVE            Tue Dec 10 14:20:23 2019        1093
      WriteEngineServer   pm1       ACTIVE            Tue Dec 10 14:20:27 2019        1182
      DDLProc             pm1       ACTIVE            Tue Dec 10 14:20:31 2019        1230
      DMLProc             pm1       ACTIVE            Tue Dec 10 14:20:35 2019        1285
      mysqld              pm1       ACTIVE            Tue Dec 10 14:19:53 2019         523
       
      Active Alarm Counts: Critical = 0, Major = 0, Minor = 0, Warning = 0, Info = 0
      

      2. Continuously kill PrimProc from inside the PM's pod container, then check the system status again:

       mcsadmin getsystemi
      getsysteminfo   Tue Dec 10 15:46:41 2019
       
      System columnstore-1
       
      System and Module statuses
       
      Component     Status                       Last Status Change
      ------------  --------------------------   ------------------------
      System        ACTIVE                       Tue Dec 10 15:46:30 2019
       
      Module pm1    ACTIVE                       Tue Dec 10 15:46:07 2019
       
       
      MariaDB ColumnStore Process statuses
       
      Process             Module    Status            Last Status Change        Process ID
      ------------------  ------    ---------------   ------------------------  ----------
      ProcessMonitor      pm1       ACTIVE            Tue Dec 10 14:19:41 2019          94
      ProcessManager      pm1       ACTIVE            Tue Dec 10 14:19:48 2019         214
      StorageManager      pm1       ACTIVE            Tue Dec 10 14:19:54 2019         720
      DBRMControllerNode  pm1       ACTIVE            Tue Dec 10 14:20:13 2019         850
      ServerMonitor       pm1       ACTIVE            Tue Dec 10 14:20:14 2019         870
      DBRMWorkerNode      pm1       ACTIVE            Tue Dec 10 14:20:15 2019         890
      PrimProc            pm1       AUTO_OFFLINE      Tue Dec 10 15:46:04 2019
      ExeMgr              pm1       MAN_OFFLINE       Tue Dec 10 15:46:27 2019
      WriteEngineServer   pm1       ACTIVE            Tue Dec 10 14:20:27 2019        1182
      DDLProc             pm1       ACTIVE            Tue Dec 10 14:20:31 2019        1230
      DMLProc             pm1       ACTIVE            Tue Dec 10 14:20:35 2019        1285
      mysqld              pm1       ACTIVE            Tue Dec 10 14:19:53 2019         523
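
The status table above reports System and Module pm1 as ACTIVE even though PrimProc and ExeMgr are offline. As a minimal sketch of how a health check could look at the per-process rows instead of the system-level status, the snippet below parses text in the `getsysteminfo` layout shown in this ticket. The function name `non_active_processes`, the `ROW` pattern, and the assumption that every process on a single-node system should be ACTIVE are illustrative and not part of ColumnStore.

```python
import re

# One row of the "Process statuses" table: process name, module (e.g. pm1), status word.
# Assumption: module names look like pm<N> or um<N>, as in the output in this ticket.
ROW = re.compile(r"^\s*(\S+)\s+((?:pm|um)\d+)\s+([A-Z_]+)\b")


def non_active_processes(report: str) -> dict[str, str]:
    """Return {process: status} for each process row whose status is not ACTIVE."""
    result: dict[str, str] = {}
    in_process_table = False
    for line in report.splitlines():
        # Skip the System/Module section; only rows after this heading are processes.
        if "Process statuses" in line:
            in_process_table = True
            continue
        if not in_process_table:
            continue
        match = ROW.match(line)
        if match and match.group(3) != "ACTIVE":
            result[match.group(1)] = match.group(3)
    return result


# Abbreviated copy of the second status report from this ticket.
SAMPLE = """\
System        ACTIVE                       Tue Dec 10 15:46:30 2019
Module pm1    ACTIVE                       Tue Dec 10 15:46:07 2019
MariaDB ColumnStore Process statuses
Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
ProcessMonitor      pm1       ACTIVE            Tue Dec 10 14:19:41 2019          94
PrimProc            pm1       AUTO_OFFLINE      Tue Dec 10 15:46:04 2019
ExeMgr              pm1       MAN_OFFLINE       Tue Dec 10 15:46:27 2019
mysqld              pm1       ACTIVE            Tue Dec 10 14:19:53 2019         523
"""

if __name__ == "__main__":
    print(non_active_processes(SAMPLE))
```

A probe built on this idea would treat a non-empty result as unhealthy, regardless of what the System row says.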
      
      

      The ColumnStore system is out of service:

      MariaDB [(none)]> select count(*) from  tpcds_100.web_site ;
      ERROR 1815 (HY000): Internal error: IDB-2004: Cannot connect to ExeMgr.
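
Since the failure is only visible when a query is actually run, a check could also classify the client's error output. The following is a rough sketch under that assumption: it recognizes the error line format shown in this ticket (with or without the "at line N" part that appears in the support report log). The names `ClientError`, `parse_client_error`, and `exemgr_unreachable` are hypothetical helpers; how the client output is captured is left out.

```python
import re
from typing import NamedTuple, Optional


class ClientError(NamedTuple):
    code: int       # numeric error code, e.g. 1815
    sqlstate: str   # SQLSTATE, e.g. HY000
    message: str    # remaining text after the colon


# Matches "ERROR 1815 (HY000): ..." and "ERROR 1815 (HY000) at line 4: ...".
ERROR_LINE = re.compile(r"ERROR (\d+) \((\w+)\)(?: at line \d+)?: (.*)")


def parse_client_error(line: str) -> Optional[ClientError]:
    """Parse one line of client output; return None if it is not an error line."""
    match = ERROR_LINE.search(line)
    if match is None:
        return None
    return ClientError(int(match.group(1)), match.group(2), match.group(3).strip())


def exemgr_unreachable(output: str) -> bool:
    """True if any line reports error 1815 with the IDB-2004 message from this ticket."""
    for line in output.splitlines():
        err = parse_client_error(line)
        if err is not None and err.code == 1815 and "IDB-2004" in err.message:
            return True
    return False


if __name__ == "__main__":
    sample = "ERROR 1815 (HY000): Internal error: IDB-2004: Cannot connect to ExeMgr."
    print(parse_client_error(sample))
    print(exemgr_unreachable(sample))
```

Matching on both the error code and the IDB-2004 text keeps the check narrow, so unrelated SQL errors are not mistaken for an ExeMgr outage.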
      

      Attachments

        Activity

          LinuxJedi Andrew Hutchings (Inactive) added a comment - Can you please attach a ColumnStore support report for this?

          winstone Zdravelina Sokolovska (Inactive) added a comment - Attached columnstoreSupportReport.columnstore-1.tar.gz. The output below was logged while generating the ColumnStore support report:

          Get software report data for pm1
          Get config report data for pm1

          Note: This output shows SysV services only and does not include native
                systemd services. SysV configuration data might be overridden by native
                systemd configuration.

                If you want to list systemd services use 'systemctl list-unit-files'.
                To see services enabled on particular target use
                'systemctl list-dependencies [target]'.


          Note: This output shows SysV services only and does not include native
                systemd services. SysV configuration data might be overridden by native
                systemd configuration.

                If you want to list systemd services use 'systemctl list-unit-files'.
                To see services enabled on particular target use
                'systemctl list-dependencies [target]'.

          Get log report data for pm1
          Get log config data for pm1
          Get bulklog report data for pm1
          Get hardware report data for pm1
          Get resource report data for pm1
          Get dbms report data for pm1
          ERROR 1815 (HY000) at line 4: Internal error: IDB-2004: Cannot connect to ExeMgr.
          ERROR 1815 (HY000) at line 1: Internal error: IDB-2004: Cannot connect to ExeMgr.

          Columnstore Support Script Successfully completed, files located in columnstoreSupportReport.columnstore-1.tar.gz

          toddstoffel Todd Stoffel (Inactive) added a comment - This ticket was created prior to convergence with the server and may be obsolete. If you find this issue still exists in a modern version, please open a new ticket.

          People

            Assignee: Unassigned
            Reporter: winstone Zdravelina Sokolovska (Inactive)
            Votes: 0
            Watchers: 4
