[MCOL-3281] MariaDB ColumnStore system reports ACTIVE status even though many ColumnStore processes are not active and SQL returns internal errors Created: 2019-04-24  Updated: 2023-10-26  Resolved: 2023-03-06

Status: Closed
Project: MariaDB ColumnStore
Component/s: DDLProc, DMLProc, ExeMgr, ProcMgr
Affects Version/s: 1.2.4
Fix Version/s: Icebox

Type: Bug
Priority: Major
Reporter: Zdravelina Sokolovska (Inactive)
Assignee: Unassigned
Resolution: Won't Do
Votes: 0
Labels: None
Environment: 1UM-2PMs; CentOS 7



 Description   

The MariaDB ColumnStore system is reported as ACTIVE even though many ColumnStore processes are not active and SQL statements return internal errors.

Expected: mcsadmin getsystemstatus returns ACTIVE for the ColumnStore system only
when all ColumnStore functionality is available, that is, when the expected ColumnStore processes on the ColumnStore modules are active.

How to repeat:
The problem was observed in a 1UM-2PMs ColumnStore installation with local query enabled. The system was functional, and then the UM module was rebooted.
Checking the MariaDB ColumnStore status with the mcsadmin utility returned ACTIVE,
but SQL run from the UM failed: both a simple arithmetic statement and an attempt to create a table with engine columnstore returned errors.

[root@cps ~]#  mcsadmin getsystemstatus
 
WARNING: running on non Parent OAM Module, can't make configuration changes in this session.
         Access Console from 'pm1' if you need to make changes.
 
getsystemstatus   Wed Apr 24 12:40:42 2019
 
System columnstore-1
 
System and Module statuses
 
Component     Status                       Last Status Change
------------  --------------------------   ------------------------
System        ACTIVE                       Wed Apr 24 12:19:19 2019
 
Module um1    ACTIVE                       Wed Apr 24 12:19:19 2019
Module pm1    ACTIVE                       Wed Apr 24 12:19:19 2019
Module pm2    ACTIVE                       Wed Apr 24 12:19:20 2019
 
Active Parent OAM Performance Module is 'pm1'
Local Query Feature is enabled
MariaDB ColumnStore Replication Feature is enabled
MariaDB ColumnStore set for Distributed Install
 

MariaDB [(none)]> select 16200 * 8 ;
ERROR 1044 (42000): Access denied for user ''@'localhost' to database 'infinidb_vtable'

MariaDB [(none)]> create table a1.a1 ( a int) engine columnstore ;
ERROR 1815 (HY000): Internal error: Lost connection to DDLProc
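
When scripting this reproduction, it may help to turn client error lines such as the two above into structured records, so that a smoke test can fail on them even while mcsadmin reports ACTIVE. The following is a minimal sketch, not part of the original report; the function name `parse_client_error` is hypothetical, and the pattern assumes the `ERROR <code> (<sqlstate>): <message>` layout shown in the transcript.

```python
import re

# Matches client error lines in the form seen in this ticket, e.g.
# "ERROR 1815 (HY000): Internal error: Lost connection to DDLProc"
ERROR_LINE = re.compile(r"^ERROR (\d+) \(([0-9A-Z]{5})\): (.*)$")


def parse_client_error(line):
    """Return a dict with code, sqlstate and message, or None if the
    line is not a client error line (hypothetical helper)."""
    match = ERROR_LINE.match(line.strip())
    if match is None:
        return None
    return {
        "code": int(match.group(1)),
        "sqlstate": match.group(2),
        "message": match.group(3),
    }


if __name__ == "__main__":
    print(parse_client_error(
        "ERROR 1815 (HY000): Internal error: Lost connection to DDLProc"
    ))
```

A check built on this could treat any non-None result from a trivial statement as evidence that the system is not fully functional, regardless of the reported system status.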

Detailed system information from mcsadmin shows
several ColumnStore processes in AUTO_OFFLINE state (ServerMonitor, DDLProc, and DMLProc on um1), yet the overall system status is wrongly reported as ACTIVE.

[root@cps ~]# mcsadmin getsystemi;
 
WARNING: running on non Parent OAM Module, can't make configuration changes in this session.
         Access Console from 'pm1' if you need to make changes.
 
getsysteminfo   Wed Apr 24 12:39:08 2019
 
System columnstore-1
 
System and Module statuses
 
Component     Status                       Last Status Change
------------  --------------------------   ------------------------
System        ACTIVE                       Wed Apr 24 12:19:19 2019
 
Module um1    ACTIVE                       Wed Apr 24 12:19:19 2019
Module pm1    ACTIVE                       Wed Apr 24 12:19:19 2019
Module pm2    ACTIVE                       Wed Apr 24 12:19:20 2019
 
Active Parent OAM Performance Module is 'pm1'
Local Query Feature is enabled
MariaDB ColumnStore Replication Feature is enabled
MariaDB ColumnStore set for Distributed Install
 
 
MariaDB ColumnStore Process statuses
 
Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
ProcessMonitor      um1       ACTIVE            Wed Apr 24 12:18:20 2019        5836
ServerMonitor       um1       AUTO_OFFLINE      Wed Apr 24 12:16:04 2019
DBRMWorkerNode      um1       ACTIVE            Wed Apr 24 12:18:36 2019        6090
ExeMgr              um1       ACTIVE            Wed Apr 24 12:19:07 2019        6313
DDLProc             um1       AUTO_OFFLINE      Wed Apr 24 12:16:04 2019
DMLProc             um1       AUTO_OFFLINE      Wed Apr 24 12:16:04 2019
mysqld              um1       ACTIVE            Wed Apr 24 12:19:19 2019        6571
 
ProcessMonitor      pm1       ACTIVE            Mon Apr 22 13:10:43 2019       18256
ProcessManager      pm1       ACTIVE            Mon Apr 22 13:10:49 2019       18399
DBRMControllerNode  pm1       ACTIVE            Wed Apr 24 12:18:33 2019         753
ServerMonitor       pm1       ACTIVE            Mon Apr 22 13:11:44 2019       20176
DBRMWorkerNode      pm1       ACTIVE            Wed Apr 24 12:18:40 2019         923
PrimProc            pm1       ACTIVE            Wed Apr 24 12:18:49 2019        1063
ExeMgr              pm1       MAN_OFFLINE       Wed Apr 24 12:13:42 2019
WriteEngineServer   pm1       ACTIVE            Wed Apr 24 12:19:02 2019        1324
mysqld              pm1       ACTIVE            Wed Apr 24 12:19:19 2019        1770
 
ProcessMonitor      pm2       ACTIVE            Mon Apr 22 13:11:35 2019       28401
ProcessManager      pm2       HOT_STANDBY       Mon Apr 22 13:11:36 2019       28515
DBRMControllerNode  pm2       COLD_STANDBY      Wed Apr 24 12:18:32 2019
ServerMonitor       pm2       ACTIVE            Mon Apr 22 13:11:53 2019       28902
DBRMWorkerNode      pm2       ACTIVE            Wed Apr 24 12:18:45 2019       25533
PrimProc            pm2       ACTIVE            Wed Apr 24 12:18:53 2019       25619
ExeMgr              pm2       MAN_OFFLINE       Wed Apr 24 12:13:42 2019
WriteEngineServer   pm2       ACTIVE            Wed Apr 24 12:19:03 2019       25747
mysqld              pm2       ACTIVE            Wed Apr 24 12:19:20 2019       26074
 
Active Alarm Counts: Critical = 0, Major = 3, Minor = 2, Warning = 0, Info = 0
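
The mismatch in the output above can also be detected mechanically. The sketch below, which is an illustration and not part of ColumnStore, parses text in the layout shown in this ticket and lists processes whose status is in a flagged set while the system row reports ACTIVE. The function name `find_inconsistencies` and the choice to flag only AUTO_OFFLINE (treating MAN_OFFLINE and the standby states as intentional) are assumptions.

```python
import re

# System row, e.g. "System        ACTIVE        Wed Apr 24 12:19:19 2019"
SYSTEM_ROW = re.compile(r"^System\s+([A-Z_]+)\s+\w{3} \w{3}\s+\d+ ")

# Process row, e.g. "DDLProc   um1   AUTO_OFFLINE   Wed Apr 24 12:16:04 2019"
# The lookahead skips "Module um1 ACTIVE ..." rows, which share the layout.
PROCESS_ROW = re.compile(
    r"^(?!Module\b)(\w+)\s+((?:um|pm)\d+)\s+([A-Z_]+)\s+\w{3} \w{3}\s+\d+ "
)


def find_inconsistencies(report, flagged=frozenset({"AUTO_OFFLINE"})):
    """Return (system_status, [(process, module, status), ...]) for
    processes whose status is in `flagged` (assumed set)."""
    system_status = None
    offenders = []
    for raw in report.splitlines():
        line = raw.strip()
        match = SYSTEM_ROW.match(line)
        if match:
            system_status = match.group(1)
            continue
        match = PROCESS_ROW.match(line)
        if match and match.group(3) in flagged:
            offenders.append(match.groups())
    return system_status, offenders


if __name__ == "__main__":
    import sys
    status, offenders = find_inconsistencies(sys.stdin.read())
    if status == "ACTIVE" and offenders:
        for process, module, state in offenders:
            print(f"{process} on {module} is {state} while System is ACTIVE")
```

Saved under a hypothetical name such as check_status.py, it could be fed the captured text, for example `mcsadmin getsysteminfo | python check_status.py`.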
 



 Comments   
Comment by Todd Stoffel (Inactive) [ 2023-03-06 ]

This ticket was created prior to convergence with the server and may be obsolete. If you find this issue still exists in a modern version, please open a new ticket.
