[MCOL-3564] um1 can't recover from `columnstore stop && columnstore start`
Created: 2019-10-17  Updated: 2019-12-17  Resolved: 2019-11-27

Status: Closed
Project: MariaDB ColumnStore
Component/s: MariaDB Server
Affects Version/s: 1.4.0
Fix Version/s: Icebox

Type: Bug
Priority: Critical
Reporter: Jens Röwekamp (Inactive)
Assignee: Daniel Lee (Inactive)
Resolution: Won't Fix
Votes: 0
Labels: SkySQLMVP
Environment:

CentOS 7 - 1UM 2PM multi node cluster - on VirtualBox

gitversionEngine: 1f47534


Attachments: 1UM 2PM - PM1 columnstore stop - columnstore start - mcsadmin startsystem.tar; MCOL-3564 - 6b91667 - VirtualBox.tar; support-reports.tar
Issue Links:
Duplicate
duplicates MCOL-3589 Taken down module status doesn't prop... Closed

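Description

After creating a 1UM 2PM multi-node cluster (root installation), um1 does not recover from a restart of the ColumnStore daemon (`columnstore stop` followed by `columnstore start`). After the restart, module um1 is reported as DEGRADED and mysqld on um1 as MAN_OFFLINE.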

Executed commands:

[root@um1 jens]# ps -aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1 125320  3832 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd --switched-root --system --deserialize 22
root         2  0.0  0.0      0     0 ?        S    05:06   0:00 [kthreadd]
root         4  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/0:0H]
root         5  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/u4:0]
root         6  0.0  0.0      0     0 ?        S    05:06   0:00 [ksoftirqd/0]
root         7  0.0  0.0      0     0 ?        S    05:06   0:00 [migration/0]
root         8  0.0  0.0      0     0 ?        S    05:06   0:00 [rcu_bh]
root         9  0.0  0.0      0     0 ?        R    05:06   0:00 [rcu_sched]
root        10  0.0  0.0      0     0 ?        S<   05:06   0:00 [lru-add-drain]
root        11  0.0  0.0      0     0 ?        S    05:06   0:00 [watchdog/0]
root        12  0.0  0.0      0     0 ?        S    05:06   0:00 [watchdog/1]
root        13  0.0  0.0      0     0 ?        S    05:06   0:00 [migration/1]
root        14  0.0  0.0      0     0 ?        S    05:06   0:00 [ksoftirqd/1]
root        16  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/1:0H]
root        18  0.0  0.0      0     0 ?        S    05:06   0:00 [kdevtmpfs]
root        19  0.0  0.0      0     0 ?        S<   05:06   0:00 [netns]
root        20  0.0  0.0      0     0 ?        S    05:06   0:00 [khungtaskd]
root        21  0.0  0.0      0     0 ?        S<   05:06   0:00 [writeback]
root        22  0.0  0.0      0     0 ?        S<   05:06   0:00 [kintegrityd]
root        23  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root        24  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root        25  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root        26  0.0  0.0      0     0 ?        S<   05:06   0:00 [kblockd]
root        27  0.0  0.0      0     0 ?        S<   05:06   0:00 [md]
root        28  0.0  0.0      0     0 ?        S<   05:06   0:00 [edac-poller]
root        29  0.0  0.0      0     0 ?        S<   05:06   0:00 [watchdogd]
root        30  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/0:1]
root        35  0.0  0.0      0     0 ?        S    05:06   0:00 [kswapd0]
root        36  0.0  0.0      0     0 ?        SN   05:06   0:00 [ksmd]
root        37  0.0  0.0      0     0 ?        SN   05:06   0:00 [khugepaged]
root        38  0.0  0.0      0     0 ?        S<   05:06   0:00 [crypto]
root        46  0.0  0.0      0     0 ?        S<   05:06   0:00 [kthrotld]
root        48  0.0  0.0      0     0 ?        S<   05:06   0:00 [kmpath_rdacd]
root        49  0.0  0.0      0     0 ?        S<   05:06   0:00 [kaluad]
root        50  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/1:1]
root        51  0.0  0.0      0     0 ?        S<   05:06   0:00 [kpsmoused]
root        52  0.0  0.0      0     0 ?        S<   05:06   0:00 [ipv6_addrconf]
root        66  0.0  0.0      0     0 ?        S<   05:06   0:00 [deferwq]
root       102  0.0  0.0      0     0 ?        S    05:06   0:00 [kauditd]
root       259  0.0  0.0      0     0 ?        S<   05:06   0:00 [ata_sff]
root       296  0.0  0.0      0     0 ?        S    05:06   0:00 [scsi_eh_0]
root       297  0.0  0.0      0     0 ?        S<   05:06   0:00 [scsi_tmf_0]
root       298  0.0  0.0      0     0 ?        S    05:06   0:00 [scsi_eh_1]
root       299  0.0  0.0      0     0 ?        S<   05:06   0:00 [scsi_tmf_1]
root       301  0.0  0.0      0     0 ?        S    05:06   0:00 [scsi_eh_2]
root       302  0.0  0.0      0     0 ?        S<   05:06   0:00 [scsi_tmf_2]
root       303  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/u4:3]
root       318  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/0:1H]
root       325  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root       326  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfsalloc]
root       327  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs_mru_cache]
root       328  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-buf/sda1]
root       329  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-data/sda1]
root       330  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-conv/sda1]
root       331  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-cil/sda1]
root       332  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-reclaim/sda]
root       333  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-log/sda1]
root       334  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-eofblocks/s]
root       335  0.0  0.0      0     0 ?        S    05:06   0:00 [xfsaild/sda1]
root       336  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/1:1H]
root       415  0.0  0.1  39080  3268 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd-journald
root       434  0.0  0.0  44760  1884 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd-udevd
root       435  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/1:2]
root       510  0.0  0.0  55528   892 ?        S<sl 05:06   0:00 /sbin/auditd
root       605  0.0  0.0  21536  1224 ?        Ss   05:06   0:00 /usr/sbin/irqbalance --foreground
polkitd    607  0.0  0.4 612244 14156 ?        Ssl  05:06   0:00 /usr/lib/polkit-1/polkitd --no-debug
dbus       608  0.0  0.0  58236  2412 ?        Ss   05:06   0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root       614  0.0  0.3 550184  8936 ?        Ssl  05:06   0:00 /usr/sbin/NetworkManager --no-daemon
root       615  0.0  0.0  26380  1748 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd-logind
root       617  0.0  0.0 126292  1568 ?        Ss   05:06   0:00 /usr/sbin/crond -n
root       624  0.0  0.0 110108   856 tty1     Ss+  05:06   0:00 /sbin/agetty --noclear tty1 linux
root       661  0.0  0.1 102896  5520 ?        S    05:06   0:00 /sbin/dhclient -d -q -sf /usr/libexec/nm-dhcp-helper -pf /var/run/dhclient-enp0s3.pid -lf /var/lib/NetworkManager/dhclient-21babde8-ca18-4d13-ae77-f865427ea90c-enp0s3.lease
root       884  0.0  0.5 574200 17424 ?        Ssl  05:06   0:00 /usr/bin/python2 -Es /usr/sbin/tuned -l -P
root       885  0.0  0.1 112920  4316 ?        Ss   05:06   0:00 /usr/sbin/sshd -D
root      8517  0.0  0.1 218548  3104 ?        Ssl  05:08   0:00 /usr/sbin/rsyslogd -n
root      8600  0.0  0.0 113184  1428 ?        S    05:08   0:00 /bin/bash /usr/local/mariadb/columnstore/bin/run.sh -l /tmp/columnstore_tmp_files /usr/local/mariadb/columnstore/bin/ProcMon
root      9128  4.0  1.2 530472 36592 ?        Sl   05:13   0:32 /usr/local/mariadb/columnstore/bin/ProcMon
root      9507  0.0  0.0 113320  1640 ?        S    05:14   0:00 /bin/sh /usr/local/mariadb/columnstore/mysql//bin/mysqld_safe --datadir=/usr/local/mariadb/columnstore/mysql/db --pid-file=/usr/local/mariadb/columnstore/mysql/db/um1.pid -
mysql     9670  0.0  3.6 1942984 105328 ?      Sl   05:14   0:00 /usr/local/mariadb/columnstore/mysql//bin/mysqld --basedir=/usr/local/mariadb/columnstore/mysql/ --datadir=/usr/local/mariadb/columnstore/mysql/db --plugin-dir=/usr/local/m
root      9740  0.0  1.1 225876 32744 ?        Sl   05:14   0:00 /usr/local/mariadb/columnstore/bin/ServerMonitor
root      9756  0.0  1.3 245420 38312 ?        Sl   05:14   0:00 /usr/local/mariadb/columnstore/bin/workernode DBRM_Worker2 fg
root     10663  0.0  1.6 279076 47252 ?        S<l  05:14   0:00 [ExeMgr]
root     10681  0.0  1.1 240564 32156 ?        Sl   05:14   0:00 [DDLProc]
root     10721  0.0  1.4 299004 41076 ?        Sl   05:14   0:00 [DMLProc]
root     11923  0.0  0.0      0     0 ?        S    05:18   0:00 [kworker/0:2]
root     13519  0.0  0.0      0     0 ?        S    05:23   0:00 [kworker/0:0]
root     14309  0.2  0.2 161532  6172 ?        Ss   05:26   0:00 sshd: jens [priv]
root     14335  0.0  0.0      0     0 ?        R    05:26   0:00 [kworker/1:0]
jens     14336  0.0  0.0 161532  2336 ?        S    05:26   0:00 sshd: jens@pts/0
jens     14339  0.0  0.0 115448  1992 pts/0    Ss   05:26   0:00 -bash
root     14356  0.4  0.1 243976  5288 pts/0    S    05:26   0:00 sudo su
root     14357  0.0  0.0 191780  2348 pts/0    S    05:26   0:00 su
root     14358  0.0  0.0 115448  2052 pts/0    S    05:26   0:00 bash
root     14394  0.0  0.0 155372  1876 pts/0    R+   05:26   0:00 ps -aux

[root@um1 jens]# mcsadmin getsysteminfo
 
WARNING: running on non Parent OAM Module, can't make configuration changes in this session.
         Access Console from 'pm1' if you need to make changes.
 
getsysteminfo   Thu Oct 17 05:26:50 2019
 
System columnstore-1
 
System and Module statuses
 
Component     Status                       Last Status Change
------------  --------------------------   ------------------------
System        ACTIVE                       Thu Oct 17 05:14:37 2019
 
Module um1    ACTIVE                       Thu Oct 17 05:14:34 2019
Module pm1    ACTIVE                       Thu Oct 17 05:14:15 2019
Module pm2    ACTIVE                       Thu Oct 17 05:14:24 2019
 
Active Parent OAM Performance Module is 'pm1'
 
MariaDB ColumnStore Process statuses
 
Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
ProcessMonitor      um1       ACTIVE            Thu Oct 17 05:13:44 2019        9128
ServerMonitor       um1       ACTIVE            Thu Oct 17 05:14:17 2019        9740
DBRMWorkerNode      um1       ACTIVE            Thu Oct 17 05:14:17 2019        9756
ExeMgr              um1       ACTIVE            Thu Oct 17 05:14:26 2019       10663
DDLProc             um1       ACTIVE            Thu Oct 17 05:14:30 2019       10681
DMLProc             um1       ACTIVE            Thu Oct 17 05:14:35 2019       10721
mysqld              um1       ACTIVE            Thu Oct 17 05:14:25 2019        9670
 
ProcessMonitor      pm1       ACTIVE            Thu Oct 17 05:12:37 2019        8720
ProcessManager      pm1       ACTIVE            Thu Oct 17 05:12:43 2019        8800
DBRMControllerNode  pm1       ACTIVE            Thu Oct 17 05:14:09 2019       10330
ServerMonitor       pm1       ACTIVE            Thu Oct 17 05:14:10 2019       10371
DBRMWorkerNode      pm1       ACTIVE            Thu Oct 17 05:14:11 2019       10407
PrimProc            pm1       ACTIVE            Thu Oct 17 05:14:15 2019       10532
WriteEngineServer   pm1       ACTIVE            Thu Oct 17 05:14:16 2019       10583
 
ProcessMonitor      pm2       ACTIVE            Thu Oct 17 05:14:02 2019        8682
ProcessManager      pm2       HOT_STANDBY       Thu Oct 17 05:14:03 2019        8726
DBRMControllerNode  pm2       COLD_STANDBY      Thu Oct 17 05:14:17 2019
ServerMonitor       pm2       ACTIVE            Thu Oct 17 05:14:20 2019        8759
DBRMWorkerNode      pm2       ACTIVE            Thu Oct 17 05:14:20 2019        8775
PrimProc            pm2       ACTIVE            Thu Oct 17 05:14:24 2019        8792
WriteEngineServer   pm2       ACTIVE            Thu Oct 17 05:14:25 2019        8806
 
Active Alarm Counts: Critical = 0, Major = 0, Minor = 0, Warning = 0, Info = 0

[root@um1 jens]# /usr/local/mariadb/columnstore/bin/columnstore stop && sleep 60
Shutting down MariaDB Columnstore Database Platform

[root@um1 jens]# ps -aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1 125320  3832 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd --switched-root --system --deserialize 22
root         2  0.0  0.0      0     0 ?        S    05:06   0:00 [kthreadd]
root         4  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/0:0H]
root         5  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/u4:0]
root         6  0.0  0.0      0     0 ?        S    05:06   0:00 [ksoftirqd/0]
root         7  0.0  0.0      0     0 ?        S    05:06   0:00 [migration/0]
root         8  0.0  0.0      0     0 ?        S    05:06   0:00 [rcu_bh]
root         9  0.0  0.0      0     0 ?        R    05:06   0:00 [rcu_sched]
root        10  0.0  0.0      0     0 ?        S<   05:06   0:00 [lru-add-drain]
root        11  0.0  0.0      0     0 ?        S    05:06   0:00 [watchdog/0]
root        12  0.0  0.0      0     0 ?        S    05:06   0:00 [watchdog/1]
root        13  0.0  0.0      0     0 ?        S    05:06   0:00 [migration/1]
root        14  0.0  0.0      0     0 ?        S    05:06   0:00 [ksoftirqd/1]
root        16  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/1:0H]
root        18  0.0  0.0      0     0 ?        S    05:06   0:00 [kdevtmpfs]
root        19  0.0  0.0      0     0 ?        S<   05:06   0:00 [netns]
root        20  0.0  0.0      0     0 ?        S    05:06   0:00 [khungtaskd]
root        21  0.0  0.0      0     0 ?        S<   05:06   0:00 [writeback]
root        22  0.0  0.0      0     0 ?        S<   05:06   0:00 [kintegrityd]
root        23  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root        24  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root        25  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root        26  0.0  0.0      0     0 ?        S<   05:06   0:00 [kblockd]
root        27  0.0  0.0      0     0 ?        S<   05:06   0:00 [md]
root        28  0.0  0.0      0     0 ?        S<   05:06   0:00 [edac-poller]
root        29  0.0  0.0      0     0 ?        S<   05:06   0:00 [watchdogd]
root        30  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/0:1]
root        35  0.0  0.0      0     0 ?        S    05:06   0:00 [kswapd0]
root        36  0.0  0.0      0     0 ?        SN   05:06   0:00 [ksmd]
root        37  0.0  0.0      0     0 ?        SN   05:06   0:00 [khugepaged]
root        38  0.0  0.0      0     0 ?        S<   05:06   0:00 [crypto]
root        46  0.0  0.0      0     0 ?        S<   05:06   0:00 [kthrotld]
root        48  0.0  0.0      0     0 ?        S<   05:06   0:00 [kmpath_rdacd]
root        49  0.0  0.0      0     0 ?        S<   05:06   0:00 [kaluad]
root        50  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/1:1]
root        51  0.0  0.0      0     0 ?        S<   05:06   0:00 [kpsmoused]
root        52  0.0  0.0      0     0 ?        S<   05:06   0:00 [ipv6_addrconf]
root        66  0.0  0.0      0     0 ?        S<   05:06   0:00 [deferwq]
root       102  0.0  0.0      0     0 ?        S    05:06   0:00 [kauditd]
root       259  0.0  0.0      0     0 ?        S<   05:06   0:00 [ata_sff]
root       296  0.0  0.0      0     0 ?        S    05:06   0:00 [scsi_eh_0]
root       297  0.0  0.0      0     0 ?        S<   05:06   0:00 [scsi_tmf_0]
root       298  0.0  0.0      0     0 ?        S    05:06   0:00 [scsi_eh_1]
root       299  0.0  0.0      0     0 ?        S<   05:06   0:00 [scsi_tmf_1]
root       301  0.0  0.0      0     0 ?        S    05:06   0:00 [scsi_eh_2]
root       302  0.0  0.0      0     0 ?        S<   05:06   0:00 [scsi_tmf_2]
root       303  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/u4:3]
root       318  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/0:1H]
root       325  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root       326  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfsalloc]
root       327  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs_mru_cache]
root       328  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-buf/sda1]
root       329  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-data/sda1]
root       330  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-conv/sda1]
root       331  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-cil/sda1]
root       332  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-reclaim/sda]
root       333  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-log/sda1]
root       334  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-eofblocks/s]
root       335  0.0  0.0      0     0 ?        S    05:06   0:00 [xfsaild/sda1]
root       336  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/1:1H]
root       415  0.0  0.1  39080  3268 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd-journald
root       434  0.0  0.0  44760  1884 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd-udevd
root       435  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/1:2]
root       510  0.0  0.0  55528   892 ?        S<sl 05:06   0:00 /sbin/auditd
root       605  0.0  0.0  21536  1224 ?        Ss   05:06   0:00 /usr/sbin/irqbalance --foreground
polkitd    607  0.0  0.4 612244 14156 ?        Ssl  05:06   0:00 /usr/lib/polkit-1/polkitd --no-debug
dbus       608  0.0  0.0  58236  2412 ?        Ss   05:06   0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root       614  0.0  0.3 550184  8936 ?        Ssl  05:06   0:00 /usr/sbin/NetworkManager --no-daemon
root       615  0.0  0.0  26380  1748 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd-logind
root       617  0.0  0.0 126292  1568 ?        Ss   05:06   0:00 /usr/sbin/crond -n
root       624  0.0  0.0 110108   856 tty1     Ss+  05:06   0:00 /sbin/agetty --noclear tty1 linux
root       661  0.0  0.1 102896  5520 ?        S    05:06   0:00 /sbin/dhclient -d -q -sf /usr/libexec/nm-dhcp-helper -pf /var/run/dhclient-enp0s3.pid -lf /var/lib/NetworkManager/dhclient-21babde8-ca18-4d13-ae77-f865427ea90c-enp0s3.lease
root       884  0.0  0.5 574200 17424 ?        Ssl  05:06   0:00 /usr/bin/python2 -Es /usr/sbin/tuned -l -P
root       885  0.0  0.1 112920  4316 ?        Ss   05:06   0:00 /usr/sbin/sshd -D
root      8517  0.0  0.1 218548  3104 ?        Ssl  05:08   0:00 /usr/sbin/rsyslogd -n
root      9740  0.0  1.1 225876 32752 ?        Sl   05:14   0:00 /usr/local/mariadb/columnstore/bin/ServerMonitor
root      9756  0.0  1.3 245420 38312 ?        Sl   05:14   0:00 /usr/local/mariadb/columnstore/bin/workernode DBRM_Worker2 fg
root     10663  0.0  1.6 279076 47252 ?        S<l  05:14   0:00 [ExeMgr]
root     10681  0.0  1.1 240564 32156 ?        Sl   05:14   0:00 [DDLProc]
root     10721  0.0  1.4 299004 41076 ?        Sl   05:14   0:00 [DMLProc]
root     11923  0.0  0.0      0     0 ?        S    05:18   0:00 [kworker/0:2]
root     13519  0.0  0.0      0     0 ?        S    05:23   0:00 [kworker/0:0]
root     14309  0.0  0.2 161532  6172 ?        Ss   05:26   0:00 sshd: jens [priv]
root     14335  0.0  0.0      0     0 ?        S    05:26   0:00 [kworker/1:0]
jens     14336  0.0  0.0 161532  2336 ?        D    05:26   0:00 sshd: jens@pts/0
jens     14339  0.0  0.0 115448  1992 pts/0    Ss   05:26   0:00 -bash
root     14356  0.0  0.1 243976  5288 pts/0    S    05:26   0:00 sudo su
root     14357  0.0  0.0 191780  2348 pts/0    S    05:26   0:00 su
root     14358  0.0  0.0 115448  2100 pts/0    S    05:26   0:00 bash
root     14499  0.0  0.0      0     0 ?        R    05:26   0:00 [kworker/0:3]
root     14885  0.0  0.0 155372  1876 pts/0    R+   05:29   0:00 ps -aux
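
Note on the two `ps -aux` listings above: after `columnstore stop`, run.sh (8600), ProcMon (9128), mysqld_safe (9507) and mysqld (9670) are gone, but ServerMonitor (9740), workernode (9756), ExeMgr (10663), DDLProc (10681) and DMLProc (10721) are still present with their original PIDs. The comparison can be automated with a small script. The following is only a sketch: the function names and the `MARKERS` tuple are illustrative assumptions, not part of ColumnStore, and the sample data is an abridged excerpt of the listings above.

```python
def parse_ps(text):
    """Map PID -> COMMAND for each process line of `ps -aux` output."""
    procs = {}
    for line in text.splitlines():
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        # Skip the header, blank lines and anything without a numeric PID.
        if len(fields) < 11 or not fields[1].isdigit():
            continue
        procs[int(fields[1])] = fields[10]
    return procs


def survivors(before, after, markers):
    """Return (pid, command) pairs that appear in both snapshots with the
    same PID and whose command contains one of the marker strings."""
    old, new = parse_ps(before), parse_ps(after)
    return sorted(
        (pid, cmd)
        for pid, cmd in new.items()
        if pid in old and any(m in cmd for m in markers)
    )


# Abridged excerpts of the listings taken before and after `columnstore stop`.
BEFORE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      9128  4.0  1.2 530472 36592 ?        Sl   05:13   0:32 /usr/local/mariadb/columnstore/bin/ProcMon
root      9740  0.0  1.1 225876 32744 ?        Sl   05:14   0:00 /usr/local/mariadb/columnstore/bin/ServerMonitor
root     10663  0.0  1.6 279076 47252 ?        S<l  05:14   0:00 [ExeMgr]
"""
AFTER = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      9740  0.0  1.1 225876 32752 ?        Sl   05:14   0:00 /usr/local/mariadb/columnstore/bin/ServerMonitor
root     10663  0.0  1.6 279076 47252 ?        S<l  05:14   0:00 [ExeMgr]
root     14885  0.0  0.0 155372  1876 pts/0    R+   05:29   0:00 ps -aux
"""

# Assumed substrings that identify ColumnStore processes in this listing.
MARKERS = ("columnstore", "ExeMgr")

for pid, cmd in survivors(BEFORE, AFTER, MARKERS):
    print(pid, cmd)
```

With the full listings, `DDLProc` and `DMLProc` would need to be added to the markers, because those processes appear in `ps` only as bracketed names without the installation path.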

[root@um1 jens]# mcsadmin getsysteminfo
 
WARNING: running on non Parent OAM Module, can't make configuration changes in this session.
         Access Console from 'pm1' if you need to make changes.
 
getsysteminfo   Thu Oct 17 05:29:40 2019
 
System columnstore-1
 
System and Module statuses
 
Component     Status                       Last Status Change
------------  --------------------------   ------------------------
System        INITIAL
 
 
 
MariaDB ColumnStore Process statuses
 
Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
 
**** printProcessStatus Failed =  API Failure return in getProcessStatus API

[root@um1 jens]# /usr/local/mariadb/columnstore/bin/columnstore start && sleep 300
Starting MariaDB Columnstore Database Platform

[root@um1 jens]# ps -aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1 125320  3832 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd --switched-root --system --deserialize 22
root         2  0.0  0.0      0     0 ?        S    05:06   0:00 [kthreadd]
root         4  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/0:0H]
root         5  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/u4:0]
root         6  0.0  0.0      0     0 ?        S    05:06   0:00 [ksoftirqd/0]
root         7  0.0  0.0      0     0 ?        S    05:06   0:00 [migration/0]
root         8  0.0  0.0      0     0 ?        S    05:06   0:00 [rcu_bh]
root         9  0.0  0.0      0     0 ?        S    05:06   0:00 [rcu_sched]
root        10  0.0  0.0      0     0 ?        S<   05:06   0:00 [lru-add-drain]
root        11  0.0  0.0      0     0 ?        S    05:06   0:00 [watchdog/0]
root        12  0.0  0.0      0     0 ?        S    05:06   0:00 [watchdog/1]
root        13  0.0  0.0      0     0 ?        S    05:06   0:00 [migration/1]
root        14  0.0  0.0      0     0 ?        S    05:06   0:00 [ksoftirqd/1]
root        16  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/1:0H]
root        18  0.0  0.0      0     0 ?        S    05:06   0:00 [kdevtmpfs]
root        19  0.0  0.0      0     0 ?        S<   05:06   0:00 [netns]
root        20  0.0  0.0      0     0 ?        S    05:06   0:00 [khungtaskd]
root        21  0.0  0.0      0     0 ?        S<   05:06   0:00 [writeback]
root        22  0.0  0.0      0     0 ?        S<   05:06   0:00 [kintegrityd]
root        23  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root        24  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root        25  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root        26  0.0  0.0      0     0 ?        S<   05:06   0:00 [kblockd]
root        27  0.0  0.0      0     0 ?        S<   05:06   0:00 [md]
root        28  0.0  0.0      0     0 ?        S<   05:06   0:00 [edac-poller]
root        29  0.0  0.0      0     0 ?        S<   05:06   0:00 [watchdogd]
root        35  0.0  0.0      0     0 ?        S    05:06   0:00 [kswapd0]
root        36  0.0  0.0      0     0 ?        SN   05:06   0:00 [ksmd]
root        37  0.0  0.0      0     0 ?        SN   05:06   0:00 [khugepaged]
root        38  0.0  0.0      0     0 ?        S<   05:06   0:00 [crypto]
root        46  0.0  0.0      0     0 ?        S<   05:06   0:00 [kthrotld]
root        48  0.0  0.0      0     0 ?        S<   05:06   0:00 [kmpath_rdacd]
root        49  0.0  0.0      0     0 ?        S<   05:06   0:00 [kaluad]
root        50  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/1:1]
root        51  0.0  0.0      0     0 ?        S<   05:06   0:00 [kpsmoused]
root        52  0.0  0.0      0     0 ?        S<   05:06   0:00 [ipv6_addrconf]
root        66  0.0  0.0      0     0 ?        S<   05:06   0:00 [deferwq]
root       102  0.0  0.0      0     0 ?        S    05:06   0:00 [kauditd]
root       259  0.0  0.0      0     0 ?        S<   05:06   0:00 [ata_sff]
root       296  0.0  0.0      0     0 ?        S    05:06   0:00 [scsi_eh_0]
root       297  0.0  0.0      0     0 ?        S<   05:06   0:00 [scsi_tmf_0]
root       298  0.0  0.0      0     0 ?        S    05:06   0:00 [scsi_eh_1]
root       299  0.0  0.0      0     0 ?        S<   05:06   0:00 [scsi_tmf_1]
root       301  0.0  0.0      0     0 ?        S    05:06   0:00 [scsi_eh_2]
root       302  0.0  0.0      0     0 ?        S<   05:06   0:00 [scsi_tmf_2]
root       303  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/u4:3]
root       318  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/0:1H]
root       325  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root       326  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfsalloc]
root       327  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs_mru_cache]
root       328  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-buf/sda1]
root       329  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-data/sda1]
root       330  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-conv/sda1]
root       331  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-cil/sda1]
root       332  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-reclaim/sda]
root       333  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-log/sda1]
root       334  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-eofblocks/s]
root       335  0.0  0.0      0     0 ?        S    05:06   0:00 [xfsaild/sda1]
root       336  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/1:1H]
root       415  0.0  0.1  39080  3284 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd-journald
root       434  0.0  0.0  44760  1884 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd-udevd
root       510  0.0  0.0  55528   892 ?        S<sl 05:06   0:00 /sbin/auditd
root       605  0.0  0.0  21536  1224 ?        Ss   05:06   0:00 /usr/sbin/irqbalance --foreground
polkitd    607  0.0  0.4 612244 14156 ?        Ssl  05:06   0:00 /usr/lib/polkit-1/polkitd --no-debug
dbus       608  0.0  0.0  58236  2412 ?        Ss   05:06   0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root       614  0.0  0.3 550184  8936 ?        Ssl  05:06   0:00 /usr/sbin/NetworkManager --no-daemon
root       615  0.0  0.0  26380  1748 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd-logind
root       617  0.0  0.0 126292  1568 ?        Ss   05:06   0:00 /usr/sbin/crond -n
root       624  0.0  0.0 110108   856 tty1     Ss+  05:06   0:00 /sbin/agetty --noclear tty1 linux
root       661  0.0  0.1 102896  5520 ?        S    05:06   0:00 /sbin/dhclient -d -q -sf /usr/libexec/nm-dhcp-helper -pf /var/run/dhclient-enp0s3.pid -lf /var/lib/NetworkManager/dhclient-21babde8-ca18-4d13-ae77-f865427ea90c-enp0s3.lease
root       884  0.0  0.5 574200 17424 ?        Ssl  05:06   0:00 /usr/bin/python2 -Es /usr/sbin/tuned -l -P
root       885  0.0  0.1 112920  4316 ?        Ss   05:06   0:00 /usr/sbin/sshd -D
root      8517  0.0  0.1 218548  3120 ?        Ssl  05:08   0:00 /usr/sbin/rsyslogd -n
root      9740  0.0  1.1 225876 32740 ?        Sl   05:14   0:00 /usr/local/mariadb/columnstore/bin/ServerMonitor
root      9756  0.0  1.3 245420 38312 ?        Sl   05:14   0:00 /usr/local/mariadb/columnstore/bin/workernode DBRM_Worker2 fg
root     10663  0.0  1.6 279076 47252 ?        S<l  05:14   0:00 [ExeMgr]
root     10681  0.0  1.1 240564 32156 ?        Sl   05:14   0:00 [DDLProc]
root     10721  0.0  1.4 299004 41076 ?        Sl   05:14   0:00 [DMLProc]
root     11923  0.0  0.0      0     0 ?        S    05:18   0:00 [kworker/0:2]
root     13519  0.0  0.0      0     0 ?        S    05:23   0:00 [kworker/0:0]
root     14309  0.0  0.2 161532  6172 ?        Ss   05:26   0:00 sshd: jens [priv]
root     14335  0.0  0.0      0     0 ?        S    05:26   0:00 [kworker/1:0]
jens     14336  0.0  0.0 161532  2336 ?        D    05:26   0:00 sshd: jens@pts/0
jens     14339  0.0  0.0 115448  1992 pts/0    Ss   05:26   0:00 -bash
root     14356  0.0  0.1 243976  5288 pts/0    S    05:26   0:00 sudo su
root     14357  0.0  0.0 191780  2348 pts/0    S    05:26   0:00 su
root     14358  0.0  0.0 115448  2104 pts/0    S    05:26   0:00 bash
root     14499  0.0  0.0      0     0 ?        R    05:26   0:00 [kworker/0:3]
root     14918  0.0  0.0 113184  1432 pts/0    S    05:30   0:00 /bin/bash /usr/local/mariadb/columnstore/bin/run.sh -l /tmp/columnstore_tmp_files /usr/local/mariadb/columnstore/bin/ProcMon
root     14920  9.4  1.0 242448 30812 pts/0    Sl   05:30   0:29 /usr/local/mariadb/columnstore/bin/ProcMon
root     16685  0.0  0.0 155372  1872 pts/0    R+   05:35   0:00 ps -aux

[root@um1 jens]# mcsadmin getsysteminfo
 
WARNING: running on non Parent OAM Module, can't make configuration changes in this session.
         Access Console from 'pm1' if you need to make changes.
 
getsysteminfo   Thu Oct 17 05:35:54 2019
 
System columnstore-1
 
System and Module statuses
 
Component     Status                       Last Status Change
------------  --------------------------   ------------------------
System        ACTIVE                       Thu Oct 17 05:14:37 2019
 
Module um1    DEGRADED                     Thu Oct 17 05:30:10 2019
Module pm1    ACTIVE                       Thu Oct 17 05:14:15 2019
Module pm2    ACTIVE                       Thu Oct 17 05:14:24 2019
 
Active Parent OAM Performance Module is 'pm1'
 
MariaDB ColumnStore Process statuses
 
Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
ProcessMonitor      um1       ACTIVE            Thu Oct 17 05:30:39 2019       14920
ServerMonitor       um1       ACTIVE            Thu Oct 17 05:14:17 2019        9740
DBRMWorkerNode      um1       ACTIVE            Thu Oct 17 05:14:17 2019        9756
ExeMgr              um1       ACTIVE            Thu Oct 17 05:14:26 2019       10663
DDLProc             um1       ACTIVE            Thu Oct 17 05:14:30 2019       10681
DMLProc             um1       ACTIVE            Thu Oct 17 05:14:35 2019       10721
mysqld              um1       MAN_OFFLINE       Thu Oct 17 05:30:10 2019
 
ProcessMonitor      pm1       ACTIVE            Thu Oct 17 05:12:37 2019        8720
ProcessManager      pm1       ACTIVE            Thu Oct 17 05:12:43 2019        8800
DBRMControllerNode  pm1       ACTIVE            Thu Oct 17 05:14:09 2019       10330
ServerMonitor       pm1       ACTIVE            Thu Oct 17 05:14:10 2019       10371
DBRMWorkerNode      pm1       ACTIVE            Thu Oct 17 05:14:11 2019       10407
PrimProc            pm1       ACTIVE            Thu Oct 17 05:14:15 2019       10532
WriteEngineServer   pm1       ACTIVE            Thu Oct 17 05:14:16 2019       10583
 
ProcessMonitor      pm2       ACTIVE            Thu Oct 17 05:14:02 2019        8682
ProcessManager      pm2       HOT_STANDBY       Thu Oct 17 05:14:03 2019        8726
DBRMControllerNode  pm2       COLD_STANDBY      Thu Oct 17 05:14:17 2019
ServerMonitor       pm2       ACTIVE            Thu Oct 17 05:14:20 2019        8759
DBRMWorkerNode      pm2       ACTIVE            Thu Oct 17 05:14:20 2019        8775
PrimProc            pm2       ACTIVE            Thu Oct 17 05:14:24 2019        8792
WriteEngineServer   pm2       ACTIVE            Thu Oct 17 05:14:25 2019        8806
 
Active Alarm Counts: Critical = 0, Major = 0, Minor = 0, Warning = 0, Info = 0
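
Note on the status output above: besides mysqld being MAN_OFFLINE, the other um1 processes still show their 05:14 status timestamps and original PIDs, although ProcessMonitor on um1 was restarted at 05:30:39. This matches the `ps` listings, where those processes were never stopped. A rough way to flag such rows is sketched below. The regular expression, the function names and the "older than ProcessMonitor" heuristic are assumptions made for illustration only; they are not an mcsadmin feature, and the sample is an abridged excerpt of the output above.

```python
import re
from datetime import datetime

# One row of the "Process statuses" table; the PID column may be empty.
ROW = re.compile(
    r"^(?P<process>\S+)\s+(?P<module>(?:um|pm)\d+)\s+(?P<status>[A-Z_]+)\s+"
    r"(?P<changed>\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4})(?:\s+(?P<pid>\d+))?\s*$"
)


def parse_rows(text):
    """Extract process rows from `mcsadmin getsysteminfo` output."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        # "Module um1 DEGRADED ..." has the same shape but is a module row.
        if not m or m["process"] == "Module":
            continue
        row = m.groupdict()
        # Timestamp parsing assumes English day and month names (C locale).
        row["changed"] = datetime.strptime(
            " ".join(row["changed"].split()), "%a %b %d %H:%M:%S %Y"
        )
        rows.append(row)
    return rows


def flag_module(rows, module):
    """Flag rows of one module that are not ACTIVE, or whose last status
    change is older than that module's ProcessMonitor (heuristic)."""
    mine = [r for r in rows if r["module"] == module]
    procmon = next((r for r in mine if r["process"] == "ProcessMonitor"), None)
    flagged = []
    for r in mine:
        if r["status"] != "ACTIVE":
            flagged.append((r["process"], r["status"]))
        elif procmon and r["changed"] < procmon["changed"]:
            flagged.append((r["process"], "older than ProcessMonitor"))
    return flagged


# Abridged excerpt of the output taken after `columnstore start`.
SAMPLE = """\
Module um1    DEGRADED                     Thu Oct 17 05:30:10 2019
Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
ProcessMonitor      um1       ACTIVE            Thu Oct 17 05:30:39 2019       14920
ExeMgr              um1       ACTIVE            Thu Oct 17 05:14:26 2019       10663
mysqld              um1       MAN_OFFLINE       Thu Oct 17 05:30:10 2019
ProcessMonitor      pm1       ACTIVE            Thu Oct 17 05:12:37 2019        8720
"""

for process, reason in flag_module(parse_rows(SAMPLE), "um1"):
    print(process, reason)
```

The check is meant for um1 only. On pm2 the HOT_STANDBY and COLD_STANDBY rows would also be flagged as non-ACTIVE, even though they appear unchanged in the status output taken before the restart.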

[root@um1 jens]# /usr/local/mariadb/columnstore/mysql/mysql-Columnstore start
Starting MySQL. SUCCESS!

[root@um1 jens]# ps -aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1 125320  3832 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd --switched-root --system --deserialize 22
root         2  0.0  0.0      0     0 ?        S    05:06   0:00 [kthreadd]
root         4  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/0:0H]
root         5  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/u4:0]
root         6  0.0  0.0      0     0 ?        S    05:06   0:00 [ksoftirqd/0]
root         7  0.0  0.0      0     0 ?        S    05:06   0:00 [migration/0]
root         8  0.0  0.0      0     0 ?        S    05:06   0:00 [rcu_bh]
root         9  0.0  0.0      0     0 ?        S    05:06   0:00 [rcu_sched]
root        10  0.0  0.0      0     0 ?        S<   05:06   0:00 [lru-add-drain]
root        11  0.0  0.0      0     0 ?        S    05:06   0:00 [watchdog/0]
root        12  0.0  0.0      0     0 ?        S    05:06   0:00 [watchdog/1]
root        13  0.0  0.0      0     0 ?        S    05:06   0:00 [migration/1]
root        14  0.0  0.0      0     0 ?        S    05:06   0:00 [ksoftirqd/1]
root        16  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/1:0H]
root        18  0.0  0.0      0     0 ?        S    05:06   0:00 [kdevtmpfs]
root        19  0.0  0.0      0     0 ?        S<   05:06   0:00 [netns]
root        20  0.0  0.0      0     0 ?        S    05:06   0:00 [khungtaskd]
root        21  0.0  0.0      0     0 ?        S<   05:06   0:00 [writeback]
root        22  0.0  0.0      0     0 ?        S<   05:06   0:00 [kintegrityd]
root        23  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root        24  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root        25  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root        26  0.0  0.0      0     0 ?        S<   05:06   0:00 [kblockd]
root        27  0.0  0.0      0     0 ?        S<   05:06   0:00 [md]
root        28  0.0  0.0      0     0 ?        S<   05:06   0:00 [edac-poller]
root        29  0.0  0.0      0     0 ?        S<   05:06   0:00 [watchdogd]
root        35  0.0  0.0      0     0 ?        S    05:06   0:00 [kswapd0]
root        36  0.0  0.0      0     0 ?        SN   05:06   0:00 [ksmd]
root        37  0.0  0.0      0     0 ?        SN   05:06   0:00 [khugepaged]
root        38  0.0  0.0      0     0 ?        S<   05:06   0:00 [crypto]
root        46  0.0  0.0      0     0 ?        S<   05:06   0:00 [kthrotld]
root        48  0.0  0.0      0     0 ?        S<   05:06   0:00 [kmpath_rdacd]
root        49  0.0  0.0      0     0 ?        S<   05:06   0:00 [kaluad]
root        50  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/1:1]
root        51  0.0  0.0      0     0 ?        S<   05:06   0:00 [kpsmoused]
root        52  0.0  0.0      0     0 ?        S<   05:06   0:00 [ipv6_addrconf]
root        66  0.0  0.0      0     0 ?        S<   05:06   0:00 [deferwq]
root       102  0.0  0.0      0     0 ?        S    05:06   0:00 [kauditd]
root       259  0.0  0.0      0     0 ?        S<   05:06   0:00 [ata_sff]
root       296  0.0  0.0      0     0 ?        S    05:06   0:00 [scsi_eh_0]
root       297  0.0  0.0      0     0 ?        S<   05:06   0:00 [scsi_tmf_0]
root       298  0.0  0.0      0     0 ?        S    05:06   0:00 [scsi_eh_1]
root       299  0.0  0.0      0     0 ?        S<   05:06   0:00 [scsi_tmf_1]
root       301  0.0  0.0      0     0 ?        S    05:06   0:00 [scsi_eh_2]
root       302  0.0  0.0      0     0 ?        S<   05:06   0:00 [scsi_tmf_2]
root       303  0.0  0.0      0     0 ?        S    05:06   0:00 [kworker/u4:3]
root       318  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/0:1H]
root       325  0.0  0.0      0     0 ?        S<   05:06   0:00 [bioset]
root       326  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfsalloc]
root       327  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs_mru_cache]
root       328  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-buf/sda1]
root       329  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-data/sda1]
root       330  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-conv/sda1]
root       331  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-cil/sda1]
root       332  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-reclaim/sda]
root       333  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-log/sda1]
root       334  0.0  0.0      0     0 ?        S<   05:06   0:00 [xfs-eofblocks/s]
root       335  0.0  0.0      0     0 ?        S    05:06   0:00 [xfsaild/sda1]
root       336  0.0  0.0      0     0 ?        S<   05:06   0:00 [kworker/1:1H]
root       415  0.0  0.1  39080  3284 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd-journald
root       434  0.0  0.0  44760  1884 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd-udevd
root       510  0.0  0.0  55528   892 ?        S<sl 05:06   0:00 /sbin/auditd
root       605  0.0  0.0  21536  1224 ?        Ss   05:06   0:00 /usr/sbin/irqbalance --foreground
polkitd    607  0.0  0.4 612244 14156 ?        Ssl  05:06   0:00 /usr/lib/polkit-1/polkitd --no-debug
dbus       608  0.0  0.0  58236  2412 ?        Ss   05:06   0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root       614  0.0  0.3 550184  8936 ?        Ssl  05:06   0:00 /usr/sbin/NetworkManager --no-daemon
root       615  0.0  0.0  26380  1748 ?        Ss   05:06   0:00 /usr/lib/systemd/systemd-logind
root       617  0.0  0.0 126292  1568 ?        Ss   05:06   0:00 /usr/sbin/crond -n
root       624  0.0  0.0 110108   856 tty1     Ss+  05:06   0:00 /sbin/agetty --noclear tty1 linux
root       661  0.0  0.1 102896  5520 ?        S    05:06   0:00 /sbin/dhclient -d -q -sf /usr/libexec/nm-dhcp-helper -pf /var/run/dhclient-enp0s3.pid -lf /var/lib/NetworkManager/dhclient-21babde8-ca18-4d13-ae77-f865427ea90c-enp0s3.lease
root       884  0.0  0.5 574200 17424 ?        Ssl  05:06   0:00 /usr/bin/python2 -Es /usr/sbin/tuned -l -P
root       885  0.0  0.1 112920  4316 ?        Ss   05:06   0:00 /usr/sbin/sshd -D
root      8517  0.0  0.1 218548  3120 ?        Ssl  05:08   0:00 /usr/sbin/rsyslogd -n
root      9740  0.0  1.1 225876 32560 ?        Sl   05:14   0:00 /usr/local/mariadb/columnstore/bin/ServerMonitor
root      9756  0.0  1.3 245420 38312 ?        Sl   05:14   0:00 /usr/local/mariadb/columnstore/bin/workernode DBRM_Worker2 fg
root     10663  0.0  1.6 279076 47252 ?        S<l  05:14   0:00 [ExeMgr]
root     10681  0.0  1.1 240564 32156 ?        Sl   05:14   0:00 [DDLProc]
root     10721  0.0  1.4 299004 41076 ?        Sl   05:14   0:00 [DMLProc]
root     13519  0.0  0.0      0     0 ?        S    05:23   0:00 [kworker/0:0]
root     14309  0.0  0.2 161532  6172 ?        Ss   05:26   0:00 sshd: jens [priv]
root     14335  0.0  0.0      0     0 ?        R    05:26   0:00 [kworker/1:0]
jens     14336  0.0  0.0 161532  2336 ?        D    05:26   0:00 sshd: jens@pts/0
jens     14339  0.0  0.0 115448  1992 pts/0    Ss   05:26   0:00 -bash
root     14356  0.0  0.1 243976  5288 pts/0    S    05:26   0:00 sudo su
root     14357  0.0  0.0 191780  2348 pts/0    S    05:26   0:00 su
root     14358  0.0  0.0 115448  2104 pts/0    S    05:26   0:00 bash
root     14499  0.0  0.0      0     0 ?        S    05:26   0:00 [kworker/0:3]
root     14918  0.0  0.0 113184  1432 pts/0    S    05:30   0:00 /bin/bash /usr/local/mariadb/columnstore/bin/run.sh -l /tmp/columnstore_tmp_files /usr/local/mariadb/columnstore/bin/ProcMon
root     14920  7.7  1.0 242448 30816 pts/0    Sl   05:30   0:30 /usr/local/mariadb/columnstore/bin/ProcMon
root     17047  0.0  0.0 113320  1644 pts/0    S    05:36   0:00 /bin/sh /usr/local/mariadb/columnstore/mysql//bin/mysqld_safe --datadir=/usr/local/mariadb/columnstore/mysql/db --pid-file=/usr/local/mariadb/columnstore/mysql/db/um1.pid -
mysql    17210  1.5  3.3 1942984 98964 pts/0   Sl   05:36   0:00 /usr/local/mariadb/columnstore/mysql//bin/mysqld --basedir=/usr/local/mariadb/columnstore/mysql/ --datadir=/usr/local/mariadb/columnstore/mysql/db --plugin-dir=/usr/local/m
root     17298  0.0  0.0 155372  1880 pts/0    R+   05:36   0:00 ps -aux

[root@um1 jens]# mcsadmin getsysteminfo
 
WARNING: running on non Parent OAM Module, can't make configuration changes in this session.
         Access Console from 'pm1' if you need to make changes.
 
getsysteminfo   Thu Oct 17 05:36:50 2019
 
System columnstore-1
 
System and Module statuses
 
Component     Status                       Last Status Change
------------  --------------------------   ------------------------
System        ACTIVE                       Thu Oct 17 05:14:37 2019
 
Module um1    ACTIVE                       Thu Oct 17 05:36:31 2019
Module pm1    ACTIVE                       Thu Oct 17 05:14:15 2019
Module pm2    ACTIVE                       Thu Oct 17 05:14:24 2019
 
Active Parent OAM Performance Module is 'pm1'
 
MariaDB ColumnStore Process statuses
 
Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
ProcessMonitor      um1       ACTIVE            Thu Oct 17 05:30:39 2019       14920
ServerMonitor       um1       ACTIVE            Thu Oct 17 05:14:17 2019        9740
DBRMWorkerNode      um1       ACTIVE            Thu Oct 17 05:14:17 2019        9756
ExeMgr              um1       ACTIVE            Thu Oct 17 05:14:26 2019       10663
DDLProc             um1       ACTIVE            Thu Oct 17 05:14:30 2019       10681
DMLProc             um1       ACTIVE            Thu Oct 17 05:14:35 2019       10721
mysqld              um1       ACTIVE            Thu Oct 17 05:36:26 2019       17210
 
ProcessMonitor      pm1       ACTIVE            Thu Oct 17 05:12:37 2019        8720
ProcessManager      pm1       ACTIVE            Thu Oct 17 05:12:43 2019        8800
DBRMControllerNode  pm1       ACTIVE            Thu Oct 17 05:14:09 2019       10330
ServerMonitor       pm1       ACTIVE            Thu Oct 17 05:14:10 2019       10371
DBRMWorkerNode      pm1       ACTIVE            Thu Oct 17 05:14:11 2019       10407
PrimProc            pm1       ACTIVE            Thu Oct 17 05:14:15 2019       10532
WriteEngineServer   pm1       ACTIVE            Thu Oct 17 05:14:16 2019       10583
 
ProcessMonitor      pm2       ACTIVE            Thu Oct 17 05:14:02 2019        8682
ProcessManager      pm2       HOT_STANDBY       Thu Oct 17 05:14:03 2019        8726
DBRMControllerNode  pm2       COLD_STANDBY      Thu Oct 17 05:14:17 2019
ServerMonitor       pm2       ACTIVE            Thu Oct 17 05:14:20 2019        8759
DBRMWorkerNode      pm2       ACTIVE            Thu Oct 17 05:14:20 2019        8775
PrimProc            pm2       ACTIVE            Thu Oct 17 05:14:24 2019        8792
WriteEngineServer   pm2       ACTIVE            Thu Oct 17 05:14:25 2019        8806
 
Active Alarm Counts: Critical = 0, Major = 0, Minor = 0, Warning = 0, Info = 0

[root@um1 jens]# mcsmysql test -e "CREATE TABLE tmp1 (i int) engine=columnstore;"
ERROR 1815 (HY000) at line 1: Internal error: CAL0009: Error occured when calling makeJobList



 Comments   
Comment by Andrew Hutchings (Inactive) [ 2019-11-12 ]

Marked as duplicate of MCOL-3589. The "stop" command isn't bringing the module down properly so it comes up again in a bad state. Once the "stop" command works correctly the "start" command should also work again.

Comment by Andrew Hutchings (Inactive) [ 2019-11-20 ]

Reopened as Jens is still hitting this after the "stop" fix.

Comment by Andrew Hutchings (Inactive) [ 2019-11-21 ]

My notes: In the provided logs ProcMon on UM1 appears to get started twice on the "start" command. This causes a fight between the two and lots of process restarts.
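
One way to check for this condition on a module is to count the ProcMon instances in its `ps -aux` output. The following is a minimal Python sketch, not part of ColumnStore: the function name `find_procmon_pids` is hypothetical, and it assumes the 11-column `ps -aux` layout shown in the description.

```python
import os
from typing import List


def find_procmon_pids(ps_output: str) -> List[int]:
    """Return the PIDs of processes whose executable is ProcMon."""
    pids = []
    for line in ps_output.splitlines():
        # `ps -aux` prints 10 fixed columns followed by COMMAND,
        # and COMMAND itself may contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or fields[0] == "USER":
            continue
        command = fields[10].split()
        # Compare the executable only, so the run.sh wrapper that
        # receives ProcMon as an argument is not counted.
        if command and os.path.basename(command[0]) == "ProcMon":
            pids.append(int(fields[1]))
    return pids


# Two lines taken from the ps output in the description.
sample = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     14918  0.0  0.0 113184  1432 pts/0    S    05:30   0:00 /bin/bash /usr/local/mariadb/columnstore/bin/run.sh -l /tmp/columnstore_tmp_files /usr/local/mariadb/columnstore/bin/ProcMon
root     14920  7.7  1.0 242448 30816 pts/0    Sl   05:30   0:30 /usr/local/mariadb/columnstore/bin/ProcMon
"""

print(find_procmon_pids(sample))  # [14920]
```

More than one PID in the result would be consistent with ProcMon having been started twice.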

Comment by Andrew Hutchings (Inactive) [ 2019-11-22 ]

The "columnstore" script now makes sure that ProcMon and ProcMgr are shut down during 'stop', and also during 'start' if they have been left behind.

Comment by Jens Röwekamp (Inactive) [ 2019-11-25 ]

Hi LinuxJedi, I tested the latest nightly RPMs (gitversionEngine: 6b91667) against MCOL-3564 and the issue still seems to exist. After "columnstore start" the stopped module is still MAN_OFFLINE.

MCOL-3564 - 6b91667 - VirtualBox.tar

Comment by Jens Röwekamp (Inactive) [ 2019-11-26 ]

For future reference:
After executing "columnstore start" it is necessary to execute "mcsadmin startsystem" on PM1 to get ColumnStore up and running again.

"mcsadmin getsysteminfo" will confirm that ColumnStore is in an ACTIVE state.

Unfortunately, DML on UM1 just hangs.

ColumnStore support report and notes to reproduce on virtual machines attached.
1UM 2PM - PM1 columnstore stop - columnstore start - mcsadmin startsystem.tar
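
The check against "mcsadmin getsysteminfo" can be scripted. The following is a minimal Python sketch that scans the report text for rows whose status is not healthy. The function name `unhealthy_processes` is hypothetical, and treating HOT_STANDBY and COLD_STANDBY as healthy is an assumption based on the reports in this ticket, where pm2 shows those states while the system is ACTIVE.

```python
import re
from typing import List, Tuple

# Matches rows such as:
#   mysqld              um1       MAN_OFFLINE       Thu Oct 17 05:30:10 2019
# "Module um1  ACTIVE ..." rows match too and are reported under the name "Module".
ROW = re.compile(
    r"^(?P<process>\S+)\s+(?P<module>(?:um|pm)\d+)\s+(?P<status>[A-Z_]+)\s"
)

# Assumption: standby states are normal on the non-parent PM.
OK_STATES = {"ACTIVE", "HOT_STANDBY", "COLD_STANDBY"}


def unhealthy_processes(report: str) -> List[Tuple[str, str, str]]:
    """Return (process, module, status) for every row that is not healthy."""
    bad = []
    for line in report.splitlines():
        match = ROW.match(line)
        if match and match.group("status") not in OK_STATES:
            bad.append(
                (match.group("process"), match.group("module"), match.group("status"))
            )
    return bad


# Rows taken from the getsysteminfo output in the description.
sample = """\
System        ACTIVE                       Thu Oct 17 05:14:37 2019
DMLProc             um1       ACTIVE            Thu Oct 17 05:14:35 2019       10721
mysqld              um1       MAN_OFFLINE       Thu Oct 17 05:30:10 2019
ProcessManager      pm2       HOT_STANDBY       Thu Oct 17 05:14:03 2019        8726
DBRMControllerNode  pm2       COLD_STANDBY      Thu Oct 17 05:14:17 2019
"""

print(unhealthy_processes(sample))  # [('mysqld', 'um1', 'MAN_OFFLINE')]
```

An empty result only says that every listed process reports a healthy status. As noted above, DML can still hang in that state.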

Comment by Daniel Lee (Inactive) [ 2019-11-27 ]

Build verified: 1.4.1-1
engine commit: 57724e5

UM1

[root@localhost bin]# ./columnstore stop
Stopping module um1 of MariaDB Columnstore Database Platform

WARNING: running on non Parent OAM Module, can't make configuration changes in this session.
Access Console from 'pm1' if you need to make changes.

stopmodule Wed Nov 27 17:01:29 2019

Stopping Module(s)
Successful stop of Module(s)

On PM1

The password is actually correct, despite the "Invalid Password" messages in the output.

First startsystem returned an error:

ERROR: Connection refused

mcsadmin> startsystem vagrant
startsystem Wed Nov 27 17:05:01 2019

startSystem command, 'columnstore' service is down, sending command to
start the 'columnstore' service on all modules

System being started, please wait...ERROR: Connection refused

Invalid Password when running 'columnstore start' on module um1, can retry by providing password as the second argument

        • startSystem Failed

ERROR: Connection refused

Invalid Password when running 'columnstore start' on module pm2, can retry by providing password as the second argument

        • startSystem Failed

mcsadmin> startsystem vagrant
startsystem Wed Nov 27 17:05:17 2019

startSystem command, 'columnstore' service is down, sending command to
start the 'columnstore' service on all modules

System being started, please wait................
Successful start of System

Serious issue: the inserted row is lost.

MariaDB [mytest]> create table t1 (c1 int) engine=columnstore;
Query OK, 0 rows affected (0.477 sec)

MariaDB [mytest]> insert into t1 values (1);
Query OK, 1 row affected (0.491 sec)

MariaDB [mytest]> select * from t1;
+------+
| c1   |
+------+
|    1 |
+------+
1 row in set (0.160 sec)

At this point, I stopped the columnstore service on um1 and restarted the stack from pm1.

MariaDB [mytest]> select * from t1;
Empty set (0.190 sec)

MariaDB [mytest]>

The row is now missing.

A further insert hangs:

MariaDB [mytest]> insert into t1 values (2);

Comment by Andrew Hutchings (Inactive) [ 2019-11-27 ]

This behaviour is not supported in 1.4

Comment by Daniel Lee (Inactive) [ 2019-11-27 ]

mcsadmin> shutdownsystem
shutdownsystem Wed Nov 27 17:14:31 2019

This command stops the processing of applications on all Modules within the MariaDB ColumnStore System

Checking for active transactions
The following tables are locked:
LockID  Name       Process  PID    Session  CreationTime            State      DBRoots
4       mytest.t1  DMLProc  26911  15       2019-11-27 05:12:59 PM  Abandoned  1,2
Your options are:
Cancel – Cancel the shutdown request
Wait – Wait for write operations to end and then shutdown
Force – Force a shutdown
What would you like to do: [Cancel]: force

Stopping System...

        • stopSystem Failed : check log files

Shutting Down System...
Successful shutdown of System

mcsadmin> startsystem vagrant
startsystem Wed Nov 27 17:28:28 2019

startSystem command, 'columnstore' service is down, sending command to
start the 'columnstore' service on all modules

System being started, please wait.........................................

TIMEOUT: ProcMon not responding to getSystemStatus

        • startSystem Failed : check log files