[MCOL-4440] Multi-Node CS 5.4 with S3 storage failover gets stuck, requires API call to restart cluster Created: 2020-12-10  Updated: 2021-07-08  Resolved: 2021-05-21

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: 5.4.3
Fix Version/s: 5.6.1

Type: Bug Priority: Major
Reporter: Daniel Almeida (Inactive) Assignee: Todd Stoffel (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Problem/Incident
causes MCOL-4804 CMAPI needs to dynamically update IP ... Closed
Relates
relates to MCOL-4670 Primary Node Failover in a cluster wi... Closed
Sprint: 2021-1, 2021-2, 2021-3, 2021-4, 2021-5, 2021-6, 2021-7, 2021-8

 Description   

I experienced the following with ColumnStore multi-node environment with S3 storage, after a primary failure:

FYI, this environment was deployed using the following:
https://github.com/mariadb-corporation/columnstore-ansible

1. The primary node is shutdown intentionally
2. Replica took over as new primary
3. Waited a few minutes and I was able to select records from both new Primary and existing Replica (while the old primary has been shutdown)
4. Attempted to create a record and got stuck, got the following error after 10 minutes:

ERROR 1815 (HY000): Internal error: CAL0001: Insert Failed:   a BRM Set hwm error. [BRM error status: DBRM is in READ-ONLY mode]

5. Restarted the cluster via api
6. I was able to select and insert records

Full details and logs below

Environment:

Primary: lab-cs02
Replicas: lab-cs01 and lab-cs03

# BEGIN ANSIBLE MANAGED BLOCK
192.168.88.123 lab-cs01
192.168.88.124 lab-cs02
192.168.88.125 lab-cs03
192.168.88.126 lab-mx01

storage: s3

MariaDB [test]> show global variables like 'version';
+---------------+---------------------------------+
| Variable_name | Value                           |
+---------------+---------------------------------+
| version       | 10.5.6-4-MariaDB-enterprise-log |
+---------------+---------------------------------+
1 row in set (0.002 sec)
 
MariaDB [test]> show status like 'Column%version';
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| Columnstore_version | 5.4.3 |
+---------------------+-------+
1 row in set (0.002 sec)

1. Simulated a failed primary node by shutting down host lab-cs02

#############################################################
Logs from lab-cs01 showing it taking over as the new primary
#############################################################

######################################
/var/log/mariadb/columnstore/debug.log
######################################

Dec 10 14:23:31 lab-cs01 IDBFile[5737]: 31.915371 |0|0|0| D 35 CAL0002: IDBFactory::installPlugin: installed filesystem plugin libcloudio.so
Dec 10 14:23:34 lab-cs01 StorageManager[5753]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:24:28 lab-cs01 StorageManager[5349]: Ownership: waiting to get data1
Dec 10 14:24:28 lab-cs01 controllernode[5737]: 28.313018 |0|0|0| C 29 CAL0000: ExtentMap::save(): open:  File exists
Dec 10 14:24:28 lab-cs01 StorageManager[5766]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:24:28 lab-cs01 StorageManager[5766]: Cache/cache_size = 8589934592
Dec 10 14:24:29 lab-cs01 StorageManager[5766]: S3Storage: S3 connectivity & permissions are OK
Dec 10 14:24:29 lab-cs01 StorageManager[5766]: max_concurrent_downloads = 21
Dec 10 14:24:29 lab-cs01 StorageManager[5766]: Ownership: waiting to get data1
Dec 10 14:24:29 lab-cs01 StorageManager[5803]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:24:29 lab-cs01 StorageManager[5803]: Cache/cache_size = 8589934592
Dec 10 14:24:29 lab-cs01 StorageManager[5803]: S3Storage: S3 connectivity & permissions are OK
Dec 10 14:24:29 lab-cs01 StorageManager[5803]: max_concurrent_downloads = 21
Dec 10 14:24:29 lab-cs01 StorageManager[5803]: max_concurrent_uploads = 21
Dec 10 14:24:29 lab-cs01 StorageManager[5803]: StorageManager started.
Dec 10 14:24:29 lab-cs01 StorageManager[5803]: SessionManager waiting for sockets.
Dec 10 14:24:30 lab-cs01 StorageManager[5766]: Ownership: waiting to get data1
Dec 10 14:24:30 lab-cs01 StorageManager[5818]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:24:30 lab-cs01 StorageManager[5803]: Ownership: waiting to get data1
Dec 10 14:24:31 lab-cs01 StorageManager[5766]: Ownership: waiting to get data1
Dec 10 14:24:31 lab-cs01 StorageManager[5803]: Ownership: waiting to get data1
Dec 10 14:24:32 lab-cs01 StorageManager[5766]: Ownership: waiting to get data1
Dec 10 14:24:32 lab-cs01 StorageManager[5803]: Ownership: waiting to get data1
Dec 10 14:24:33 lab-cs01 StorageManager[5766]: Ownership: waiting to get data1
Dec 10 14:24:33 lab-cs01 StorageManager[5803]: Ownership: waiting to get data1
Dec 10 14:24:34 lab-cs01 StorageManager[5766]: Ownership: waiting to get data1
Dec 10 14:24:34 lab-cs01 StorageManager[5803]: Ownership: waiting to get data1
Dec 10 14:24:35 lab-cs01 StorageManager[5766]: Ownership: waiting to get data1
Dec 10 14:24:35 lab-cs01 StorageManager[5803]: Ownership: waiting to get data1
Dec 10 14:24:36 lab-cs01 StorageManager[5766]: Ownership: waiting to get data1
Dec 10 14:24:36 lab-cs01 StorageManager[5803]: Ownership: waiting to get data1
Dec 10 14:24:37 lab-cs01 StorageManager[5766]: Ownership: waiting to get data1
Dec 10 14:24:37 lab-cs01 StorageManager[5803]: Ownership: waiting to get data1
Dec 10 14:24:38 lab-cs01 StorageManager[5766]: Ownership: waiting to get data1
Dec 10 14:24:38 lab-cs01 StorageManager[5803]: Ownership: waiting to get data1
Dec 10 14:24:39 lab-cs01 StorageManager[5766]: Ownership: taking ownership of data1
Dec 10 14:24:39 lab-cs01 StorageManager[5766]: max_concurrent_uploads = 21
Dec 10 14:24:39 lab-cs01 StorageManager[5766]: Ownership: releasing ownership of data1
Dec 10 14:24:39 lab-cs01 StorageManager[5803]: Ownership: waiting to get data1
Dec 10 14:24:39 lab-cs01 StorageManager[5803]: Ownership: taking ownership of data1
Dec 10 14:24:39 lab-cs01 StorageManager[5839]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:24:39 lab-cs01 StorageManager[5865]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:24:40 lab-cs01 StorageManager[5890]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:24:40 lab-cs01 StorageManager[5895]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:24:40 lab-cs01 StorageManager[5900]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:24:40 lab-cs01 StorageManager[5905]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:24:40 lab-cs01 StorageManager[5910]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:24:40 lab-cs01 IDBFile[5915]: 40.578151 |0|0|0| D 35 CAL0002: IDBFactory::installPlugin: installed filesystem plugin libcloudio.so
Dec 10 14:24:42 lab-cs01 IDBFile[5919]: 42.697662 |0|0|0| D 35 CAL0002: IDBFactory::installPlugin: installed filesystem plugin libcloudio.so
Dec 10 14:24:50 lab-cs01 IDBFile[5930]: 50.808902 |0|0|0| D 35 CAL0002: IDBFactory::installPlugin: installed filesystem plugin libcloudio.so
Dec 10 14:24:50 lab-cs01 IDBFile[5937]: 50.882006 |0|0|0| D 35 CAL0002: IDBFactory::installPlugin: installed filesystem plugin libcloudio.so
Dec 10 14:24:50 lab-cs01 controllernode[5930]: 50.892975 |0|0|0| D 29 CAL0000: DBRM Controller: Connected to DBRM_Worker1
Dec 10 14:24:50 lab-cs01 controllernode[5930]: 50.893857 |0|0|0| D 29 CAL0000: DBRM Controller: Connected to DBRM_Worker2
Dec 10 14:24:52 lab-cs01 controllernode[5930]: 52.974593 |0|0|0| C 29 CAL0000: InetStreamSocket::readToMagic(): I/O error1: rc-1; poll signal interrupt ( POLLHUP POLLERR )
Dec 10 14:24:53 lab-cs01 IDBFile[5981]: 53.039008 |0|0|0| D 35 CAL0002: IDBFactory::installPlugin: installed filesystem plugin libcloudio.so
Dec 10 14:24:53 lab-cs01 controllernode[5930]: 53.055978 |0|0|0| C 29 CAL0000: InetStreamSocket::readToMagic(): I/O error1: rc-1; poll signal interrupt ( POLLHUP POLLERR )
Dec 10 14:24:55 lab-cs01 IDBFile[6021]: 55.096453 |0|0|0| D 35 CAL0002: IDBFactory::installPlugin: installed filesystem plugin libcloudio.so
Dec 10 14:24:55 lab-cs01 DMLProc[6021]: 55.127056 |0|0|0| I 20 CAL0002: DMLProc starts rollbackAll.
Dec 10 14:24:55 lab-cs01 IDBFile[6024]: 55.142819 |0|0|0| D 35 CAL0002: IDBFactory::installPlugin: installed filesystem plugin libcloudio.so
Dec 10 14:24:55 lab-cs01 DMLProc[6021]: 55.230228 |0|0|0| I 20 CAL0002: DMLProc will rollback 0 tables.
Dec 10 14:24:55 lab-cs01 DMLProc[6021]: 55.278898 |0|0|0| I 20 CAL0002: DMLProc finished rollbackAll.

###########################################
journalctl -u mariadb-columnstore-cmapi -f
###########################################

Dec 10 14:23:31 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:23:31] cmapi_server DEBUG    put_begin starts
Dec 10 14:23:31 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:23:31] cmapi_server DEBUG    put_begin JSON body {'id': 760796, 'timeout': 299}
Dec 10 14:23:31 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:23:31] cmapi_server DEBUG    put_begin returns {'timestamp': '2020-12-10 14:23:31.209538'}
Dec 10 14:23:31 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020:14:23:31] "PUT /cmapi/0.4.0/node/begin HTTP/1.1" 200 43 "" "python-requests/2.23.0"
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  FA.startTransaction(): started transaction 760796
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  FA.deactivateNodes():  deactivating nodes: ['192.168.88.124']
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  FA.deactivateNodes(): deactivating node 192.168.88.124
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  Waiting for Maxscale to choose the new repl master...
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  The new replication master is lab-cs01
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  _move_primary_node(): dbroot 1 is assigned to ('lab-cs01', '192.168.88.123')
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  get_module_net_address Module 3 was not found.
Dec 10 14:23:31 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:23:31] cmapi_server DEBUG    put_config starts
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root   -- Matching against ModuleIPAddr2-1-3, which says 192.168.88.123
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  Wrote 'pm2' to /var/lib/columnstore/local/module
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  apply Running stop on mcs-dmlproc. With sudo False and is_primary True
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  stop running systemctl stop mcs-dmlproc
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  apply Running stop on mcs-ddlproc. With sudo False and is_primary True
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  stop running systemctl stop mcs-ddlproc
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  apply Running stop on mcs-primproc. With sudo False and is_primary True
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  stop running systemctl stop mcs-primproc
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  apply Running stop on mcs-writeengineserver. With sudo False and is_primary True
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  stop running systemctl stop mcs-writeengineserver
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  apply Running stop on mcs-exemgr. With sudo False and is_primary True
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  stop running systemctl stop mcs-exemgr
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  apply Running stop on mcs-controllernode. With sudo False and is_primary True
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  stop running systemctl stop mcs-controllernode
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  apply Running stop on mcs-workernode. With sudo False and is_primary True
Dec 10 14:23:31 lab-cs01 python3[678]: [10/Dec/2020 14:23:31] root  stop running systemctl stop mcs-workernode@1 mcs-workernode@2
Dec 10 14:23:31 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:23:31] cmapi_server DEBUG    get_primary starts
Dec 10 14:23:31 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:23:31] cmapi_server DEBUG    get_primary returns {'is_primary': 'True'}
Dec 10 14:23:31 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020:14:23:31] "GET /cmapi/0.4.0/node/primary HTTP/1.1" 200 22 "" "python-requests/2.23.0"
Dec 10 14:23:32 lab-cs01 python3[678]: [10/Dec/2020 14:23:32] root  Loaded the config file, my name is 192.168.88.123
Dec 10 14:23:33 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020 14:23:33] cmapi_server DEBUG    get_brm_bytes starts
Dec 10 14:23:33 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020 14:23:33] cmapi_server DEBUG    get_brm_bytes returns em from S3.
Dec 10 14:23:33 lab-cs01 su[5752]: (to mysql) root on none
Dec 10 14:24:28 lab-cs01 python3[678]: Error reading data1/systemFiles/dbrm/BRM_saves_current: Success
Dec 10 14:24:28 lab-cs01 su[5761]: (to mysql) root on none
Dec 10 14:24:29 lab-cs01 python3[678]: [10/Dec/2020 14:24:29] root  apply Running stop on mcs-storagemanager. With sudo False and is_primary True
Dec 10 14:24:29 lab-cs01 python3[678]: [10/Dec/2020 14:24:29] root  stop running systemctl stop mcs-storagemanager
Dec 10 14:24:29 lab-cs01 python3[678]: [10/Dec/2020 14:24:29] root  apply Running start on mcs-workernode. With sudo False and is_primary True
Dec 10 14:24:29 lab-cs01 python3[678]: [10/Dec/2020 14:24:29] root  start running systemctl start mcs-workernode@1.service
Dec 10 14:24:30 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:24:30] cmapi_server DEBUG    get_brm_bytes starts
Dec 10 14:24:30 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:24:30] cmapi_server DEBUG    get_brm_bytes returns em from S3.
Dec 10 14:24:30 lab-cs01 su[5817]: (to mysql) root on none
Dec 10 14:24:30 lab-cs01 python3[678]: [10/Dec/2020 14:24:30] root  Loaded the config file, my name is 192.168.88.123
Dec 10 14:24:39 lab-cs01 python3[678]: Error reading data1/systemFiles/dbrm/_em: No such file or directory
Dec 10 14:24:39 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020 14:24:39] cmapi_server DEBUG    get_brm_bytes returns.
Dec 10 14:24:39 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020:14:24:39] "GET /cmapi/0.4.0/node/meta/em HTTP/1.1" 200 - "" "python-requests/2.23.0"
Dec 10 14:24:39 lab-cs01 su[5837]: (to mysql) root on none
Dec 10 14:24:39 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:24:39] cmapi_server DEBUG    get_brm_bytes returns.
Dec 10 14:24:39 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020:14:24:39] "GET /cmapi/0.4.0/node/meta/em HTTP/1.1" 200 3564 "" "python-requests/2.23.0"
Dec 10 14:24:39 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:24:39] cmapi_server DEBUG    get_brm_bytes starts
Dec 10 14:24:39 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:24:39] cmapi_server DEBUG    get_brm_bytes returns journal from S3.
Dec 10 14:24:39 lab-cs01 su[5843]: (to mysql) root on none
Dec 10 14:24:40 lab-cs01 su[5888]: (to mysql) root on none
Dec 10 14:24:40 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:24:40] cmapi_server DEBUG    get_brm_bytes returns.
Dec 10 14:24:40 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020:14:24:40] "GET /cmapi/0.4.0/node/meta/journal HTTP/1.1" 200 9146 "" "python-requests/2.23.0"
Dec 10 14:24:40 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:24:40] cmapi_server DEBUG    get_brm_bytes starts
Dec 10 14:24:40 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:24:40] cmapi_server DEBUG    get_brm_bytes returns vbbm from S3.
Dec 10 14:24:40 lab-cs01 su[5894]: (to mysql) root on none
Dec 10 14:24:40 lab-cs01 su[5898]: (to mysql) root on none
Dec 10 14:24:40 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:24:40] cmapi_server DEBUG    get_brm_bytes returns.
Dec 10 14:24:40 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020:14:24:40] "GET /cmapi/0.4.0/node/meta/vbbm HTTP/1.1" 200 1716 "" "python-requests/2.23.0"
Dec 10 14:24:40 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:24:40] cmapi_server DEBUG    get_brm_bytes starts
Dec 10 14:24:40 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:24:40] cmapi_server DEBUG    get_brm_bytes returns vss from S3.
Dec 10 14:24:40 lab-cs01 su[5904]: (to mysql) root on none
Dec 10 14:24:40 lab-cs01 su[5908]: (to mysql) root on none
Dec 10 14:24:40 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:24:40] cmapi_server DEBUG    get_brm_bytes returns.
Dec 10 14:24:40 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020:14:24:40] "GET /cmapi/0.4.0/node/meta/vss HTTP/1.1" 200 2528 "" "python-requests/2.23.0"
Dec 10 14:24:42 lab-cs01 python3[678]: [10/Dec/2020 14:24:42] root  apply Waiting for all workernodes to come up before starting controllernode on the primary.
Dec 10 14:24:42 lab-cs01 python3[678]: [10/Dec/2020 14:24:42] root  apply Trying...
Dec 10 14:24:43 lab-cs01 python3[678]: [10/Dec/2020 14:24:43] root  apply Trying...
Dec 10 14:24:43 lab-cs01 python3[678]: [10/Dec/2020 14:24:43] root  apply Running start on mcs-controllernode. With sudo False and is_primary True
Dec 10 14:24:43 lab-cs01 python3[678]: [10/Dec/2020 14:24:43] root  start running systemctl start mcs-controllernode
Dec 10 14:24:50 lab-cs01 python3[678]: [10/Dec/2020 14:24:50] root  apply Running start on mcs-primproc. With sudo False and is_primary True
Dec 10 14:24:50 lab-cs01 python3[678]: [10/Dec/2020 14:24:50] root  start running systemctl start mcs-primproc
Dec 10 14:24:52 lab-cs01 python3[678]: [10/Dec/2020 14:24:52] root  apply Running start on mcs-exemgr. With sudo False and is_primary True
Dec 10 14:24:52 lab-cs01 python3[678]: [10/Dec/2020 14:24:52] root  start running systemctl start mcs-exemgr
Dec 10 14:24:52 lab-cs01 python3[678]: [10/Dec/2020 14:24:52] root  apply Running start on mcs-writeengineserver. With sudo False and is_primary True
Dec 10 14:24:52 lab-cs01 python3[678]: [10/Dec/2020 14:24:52] root  start running systemctl start mcs-writeengineserver
Dec 10 14:24:52 lab-cs01 python3[678]: [10/Dec/2020 14:24:52] root  apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Dec 10 14:24:52 lab-cs01 python3[678]: [10/Dec/2020 14:24:52] root  apply Trying...
Dec 10 14:24:52 lab-cs01 python3[678]: [10/Dec/2020 14:24:52] root  apply Running start on mcs-dmlproc. With sudo False and is_primary True
Dec 10 14:24:52 lab-cs01 python3[678]: [10/Dec/2020 14:24:52] root  start running systemctl start mcs-dmlproc
Dec 10 14:24:53 lab-cs01 python3[678]: [10/Dec/2020 14:24:53] root  apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Dec 10 14:24:53 lab-cs01 python3[678]: [10/Dec/2020 14:24:53] root  apply Trying...
Dec 10 14:24:53 lab-cs01 python3[678]: [10/Dec/2020 14:24:53] root  apply Running start on mcs-ddlproc. With sudo False and is_primary True
Dec 10 14:24:53 lab-cs01 python3[678]: [10/Dec/2020 14:24:53] root  start running systemctl start mcs-ddlproc
Dec 10 14:24:53 lab-cs01 su[6014]: (to mysql) root on none
Dec 10 14:24:53 lab-cs01 python3[678]: Exiting, dbbuilder can only be run on the Active Parent OAM Module
Dec 10 14:24:53 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:24:53] cmapi_server DEBUG    node is not ready to accept queries
Dec 10 14:24:54 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:24:54] cmapi_server DEBUG    node is not ready to accept queries
Dec 10 14:24:55 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:24:55] cmapi_server DEBUG    node is not ready to accept queries
Dec 10 14:24:56 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:24:56] cmapi_server DEBUG    node is ready to accept queries
Dec 10 14:24:56 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020:14:24:56] "PUT /cmapi/0.4.0/node/config HTTP/1.1" 200 43 "" "python-requests/2.23.0"
Dec 10 14:24:56 lab-cs01 python3[678]: [10/Dec/2020 14:24:56] root  broadcast_new_config(): successfully pushed the new config file to ['192.168.88.125', '192.168.88.123']
Dec 10 14:24:56 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:24:56] cmapi_server DEBUG    put_commit starts
Dec 10 14:24:56 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:24:56] cmapi_server DEBUG    put_commit returns {'timestamp': '2020-12-10 14:24:56.285365'}
Dec 10 14:24:56 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020:14:24:56] "PUT /cmapi/0.4.0/node/commit HTTP/1.1" 200 43 "" "python-requests/2.23.0"
Dec 10 14:24:56 lab-cs01 python3[678]: [10/Dec/2020 14:24:56] root  FA.commitTransaction(): committed transaction 760796

2. Once the logs showed replica took over as new primary, I attempted to run a select query on the new Primary host

#########################
Select works as expected
#########################

MariaDB [test]> select * from t1;
+----------+---------------------+
| hostName | dateCreated         |
+----------+---------------------+
| lab-cs02 | 2020-12-10 14:21:03 |
+----------+---------------------+
1 row in set (10.395 sec)
 
MariaDB [test]> select * from t1;
+----------+---------------------+
| hostName | dateCreated         |
+----------+---------------------+
| lab-cs02 | 2020-12-10 14:21:03 |
+----------+---------------------+
1 row in set (0.023 sec)

######################################
/var/log/mariadb/columnstore/debug.log
######################################

Dec 10 14:26:22 lab-cs01 ExeMgr[5969]: 22.865470 |2181038095|0|0| D 16 CAL0041: Start SQL statement: select objectid,columnname from syscolumn where schema='test' and tablename='t1' --columnRIDs/FE; ||
Dec 10 14:26:23 lab-cs01 ExeMgr[5969]: 23.031321 |2181038095|0|0| D 16 CAL0042: End SQL statement
Dec 10 14:26:23 lab-cs01 ExeMgr[5969]: 23.085084 |33554447|0|0| D 16 CAL0041: Start SQL statement: select * from t1; |test|
Dec 10 14:26:23 lab-cs01 StorageManager[5803]: Ownership: waiting to get data2
Dec 10 14:26:24 lab-cs01 StorageManager[5803]: Ownership: waiting to get data2
Dec 10 14:26:25 lab-cs01 StorageManager[5803]: Ownership: waiting to get data2
Dec 10 14:26:26 lab-cs01 StorageManager[5803]: Ownership: waiting to get data2
Dec 10 14:26:27 lab-cs01 StorageManager[5803]: Ownership: waiting to get data2
Dec 10 14:26:28 lab-cs01 StorageManager[5803]: Ownership: waiting to get data2
Dec 10 14:26:29 lab-cs01 StorageManager[5803]: Ownership: waiting to get data2
Dec 10 14:26:30 lab-cs01 StorageManager[5803]: Ownership: waiting to get data2
Dec 10 14:26:31 lab-cs01 StorageManager[5803]: Ownership: waiting to get data2
Dec 10 14:26:32 lab-cs01 StorageManager[5803]: Ownership: waiting to get data2
Dec 10 14:26:33 lab-cs01 StorageManager[5803]: Ownership: taking ownership of data2
Dec 10 14:26:33 lab-cs01 ExeMgr[5969]: 33.225912 |33554447|0|0| D 16 CAL0042: End SQL statement
Dec 10 14:27:19 lab-cs01 ExeMgr[5969]: 19.489645 |2181038114|0|0| D 16 CAL0041: Start SQL statement: select objectid,columnname from syscolumn where schema='test' and tablename='t1' --columnRIDs/FE; ||
Dec 10 14:27:19 lab-cs01 ExeMgr[5969]: 19.517602 |2181038114|0|0| D 16 CAL0042: End SQL statement
Dec 10 14:27:19 lab-cs01 ExeMgr[5969]: 19.537583 |33554466|0|0| D 16 CAL0041: Start SQL statement: select * from t1; |test|
Dec 10 14:27:19 lab-cs01 ExeMgr[5969]: 19.542277 |33554466|0|0| D 16 CAL0042: End SQL statement

4. Attempted to create record and got stuck

MariaDB [test]> insert into t1 (hostName) values (@@hostname);
ERROR 1815 (HY000): Internal error: CAL0001: Insert Failed:   a BRM Set hwm error. [BRM error status: DBRM is in READ-ONLY mode]

5. Waited over 10 minutes and got the errors below

######################################
/var/log/mariadb/columnstore/debug.log
######################################

Dec 10 14:27:28 lab-cs01 dmlpackageproc[6021]: 28.717919 |33554447|14|0| D 21 CAL0001: Start SQL statement:  insert into t1 (hostName) values (@@hostname); |test|
Dec 10 14:27:29 lab-cs01 controllernode[5930]: 29.007058 |0|0|0| C 29 CAL0000: DBRM Controller: image inconsistency detected at node 2.  Verifying response to command 29, length 15
Dec 10 14:32:19 lab-cs01 controllernode[5930]: 19.215197 |0|0|0| C 29 CAL0000: DBRM Controller: Network error reading from node 1.  Reading response to command 29, length 15.  0 length response, possible time-out.  Will see if retry is possible.
Dec 10 14:37:18 lab-cs01 controllernode[5930]: 18.404727 |0|0|0| C 29 CAL0000: A node is unresponsive for cmd = 29, no reconfigure in at least 300 seconds.  Setting read-only mode.
Dec 10 14:37:18 lab-cs01 dmlpackageproc[6021]: 18.485784 |33554447|14|0| D 21 CAL0001: End SQL statement
Dec 10 14:37:18 lab-cs01 dmlpackageproc[6021]: 18.485860 |0|0|0| E 21 CAL0001: Insert Failed:   a BRM Set hwm error. [BRM error status: DBRM is in READ-ONLY mode]

6. restarted the cluster via api

curl -s -X PUT https://$(hostname):8640/cmapi/0.4.0/cluster/start --header 'Content-Type:application/json' --header 'x-api-key:8c96f1852c2c049cde9e266e96d98e3c32c1dcee61ba1fda03aa40e077e8927e' --data '{"timeout":60}' -k | jq .

######################################
/var/log/mariadb/columnstore/debug.log
######################################

Dec 10 14:42:45 lab-cs01 StorageManager[6627]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:42:45 lab-cs01 StorageManager[6633]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:42:45 lab-cs01 StorageManager[6638]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:42:45 lab-cs01 StorageManager[6643]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:42:46 lab-cs01 StorageManager[6648]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:42:46 lab-cs01 StorageManager[6653]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:42:46 lab-cs01 StorageManager[6658]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:42:46 lab-cs01 StorageManager[6663]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:46:44 lab-cs01 StorageManager[5803]: SessionManager Caught Signal 15
Dec 10 14:46:44 lab-cs01 StorageManager[5803]: Shutdown StorageManager...
Dec 10 14:46:44 lab-cs01 StorageManager[5803]: Ownership: releasing ownership of data1
Dec 10 14:46:44 lab-cs01 StorageManager[5803]: Ownership: releasing ownership of data2
Dec 10 14:46:44 lab-cs01 StorageManager[5803]: StorageManager Shutdown Complete.
Dec 10 14:46:45 lab-cs01 StorageManager[6998]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:46:45 lab-cs01 StorageManager[6998]: Cache/cache_size = 8589934592
Dec 10 14:46:45 lab-cs01 StorageManager[6998]: S3Storage: S3 connectivity & permissions are OK
Dec 10 14:46:45 lab-cs01 StorageManager[6998]: max_concurrent_downloads = 21
Dec 10 14:46:45 lab-cs01 StorageManager[6998]: max_concurrent_uploads = 21
Dec 10 14:46:45 lab-cs01 StorageManager[6998]: StorageManager started.
Dec 10 14:46:45 lab-cs01 StorageManager[6998]: SessionManager waiting for sockets.
Dec 10 14:46:46 lab-cs01 StorageManager[7013]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:46:46 lab-cs01 StorageManager[6998]: Ownership: taking ownership of data1
Dec 10 14:46:46 lab-cs01 StorageManager[7019]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:46:46 lab-cs01 StorageManager[7037]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:46:46 lab-cs01 StorageManager[7042]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:46:46 lab-cs01 StorageManager[7047]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:46:46 lab-cs01 StorageManager[7052]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:46:46 lab-cs01 StorageManager[7057]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:46:46 lab-cs01 StorageManager[7062]: Using the config file found at /etc/columnstore/storagemanager.cnf
Dec 10 14:46:46 lab-cs01 IDBFile[7067]: 46.778988 |0|0|0| D 35 CAL0002: IDBFactory::installPlugin: installed filesystem plugin libcloudio.so
Dec 10 14:46:48 lab-cs01 IDBFile[7071]: 48.933349 |0|0|0| D 35 CAL0002: IDBFactory::installPlugin: installed filesystem plugin libcloudio.so
Dec 10 14:46:57 lab-cs01 IDBFile[7134]: 57.120297 |0|0|0| D 35 CAL0002: IDBFactory::installPlugin: installed filesystem plugin libcloudio.so
Dec 10 14:46:57 lab-cs01 controllernode[7134]: 57.205932 |0|0|0| D 29 CAL0000: DBRM Controller: Connected to DBRM_Worker1
Dec 10 14:46:57 lab-cs01 controllernode[7134]: 57.206819 |0|0|0| D 29 CAL0000: DBRM Controller: Connected to DBRM_Worker2
Dec 10 14:46:57 lab-cs01 IDBFile[7153]: 57.250964 |0|0|0| D 35 CAL0002: IDBFactory::installPlugin: installed filesystem plugin libcloudio.so
Dec 10 14:46:59 lab-cs01 controllernode[7134]: 59.556260 |0|0|0| C 29 CAL0000: InetStreamSocket::readToMagic(): I/O error1: rc-1; poll signal interrupt ( POLLHUP POLLERR )
Dec 10 14:46:59 lab-cs01 controllernode[7134]: 59.591219 |0|0|0| C 29 CAL0000: InetStreamSocket::readToMagic(): I/O error1: rc-1; poll signal interrupt ( POLLHUP POLLERR )
Dec 10 14:46:59 lab-cs01 controllernode[7134]: 59.637517 |0|0|0| C 29 CAL0000: InetStreamSocket::readToMagic(): I/O error1: rc-1; poll signal interrupt ( POLLHUP POLLERR )
Dec 10 14:46:59 lab-cs01 controllernode[7134]: 59.670380 |0|0|0| C 29 CAL0000: InetStreamSocket::readToMagic(): I/O error1: rc-1; poll signal interrupt ( POLLHUP POLLERR )
Dec 10 14:46:59 lab-cs01 IDBFile[7221]: 59.707603 |0|0|0| D 35 CAL0002: IDBFactory::installPlugin: installed filesystem plugin libcloudio.so
Dec 10 14:46:59 lab-cs01 controllernode[7134]: 59.716160 |0|0|0| C 29 CAL0000: InetStreamSocket::readToMagic(): I/O error1: rc-1; poll signal interrupt ( POLLHUP POLLERR )
Dec 10 14:46:59 lab-cs01 controllernode[7134]: 59.786093 |0|0|0| C 29 CAL0000: InetStreamSocket::readToMagic(): I/O error1: rc-1; poll signal interrupt ( POLLHUP POLLERR )
Dec 10 14:46:59 lab-cs01 controllernode[7134]: 59.815136 |0|0|0| C 29 CAL0000: InetStreamSocket::readToMagic(): I/O error1: rc-1; poll signal interrupt ( POLLHUP POLLERR )
Dec 10 14:46:59 lab-cs01 controllernode[7134]: 59.870182 |0|0|0| C 29 CAL0000: InetStreamSocket::readToMagic(): I/O error1: rc-1; poll signal interrupt ( POLLHUP POLLERR )
Dec 10 14:46:59 lab-cs01 controllernode[7134]: 59.907199 |0|0|0| C 29 CAL0000: InetStreamSocket::readToMagic(): I/O error1: rc-1; poll signal interrupt ( POLLHUP POLLERR )
Dec 10 14:46:59 lab-cs01 controllernode[7134]: 59.921446 |0|0|0| C 29 CAL0000: InetStreamSocket::readToMagic(): I/O error1: rc-1; poll signal interrupt ( POLLHUP POLLERR )
Dec 10 14:47:01 lab-cs01 IDBFile[7326]: 01.773983 |0|0|0| D 35 CAL0002: IDBFactory::installPlugin: installed filesystem plugin libcloudio.so
Dec 10 14:47:01 lab-cs01 DMLProc[7326]: 01.790286 |0|0|0| I 20 CAL0002: DMLProc starts rollbackAll.
Dec 10 14:47:01 lab-cs01 DMLProc[7326]: 01.860506 |0|0|0| I 20 CAL0002: DMLProc will rollback 1 tables.
Dec 10 14:47:01 lab-cs01 DMLProc[7326]: 01.860578 |0|0|0| I 20 CAL0002: DMLProc is rolling back transaction 14
Dec 10 14:47:01 lab-cs01 IDBFile[7349]: 01.900202 |0|0|0| D 35 CAL0002: IDBFactory::installPlugin: installed filesystem plugin libcloudio.so
Dec 10 14:47:02 lab-cs01 StorageManager[6998]: Ownership: taking ownership of data2
Dec 10 14:47:02 lab-cs01 DMLProc[7326]: 02.190167 |0|0|0| I 20 CAL0002: DMLProc rolled back transaction 14 and is releasing table lock id 1
Dec 10 14:47:02 lab-cs01 DMLProc[7326]: 02.218004 |0|0|0| I 20 CAL0002: DMLProc rolled back transaction 14 and table lock id 1 is released.
Dec 10 14:47:02 lab-cs01 DMLProc[7326]: 02.243662 |0|0|0| I 20 CAL0002: DMLProc finished rollbackAll.

###########################################
journalctl -u mariadb-columnstore-cmapi -f
###########################################

Dec 10 14:38:42 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:38:42] cmapi_server DEBUG    put_start starts
Dec 10 14:38:42 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:38:42] cmapi_server DEBUG    put_begin starts
Dec 10 14:38:42 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:38:42] cmapi_server DEBUG    put_begin JSON body {'id': 874308, 'timeout': 299}
Dec 10 14:38:42 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:38:42] cmapi_server DEBUG    put_begin returns {'timestamp': '2020-12-10 14:38:42.998902'}
Dec 10 14:38:42 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020:14:38:42] "PUT /cmapi/0.4.0/node/begin HTTP/1.1" 200 43 "" "python-requests/2.23.0"
Dec 10 14:38:43 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:38:43] cmapi_server DEBUG    put_config starts
Dec 10 14:38:43 lab-cs01 python3[678]: [10/Dec/2020 14:38:43] root   -- Matching against ModuleIPAddr2-1-3, which says 192.168.88.123
Dec 10 14:38:43 lab-cs01 python3[678]: [10/Dec/2020 14:38:43] root  Wrote 'pm2' to /var/lib/columnstore/local/module
Dec 10 14:38:43 lab-cs01 python3[678]: [10/Dec/2020 14:38:43] root  apply Running stop on mcs-dmlproc. With sudo False and is_primary True
Dec 10 14:38:43 lab-cs01 python3[678]: [10/Dec/2020 14:38:43] root  stop running systemctl stop mcs-dmlproc
Dec 10 14:38:43 lab-cs01 python3[678]: [10/Dec/2020 14:38:43] root  apply Running stop on mcs-ddlproc. With sudo False and is_primary True
Dec 10 14:38:43 lab-cs01 python3[678]: [10/Dec/2020 14:38:43] root  stop running systemctl stop mcs-ddlproc
Dec 10 14:38:43 lab-cs01 python3[678]: [10/Dec/2020 14:38:43] root  apply Running stop on mcs-primproc. With sudo False and is_primary True
Dec 10 14:38:43 lab-cs01 python3[678]: [10/Dec/2020 14:38:43] root  stop running systemctl stop mcs-primproc
Dec 10 14:38:43 lab-cs01 python3[678]: [10/Dec/2020 14:38:43] root  apply Running stop on mcs-writeengineserver. With sudo False and is_primary True
Dec 10 14:38:43 lab-cs01 python3[678]: [10/Dec/2020 14:38:43] root  stop running systemctl stop mcs-writeengineserver
Dec 10 14:38:43 lab-cs01 python3[678]: [10/Dec/2020 14:38:43] root  apply Running stop on mcs-exemgr. With sudo False and is_primary True
Dec 10 14:38:43 lab-cs01 python3[678]: [10/Dec/2020 14:38:43] root  stop running systemctl stop mcs-exemgr
Dec 10 14:38:43 lab-cs01 python3[678]: [10/Dec/2020 14:38:43] root  apply Running stop on mcs-controllernode. With sudo False and is_primary True
Dec 10 14:38:43 lab-cs01 python3[678]: [10/Dec/2020 14:38:43] root  stop running systemctl stop mcs-controllernode
Dec 10 14:38:43 lab-cs01 python3[678]: [10/Dec/2020 14:38:43] root  Loaded the config file, my name is 192.168.88.123
Dec 10 14:38:55 lab-cs01 python3[678]: [10/Dec/2020 14:38:55] root  apply Running stop on mcs-workernode. With sudo False and is_primary True
Dec 10 14:38:55 lab-cs01 python3[678]: [10/Dec/2020 14:38:55] root  stop running systemctl stop mcs-workernode@1 mcs-workernode@2
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  broadcast_new_config(): timeout on node 192.168.88.123
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  broadcast_new_config(): timeout on node 192.168.88.125
Dec 10 14:40:43 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:40:43] cmapi_server DEBUG    put_config starts
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root   -- Matching against ModuleIPAddr2-1-3, which says 192.168.88.123
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  Wrote 'pm2' to /var/lib/columnstore/local/module
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  apply Running stop on mcs-dmlproc. With sudo False and is_primary True
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  stop running systemctl stop mcs-dmlproc
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  apply Running stop on mcs-ddlproc. With sudo False and is_primary True
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  stop running systemctl stop mcs-ddlproc
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  apply Running stop on mcs-primproc. With sudo False and is_primary True
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  stop running systemctl stop mcs-primproc
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  apply Running stop on mcs-writeengineserver. With sudo False and is_primary True
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  stop running systemctl stop mcs-writeengineserver
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  apply Running stop on mcs-exemgr. With sudo False and is_primary True
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  stop running systemctl stop mcs-exemgr
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  apply Running stop on mcs-controllernode. With sudo False and is_primary True
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  stop running systemctl stop mcs-controllernode
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  apply Running stop on mcs-workernode. With sudo False and is_primary True
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  stop running systemctl stop mcs-workernode@1 mcs-workernode@2
Dec 10 14:40:43 lab-cs01 python3[678]: [10/Dec/2020 14:40:43] root  Loaded the config file, my name is 192.168.88.123
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  broadcast_new_config(): timeout on node 192.168.88.123
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  broadcast_new_config(): timeout on node 192.168.88.125
Dec 10 14:42:43 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:42:43] cmapi_server DEBUG    put_config starts
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root   -- Matching against ModuleIPAddr2-1-3, which says 192.168.88.123
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  Wrote 'pm2' to /var/lib/columnstore/local/module
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  apply Running stop on mcs-dmlproc. With sudo False and is_primary True
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  stop running systemctl stop mcs-dmlproc
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  apply Running stop on mcs-ddlproc. With sudo False and is_primary True
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  stop running systemctl stop mcs-ddlproc
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  apply Running stop on mcs-primproc. With sudo False and is_primary True
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  stop running systemctl stop mcs-primproc
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  apply Running stop on mcs-writeengineserver. With sudo False and is_primary True
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  stop running systemctl stop mcs-writeengineserver
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  apply Running stop on mcs-exemgr. With sudo False and is_primary True
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  stop running systemctl stop mcs-exemgr
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  apply Running stop on mcs-controllernode. With sudo False and is_primary True
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  stop running systemctl stop mcs-controllernode
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  apply Running stop on mcs-workernode. With sudo False and is_primary True
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  stop running systemctl stop mcs-workernode@1 mcs-workernode@2
Dec 10 14:42:43 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:42:43] cmapi_server DEBUG    get_primary starts
Dec 10 14:42:43 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:42:43] cmapi_server DEBUG    get_primary returns {'is_primary': 'True'}
Dec 10 14:42:43 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020:14:42:43] "GET /cmapi/0.4.0/node/primary HTTP/1.1" 200 22 "" "python-requests/2.23.0"
Dec 10 14:42:43 lab-cs01 python3[678]: [10/Dec/2020 14:42:43] root  Loaded the config file, my name is 192.168.88.123
Dec 10 14:42:45 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020 14:42:45] cmapi_server DEBUG    get_brm_bytes starts
Dec 10 14:42:45 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020 14:42:45] cmapi_server DEBUG    get_brm_bytes returns em from S3.
Dec 10 14:42:45 lab-cs01 su[6626]: (to mysql) root on none
Dec 10 14:42:45 lab-cs01 su[6631]: (to mysql) root on none
Dec 10 14:42:45 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020 14:42:45] cmapi_server DEBUG    get_brm_bytes returns.
Dec 10 14:42:45 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020:14:42:45] "GET /cmapi/0.4.0/node/meta/em HTTP/1.1" 200 3564 "" "python-requests/2.23.0"
Dec 10 14:42:45 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020 14:42:45] cmapi_server DEBUG    get_brm_bytes starts
Dec 10 14:42:45 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020 14:42:45] cmapi_server DEBUG    get_brm_bytes returns journal from S3.
Dec 10 14:42:45 lab-cs01 su[6637]: (to mysql) root on none
Dec 10 14:42:45 lab-cs01 su[6641]: (to mysql) root on none
Dec 10 14:42:45 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020 14:42:45] cmapi_server DEBUG    get_brm_bytes returns.
Dec 10 14:42:45 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020:14:42:45] "GET /cmapi/0.4.0/node/meta/journal HTTP/1.1" 200 9268 "" "python-requests/2.23.0"
Dec 10 14:42:45 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020 14:42:45] cmapi_server DEBUG    get_brm_bytes starts
Dec 10 14:42:45 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020 14:42:45] cmapi_server DEBUG    get_brm_bytes returns vbbm from S3.
Dec 10 14:42:45 lab-cs01 su[6647]: (to mysql) root on none
Dec 10 14:42:46 lab-cs01 su[6651]: (to mysql) root on none
Dec 10 14:42:46 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020 14:42:46] cmapi_server DEBUG    get_brm_bytes returns.
Dec 10 14:42:46 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020:14:42:46] "GET /cmapi/0.4.0/node/meta/vbbm HTTP/1.1" 200 1716 "" "python-requests/2.23.0"
Dec 10 14:42:46 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020 14:42:46] cmapi_server DEBUG    get_brm_bytes starts
Dec 10 14:42:46 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020 14:42:46] cmapi_server DEBUG    get_brm_bytes returns vss from S3.
Dec 10 14:42:46 lab-cs01 su[6657]: (to mysql) root on none
Dec 10 14:42:46 lab-cs01 su[6661]: (to mysql) root on none
Dec 10 14:42:46 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020 14:42:46] cmapi_server DEBUG    get_brm_bytes returns.
Dec 10 14:42:46 lab-cs01 python3[678]: 192.168.88.125 - - [10/Dec/2020:14:42:46] "GET /cmapi/0.4.0/node/meta/vss HTTP/1.1" 200 2528 "" "python-requests/2.23.0"
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  broadcast_new_config(): timeout on node 192.168.88.123
Dec 10 14:44:43 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:44:43] cmapi_server DEBUG    put_config starts
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root   -- Matching against ModuleIPAddr2-1-3, which says 192.168.88.123
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  Wrote 'pm2' to /var/lib/columnstore/local/module
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  apply Running stop on mcs-dmlproc. With sudo False and is_primary True
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  stop running systemctl stop mcs-dmlproc
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  apply Running stop on mcs-ddlproc. With sudo False and is_primary True
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  stop running systemctl stop mcs-ddlproc
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  apply Running stop on mcs-primproc. With sudo False and is_primary True
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  stop running systemctl stop mcs-primproc
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  apply Running stop on mcs-writeengineserver. With sudo False and is_primary True
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  stop running systemctl stop mcs-writeengineserver
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  apply Running stop on mcs-exemgr. With sudo False and is_primary True
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  stop running systemctl stop mcs-exemgr
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  apply Running stop on mcs-controllernode. With sudo False and is_primary True
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  stop running systemctl stop mcs-controllernode
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  apply Running stop on mcs-workernode. With sudo False and is_primary True
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  stop running systemctl stop mcs-workernode@1 mcs-workernode@2
Dec 10 14:44:43 lab-cs01 python3[678]: [10/Dec/2020 14:44:43] root  Loaded the config file, my name is 192.168.88.123
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root  broadcast_new_config(): timeout on node 192.168.88.123
Dec 10 14:46:43 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:46:43] cmapi_server DEBUG    put_config starts
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root   -- Matching against ModuleIPAddr2-1-3, which says 192.168.88.123
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root  Wrote 'pm2' to /var/lib/columnstore/local/module
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root  apply Running stop on mcs-dmlproc. With sudo False and is_primary True
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root  stop running systemctl stop mcs-dmlproc
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root  apply Running stop on mcs-ddlproc. With sudo False and is_primary True
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root  stop running systemctl stop mcs-ddlproc
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root  apply Running stop on mcs-primproc. With sudo False and is_primary True
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root  stop running systemctl stop mcs-primproc
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root  apply Running stop on mcs-writeengineserver. With sudo False and is_primary True
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root  stop running systemctl stop mcs-writeengineserver
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root  apply Running stop on mcs-exemgr. With sudo False and is_primary True
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root  stop running systemctl stop mcs-exemgr
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root  apply Running stop on mcs-controllernode. With sudo False and is_primary True
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root  stop running systemctl stop mcs-controllernode
Dec 10 14:46:43 lab-cs01 python3[678]: [10/Dec/2020 14:46:43] root  apply Running stop on mcs-workernode. With sudo False and is_primary True
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  stop running systemctl stop mcs-workernode@1 mcs-workernode@2
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  Loaded the config file, my name is 192.168.88.123
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  apply Running stop on mcs-storagemanager. With sudo False and is_primary True
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  stop running systemctl stop mcs-storagemanager
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  apply Running stop on mcs-storagemanager. With sudo False and is_primary True
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  stop running systemctl stop mcs-storagemanager
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  apply Running stop on mcs-storagemanager. With sudo False and is_primary True
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  stop running systemctl stop mcs-storagemanager
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  apply Running stop on mcs-storagemanager. With sudo False and is_primary True
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  stop running systemctl stop mcs-storagemanager
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  apply Running stop on mcs-storagemanager. With sudo False and is_primary True
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  stop running systemctl stop mcs-storagemanager
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  apply Running start on mcs-workernode. With sudo False and is_primary True
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  apply Running start on mcs-workernode. With sudo False and is_primary True
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  apply Running start on mcs-workernode. With sudo False and is_primary True
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  apply Running start on mcs-workernode. With sudo False and is_primary True
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  apply Running start on mcs-workernode. With sudo False and is_primary True
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  start running systemctl start mcs-workernode@1.service
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  start running systemctl start mcs-workernode@1.service
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  start running systemctl start mcs-workernode@1.service
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  start running systemctl start mcs-workernode@1.service
Dec 10 14:46:44 lab-cs01 python3[678]: [10/Dec/2020 14:46:44] root  start running systemctl start mcs-workernode@1.service
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:46:46] cmapi_server DEBUG    get_brm_bytes starts
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:46:46] cmapi_server DEBUG    get_brm_bytes returns em from S3.
Dec 10 14:46:46 lab-cs01 su[7012]: (to mysql) root on none
Dec 10 14:46:46 lab-cs01 su[7017]: (to mysql) root on none
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:46:46] cmapi_server DEBUG    get_brm_bytes returns.
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020:14:46:46] "GET /cmapi/0.4.0/node/meta/em HTTP/1.1" 200 3564 "" "python-requests/2.23.0"
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:46:46] cmapi_server DEBUG    get_brm_bytes starts
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:46:46] cmapi_server DEBUG    get_brm_bytes returns journal from S3.
Dec 10 14:46:46 lab-cs01 su[7024]: (to mysql) root on none
Dec 10 14:46:46 lab-cs01 python3[678]: [10/Dec/2020 14:46:46] root  Loaded the config file, my name is 192.168.88.123
Dec 10 14:46:46 lab-cs01 su[7040]: (to mysql) root on none
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:46:46] cmapi_server DEBUG    get_brm_bytes returns.
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020:14:46:46] "GET /cmapi/0.4.0/node/meta/journal HTTP/1.1" 200 9268 "" "python-requests/2.23.0"
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:46:46] cmapi_server DEBUG    get_brm_bytes starts
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:46:46] cmapi_server DEBUG    get_brm_bytes returns vbbm from S3.
Dec 10 14:46:46 lab-cs01 su[7046]: (to mysql) root on none
Dec 10 14:46:46 lab-cs01 su[7050]: (to mysql) root on none
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:46:46] cmapi_server DEBUG    get_brm_bytes returns.
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020:14:46:46] "GET /cmapi/0.4.0/node/meta/vbbm HTTP/1.1" 200 1716 "" "python-requests/2.23.0"
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:46:46] cmapi_server DEBUG    get_brm_bytes starts
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:46:46] cmapi_server DEBUG    get_brm_bytes returns vss from S3.
Dec 10 14:46:46 lab-cs01 su[7056]: (to mysql) root on none
Dec 10 14:46:46 lab-cs01 su[7060]: (to mysql) root on none
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:46:46] cmapi_server DEBUG    get_brm_bytes returns.
Dec 10 14:46:46 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020:14:46:46] "GET /cmapi/0.4.0/node/meta/vss HTTP/1.1" 200 2528 "" "python-requests/2.23.0"
Dec 10 14:46:48 lab-cs01 python3[678]: [10/Dec/2020 14:46:48] root  apply Waiting for all workernodes to come up before starting controllernode on the primary.
Dec 10 14:46:48 lab-cs01 python3[678]: [10/Dec/2020 14:46:48] root  apply Trying...
Dec 10 14:46:48 lab-cs01 python3[678]: [10/Dec/2020 14:46:48] root  apply Waiting for all workernodes to come up before starting controllernode on the primary.
Dec 10 14:46:48 lab-cs01 python3[678]: [10/Dec/2020 14:46:48] root  apply Trying...
Dec 10 14:46:48 lab-cs01 python3[678]: [10/Dec/2020 14:46:48] root  apply Waiting for all workernodes to come up before starting controllernode on the primary.
Dec 10 14:46:48 lab-cs01 python3[678]: [10/Dec/2020 14:46:48] root  apply Trying...
Dec 10 14:46:48 lab-cs01 python3[678]: [10/Dec/2020 14:46:48] root  apply Waiting for all workernodes to come up before starting controllernode on the primary.
Dec 10 14:46:48 lab-cs01 python3[678]: [10/Dec/2020 14:46:48] root  apply Trying...
Dec 10 14:46:48 lab-cs01 python3[678]: [10/Dec/2020 14:46:48] root  apply Waiting for all workernodes to come up before starting controllernode on the primary.
Dec 10 14:46:48 lab-cs01 python3[678]: [10/Dec/2020 14:46:48] root  apply Trying...
Dec 10 14:46:49 lab-cs01 python3[678]: [10/Dec/2020 14:46:49] root  apply Trying...
Dec 10 14:46:49 lab-cs01 python3[678]: [10/Dec/2020 14:46:49] root  apply Running start on mcs-controllernode. With sudo False and is_primary True
Dec 10 14:46:49 lab-cs01 python3[678]: [10/Dec/2020 14:46:49] root  apply Trying...
Dec 10 14:46:49 lab-cs01 python3[678]: [10/Dec/2020 14:46:49] root  apply Running start on mcs-controllernode. With sudo False and is_primary True
Dec 10 14:46:49 lab-cs01 python3[678]: [10/Dec/2020 14:46:49] root  apply Trying...
Dec 10 14:46:49 lab-cs01 python3[678]: [10/Dec/2020 14:46:49] root  apply Running start on mcs-controllernode. With sudo False and is_primary True
Dec 10 14:46:49 lab-cs01 python3[678]: [10/Dec/2020 14:46:49] root  apply Trying...
Dec 10 14:46:49 lab-cs01 python3[678]: [10/Dec/2020 14:46:49] root  apply Running start on mcs-controllernode. With sudo False and is_primary True
Dec 10 14:46:49 lab-cs01 python3[678]: [10/Dec/2020 14:46:49] root  apply Trying...
Dec 10 14:46:49 lab-cs01 python3[678]: [10/Dec/2020 14:46:49] root  apply Running start on mcs-controllernode. With sudo False and is_primary True
Dec 10 14:46:49 lab-cs01 python3[678]: [10/Dec/2020 14:46:49] root  start running systemctl start mcs-controllernode
Dec 10 14:46:49 lab-cs01 python3[678]: [10/Dec/2020 14:46:49] root  start running systemctl start mcs-controllernode
Dec 10 14:46:49 lab-cs01 python3[678]: [10/Dec/2020 14:46:49] root  start running systemctl start mcs-controllernode
Dec 10 14:46:49 lab-cs01 python3[678]: [10/Dec/2020 14:46:49] root  start running systemctl start mcs-controllernode
Dec 10 14:46:49 lab-cs01 python3[678]: [10/Dec/2020 14:46:49] root  start running systemctl start mcs-controllernode
Dec 10 14:46:57 lab-cs01 python3[678]: [10/Dec/2020 14:46:57] root  apply Running start on mcs-primproc. With sudo False and is_primary True
Dec 10 14:46:57 lab-cs01 python3[678]: [10/Dec/2020 14:46:57] root  apply Running start on mcs-primproc. With sudo False and is_primary True
Dec 10 14:46:57 lab-cs01 python3[678]: [10/Dec/2020 14:46:57] root  apply Running start on mcs-primproc. With sudo False and is_primary True
Dec 10 14:46:57 lab-cs01 python3[678]: [10/Dec/2020 14:46:57] root  apply Running start on mcs-primproc. With sudo False and is_primary True
Dec 10 14:46:57 lab-cs01 python3[678]: [10/Dec/2020 14:46:57] root  apply Running start on mcs-primproc. With sudo False and is_primary True
Dec 10 14:46:57 lab-cs01 python3[678]: [10/Dec/2020 14:46:57] root  start running systemctl start mcs-primproc
Dec 10 14:46:57 lab-cs01 python3[678]: [10/Dec/2020 14:46:57] root  start running systemctl start mcs-primproc
Dec 10 14:46:57 lab-cs01 python3[678]: [10/Dec/2020 14:46:57] root  start running systemctl start mcs-primproc
Dec 10 14:46:57 lab-cs01 python3[678]: [10/Dec/2020 14:46:57] root  start running systemctl start mcs-primproc
Dec 10 14:46:57 lab-cs01 python3[678]: [10/Dec/2020 14:46:57] root  start running systemctl start mcs-primproc
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-exemgr. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-exemgr. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-exemgr. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-exemgr. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-exemgr. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-exemgr
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-exemgr
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-exemgr
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-exemgr
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-exemgr
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-writeengineserver. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-writeengineserver. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-writeengineserver. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-writeengineserver. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-writeengineserver. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-writeengineserver
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-writeengineserver
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-writeengineserver
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-writeengineserver
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-writeengineserver
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Trying...
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-dmlproc. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Trying...
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-dmlproc. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Trying...
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-dmlproc. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Trying...
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-dmlproc. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Trying...
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-dmlproc. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-dmlproc
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-dmlproc
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-dmlproc
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-dmlproc
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-dmlproc
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Trying...
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-ddlproc. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Trying...
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-ddlproc. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Trying...
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-ddlproc. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Trying...
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-ddlproc. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Trying...
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  apply Running start on mcs-ddlproc. With sudo False and is_primary True
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-ddlproc
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-ddlproc
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-ddlproc
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-ddlproc
Dec 10 14:46:59 lab-cs01 python3[678]: [10/Dec/2020 14:46:59] root  start running systemctl start mcs-ddlproc
Dec 10 14:46:59 lab-cs01 su[7287]: (to mysql) root on none
Dec 10 14:46:59 lab-cs01 su[7289]: (to mysql) root on none
Dec 10 14:46:59 lab-cs01 su[7290]: (to mysql) root on none
Dec 10 14:46:59 lab-cs01 su[7291]: (to mysql) root on none
Dec 10 14:46:59 lab-cs01 su[7292]: (to mysql) root on none
Dec 10 14:47:00 lab-cs01 python3[678]: Exiting, dbbuilder can only be run on the Active Parent OAM Module
Dec 10 14:47:00 lab-cs01 python3[678]: Exiting, dbbuilder can only be run on the Active Parent OAM Module
Dec 10 14:47:00 lab-cs01 python3[678]: Exiting, dbbuilder can only be run on the Active Parent OAM Module
Dec 10 14:47:00 lab-cs01 python3[678]: Exiting, dbbuilder can only be run on the Active Parent OAM Module
Dec 10 14:47:00 lab-cs01 python3[678]: Exiting, dbbuilder can only be run on the Active Parent OAM Module
Dec 10 14:47:00 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:00] cmapi_server DEBUG    node is not ready to accept queries
Dec 10 14:47:00 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:00] cmapi_server DEBUG    node is not ready to accept queries
Dec 10 14:47:00 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:00] cmapi_server DEBUG    node is not ready to accept queries
Dec 10 14:47:00 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:00] cmapi_server DEBUG    node is not ready to accept queries
Dec 10 14:47:00 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:00] cmapi_server DEBUG    node is not ready to accept queries
Dec 10 14:47:01 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:01] cmapi_server DEBUG    node is not ready to accept queries
Dec 10 14:47:01 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:01] cmapi_server DEBUG    node is not ready to accept queries
Dec 10 14:47:01 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:01] cmapi_server DEBUG    node is not ready to accept queries
Dec 10 14:47:01 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:01] cmapi_server DEBUG    node is not ready to accept queries
Dec 10 14:47:01 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:01] cmapi_server DEBUG    node is not ready to accept queries
Dec 10 14:47:02 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:02] cmapi_server DEBUG    node is ready to accept queries
Dec 10 14:47:02 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020:14:47:02] "PUT /cmapi/0.4.0/node/config HTTP/1.1" 200 43 "" "python-requests/2.23.0"
Dec 10 14:47:02 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:02] cmapi_server DEBUG    node is ready to accept queries
Dec 10 14:47:02 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020:14:47:02] "PUT /cmapi/0.4.0/node/config HTTP/1.1" 200 43 "" "python-requests/2.23.0"
Dec 10 14:47:02 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:02] cmapi_server DEBUG    node is ready to accept queries
Dec 10 14:47:02 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020:14:47:02] "PUT /cmapi/0.4.0/node/config HTTP/1.1" 200 43 "" "python-requests/2.23.0"
Dec 10 14:47:02 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:02] cmapi_server DEBUG    node is ready to accept queries
Dec 10 14:47:02 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020:14:47:02] "PUT /cmapi/0.4.0/node/config HTTP/1.1" 200 43 "" "python-requests/2.23.0"
Dec 10 14:47:02 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:02] cmapi_server DEBUG    node is ready to accept queries
Dec 10 14:47:02 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020:14:47:02] "PUT /cmapi/0.4.0/node/config HTTP/1.1" 200 43 "" "python-requests/2.23.0"
Dec 10 14:47:02 lab-cs01 python3[678]: [10/Dec/2020 14:47:02] root  broadcast_new_config(): successfully pushed the new config file to ['192.168.88.125', '192.168.88.123']
Dec 10 14:47:02 lab-cs01 python3[678]: [10/Dec/2020 14:47:02] root  commit_transaction(): got error during request to 192.168.88.125: 422 Client Error: Unprocessable Entity for url: https://192.168.88.125:8640/cmapi/0.4.0/node/commit
Dec 10 14:47:03 lab-cs01 python3[678]: [10/Dec/2020 14:47:03] root  commit_transaction(): got error during request to 192.168.88.125: 422 Client Error: Unprocessable Entity for url: https://192.168.88.125:8640/cmapi/0.4.0/node/commit
Dec 10 14:47:04 lab-cs01 python3[678]: [10/Dec/2020 14:47:04] root  commit_transaction(): got error during request to 192.168.88.125: 422 Client Error: Unprocessable Entity for url: https://192.168.88.125:8640/cmapi/0.4.0/node/commit
Dec 10 14:47:05 lab-cs01 python3[678]: [10/Dec/2020 14:47:05] root  commit_transaction(): got error during request to 192.168.88.125: 422 Client Error: Unprocessable Entity for url: https://192.168.88.125:8640/cmapi/0.4.0/node/commit
Dec 10 14:47:06 lab-cs01 python3[678]: [10/Dec/2020 14:47:06] root  commit_transaction(): got error during request to 192.168.88.125: 422 Client Error: Unprocessable Entity for url: https://192.168.88.125:8640/cmapi/0.4.0/node/commit
Dec 10 14:47:07 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:07] cmapi_server DEBUG    put_commit starts
Dec 10 14:47:07 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020 14:47:07] cmapi_server DEBUG    put_commit returns {'timestamp': '2020-12-10 14:47:07.470912'}
Dec 10 14:47:07 lab-cs01 python3[678]: 192.168.88.123 - - [10/Dec/2020:14:47:07] "PUT /cmapi/0.4.0/node/commit HTTP/1.1" 200 43 "" "python-requests/2.23.0"
Dec 10 14:47:07 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020 14:47:07] cmapi_server DEBUG    put_start returns {'timestamp': '2020-12-10 14:38:42.990832'}
Dec 10 14:47:07 lab-cs01 python3[678]: 127.0.0.1 - - [10/Dec/2020:14:47:07] "PUT /cmapi/0.4.0/cluster/start HTTP/1.1" 200 43 "" "curl/7.29.0"

7. Insert works

MariaDB [test]> insert into t1 (hostName) values (@@hostname);
Query OK, 1 row affected (0.627 sec)

######################################
/var/log/mariadb/columnstore/debug.log
######################################

Dec 10 14:47:47 lab-cs01 ExeMgr[7197]: 47.198674 |2181038111|0|0| D 16 CAL0041: Start SQL statement: select objectid,columnname from syscolumn where schema='test' and tablename='t1' --columnRIDs/FE; ||
Dec 10 14:47:47 lab-cs01 ExeMgr[7197]: 47.278329 |2181038111|0|0| D 16 CAL0042: End SQL statement
Dec 10 14:47:47 lab-cs01 ExeMgr[7197]: 47.311101 |33554463|0|0| D 16 CAL0041: Start SQL statement: select * from t1; |test|
Dec 10 14:47:47 lab-cs01 ExeMgr[7197]: 47.343943 |33554463|0|0| D 16 CAL0042: End SQL statement
Dec 10 14:47:50 lab-cs01 ExeMgr[7197]: 50.897339 |33554463|0|0| D 16 CAL0041: Start SQL statement: select * from t1; |test|
Dec 10 14:47:50 lab-cs01 ExeMgr[7197]: 50.909521 |33554463|0|0| D 16 CAL0042: End SQL statement
Dec 10 14:47:51 lab-cs01 ExeMgr[7197]: 51.850326 |33554463|0|0| D 16 CAL0041: Start SQL statement: select * from t1; |test|
Dec 10 14:47:51 lab-cs01 ExeMgr[7197]: 51.856756 |33554463|0|0| D 16 CAL0042: End SQL statement
Dec 10 14:47:56 lab-cs01 dmlpackageproc[7326]: 56.084927 |33554463|15|0| D 21 CAL0001: Start SQL statement:  insert into t1 (hostName) values (@@hostname); |test|
Dec 10 14:47:56 lab-cs01 dmlpackageproc[7326]: 56.546193 |33554463|15|0| D 21 CAL0001: End SQL statement
Dec 10 14:47:56 lab-cs01 dmlpackageproc[7326]: 56.557896 |33554463|15|0| D 21 CAL0001: Start SQL statement:  COMMIT
Dec 10 14:47:56 lab-cs01 dmlpackageproc[7326]: 56.630457 |33554463|15|0| D 21 CAL0001: End SQL statement

################
Verified records
################

MariaDB [test]> select * from t1;
+----------+---------------------+
| hostName | dateCreated         |
+----------+---------------------+
| lab-cs02 | 2020-12-10 14:21:03 |
| lab-cs01 | 2020-12-10 14:47:56 |
+----------+---------------------+
2 rows in set (0.073 sec)

######################################
/var/log/mariadb/columnstore/debug.log
######################################

Dec 10 14:48:25 lab-cs01 ExeMgr[7197]: 25.769512 |33554463|0|0| D 16 CAL0041: Start SQL statement: select * from t1; |test|
Dec 10 14:48:25 lab-cs01 ExeMgr[7197]: 25.830707 |33554463|0|0| D 16 CAL0042: End SQL statement



 Comments   
Comment by Jose Rojas (Inactive) [ 2021-04-06 ]

A few issues arise after a primary node failover.

There is a race condition occurring where a non-primary occasionally attempts to pull metadata from primary node before the primary node has started its storagemanger process, creating loadbrm failure.

Another is the plugin code is still connected to old primary node. The changes in the PR also addresses this, destroying old connection and creating new connection to new primary.

There is a third issue w/ primary node failover that happens periodically, and is described in https://jira.mariadb.org/browse/MCOL-4670

Comment by Roman [ 2021-05-05 ]

The issue caused by the fact that the primary might rewrite BRM files whilst CMAPI on the primary serves the files to other non-primary nodes.
I changed the script that is used to put BRM files: ExtenMap, VSS, VBBM into shared memory so that the script doesn't download or change BRM files on primary.

Comment by Daniel Lee (Inactive) [ 2021-05-12 ]

Reopen per my last test result

Comment by Roman [ 2021-05-13 ]

I will break my comment on your results in two parts:
1. It is important to test S3 installation and according with what I saw in the debug.log the cluster doesn't use S3.
2. 401 Error usually tells that either CMAPI auth key is empty or wrong. CMAPI key in a fresh installation is saved into /etc/columnstore/cmapi_server.cnf when one adds a node. Could you double check what is the value of $MCSAPIKEY? /var/log/mariadb/cmapi_server.log might give clues on what is the key value was when you hit 401.
3. That is correct that by default there is no async replication set up b/w cluster nodes that is yet needed to sync MDB nodes across the cluster. So you either need to setup async replication and use maxscale that automaticly detect the cluster master or just ignore this part and re-create the table on a new master with comment='schema_sync_only'.

Comment by Daniel Lee (Inactive) [ 2021-05-14 ]

Regarding #2 above. A change in my automated installation tool did cause the MCSAPIKEY to be different among nodes. I have that issue fixed.

Generated at Thu Feb 08 02:50:22 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.