[MCOL-4301] cmapi: Active nodes should not be removable Created: 2020-09-08  Updated: 2021-02-20  Resolved: 2021-02-20

Status: Closed
Project: MariaDB ColumnStore
Component/s: cmapi
Affects Version/s: 1.0.0
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Daniel Lee (Inactive) Assignee: Unassigned
Resolution: Not a Bug Votes: 0
Labels: None


 Description   

Build tested: 1.5.4-1 (Drone #587), cmapi (Drone #251)

With a 3-node cluster up and running and data spread over all three local dbroots, I was able to remove a node (PM3) along with its dbroot.

A query on an existing table returned an error.

MariaDB [tpch102]> select count(*) from lineitem;
ERROR 1815 (HY000): Internal error: IDB-2039: Data file does not exist, please contact your system administrator for more information.

An active node and its data should not be removable this easily.
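For context, the removal in this test went through the cmapi REST interface. A minimal sketch of building such a request in Python follows; the endpoint path (`/cluster/remove-node`) and body keys (`timeout`, `node`) are assumptions modeled on later cmapi conventions, not verified against cmapi 0.4.0:

```python
import json
import urllib.request


def build_remove_node_request(host, node, api_key):
    """Build (but do not send) the PUT request that would remove a node
    via cmapi. Endpoint path and payload keys are assumed, mirroring the
    add-node/remove-node convention of later cmapi releases."""
    url = f"https://{host}:8640/cmapi/0.4.0/cluster/remove-node"
    body = json.dumps({"timeout": 60, "node": node}).encode()
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json", "x-api-key": api_key},
    )
```

Sending the request with certificate verification disabled would correspond to the `curl -k` invocations shown in the comments below.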



 Comments   
Comment by Daniel Lee (Inactive) [ 2020-09-08 ]

I added the removed node back. The system catalog (the systable and syscolumn tables in calpontsys) is now empty, and querying any previously existing data returns an error:

MariaDB [mytest]> select * from quicktest;
ERROR 1815 (HY000): Internal error: IDB-2006: 'mytest.quicktest' does not exist in Columnstore.
MariaDB [mytest]> exit

Further research showed that the newly added node (PM3) is now the master, and PM1 (previously the master) is now a slave.

[centos7:root~]# curl -k -s https://s2pm1:8640/cmapi/0.4.0/cluster/status \
> --header 'Content-Type:application/json' \
> --header "x-api-key:$MCSAPIKEY" \
> | jq .
{
  "timestamp": "2020-09-08 22:31:39.316098",
  "s2pm1": {
    "timestamp": "2020-09-08 22:31:39.323519",
    "uptime": 14578,
    "dbrm_mode": "slave",
    "cluster_mode": "readonly",
    "dbroots": [],
    "module_id": 1,
    "services": [
      { "name": "workernode", "pid": 23771 },
      { "name": "PrimProc", "pid": 23780 },
      { "name": "ExeMgr", "pid": 23828 },
      { "name": "WriteEngine", "pid": 23837 }
    ]
  },
  "s2pm2": {
    "timestamp": "2020-09-08 22:31:39.384584",
    "uptime": 14722,
    "dbrm_mode": "slave",
    "cluster_mode": "readonly",
    "dbroots": [],
    "module_id": 2,
    "services": [
      { "name": "workernode", "pid": 18699 },
      { "name": "PrimProc", "pid": 18708 },
      { "name": "ExeMgr", "pid": 18757 },
      { "name": "WriteEngine", "pid": 18766 }
    ]
  },
  "s1pm1": {
    "timestamp": "2020-09-08 22:31:39.379456",
    "uptime": 9887,
    "dbrm_mode": "master",
    "cluster_mode": "readwrite",
    "dbroots": [],
    "module_id": 3,
    "services": [
      { "name": "workernode", "pid": 16110 },
      { "name": "controllernode", "pid": 16123 },
      { "name": "PrimProc", "pid": 16134 },
      { "name": "ExeMgr", "pid": 16182 },
      { "name": "WriteEngine", "pid": 16190 },
      { "name": "DMLProc", "pid": 16223 },
      { "name": "DDLProc", "pid": 16224 }
    ]
  },
  "num_nodes": 3
}
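The master/slave roles above can be picked out of the status payload programmatically. A minimal sketch in Python, with field names taken from the cmapi /cluster/status output shown here:

```python
import json


def dbrm_masters(status_json):
    """Return the node names reporting dbrm_mode == "master" in a cmapi
    /cluster/status payload. Non-node keys such as "timestamp" and
    "num_nodes" are skipped by the isinstance check."""
    status = json.loads(status_json)
    return [name for name, info in status.items()
            if isinstance(info, dict) and info.get("dbrm_mode") == "master"]


# Reduced sample mirroring the structure of the status output above.
sample = json.dumps({
    "timestamp": "2020-09-08 22:31:39.316098",
    "s2pm1": {"dbrm_mode": "slave", "cluster_mode": "readonly"},
    "s1pm1": {"dbrm_mode": "master", "cluster_mode": "readwrite"},
    "num_nodes": 2,
})
```

In a healthy cluster exactly one node should appear in the result; the empty or multi-element cases would flag the kind of role confusion reported above.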

Comment by Todd Stoffel (Inactive) [ 2021-02-20 ]

HA requires a shared file system or S3 storage. This requirement is documented.

Generated at Thu Feb 08 02:49:19 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.