Details
Bug
Status: Closed
Major
Resolution: Fixed
1.5.3
None
2020-8
Description
I'm testing multi-node ColumnStore on Ubuntu 20.04 with MariaDB Enterprise Server 10.5.4-2 and ColumnStore 1.5.3.
When I try to add a second node to the cluster, the add-node call fails with an error that does not explain what went wrong:
$ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/add-node \
    --header 'Content-Type:application/json' \
    --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
    --data '{"timeout":20, "node": "mcs2"}' \
    | jq .
{
  "error": "got an error during cluster startup when broadcasting config: (422, None)"
}
However, the node does show up in the cluster status afterwards, despite the error:
$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
    --header 'Content-Type:application/json' \
    --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
    | jq .
{
  "timestamp": "2020-07-30 21:52:59.641434",
  "mcs1": {
    "timestamp": "2020-07-30 21:52:59.646960",
    "uptime": 6140,
    "dbrm_mode": "master",
    "cluster_mode": "readwrite",
    "dbroots": [],
    "module_id": 1,
    "services": [
      {
        "name": "workernode",
        "pid": 9358
      },
      {
        "name": "controllernode",
        "pid": 9378
      },
      {
        "name": "PrimProc",
        "pid": 9396
      },
      {
        "name": "ExeMgr",
        "pid": 9431
      },
      {
        "name": "WriteEngine",
        "pid": 9443
      },
      {
        "name": "DDLProc",
        "pid": 9466
      },
      {
        "name": "DMLProc",
        "pid": 9476
      }
    ]
  },
  "mcs2": {
    "timestamp": "2020-07-30 21:52:59.672210",
    "uptime": 5820,
    "dbrm_mode": "slave",
    "cluster_mode": "readonly",
    "dbroots": [],
    "module_id": 2,
    "services": [
      {
        "name": "workernode",
        "pid": 9706
      },
      {
        "name": "PrimProc",
        "pid": 9718
      },
      {
        "name": "ExeMgr",
        "pid": 9750
      },
      {
        "name": "WriteEngine",
        "pid": 9762
      }
    ]
  }
}
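To compare the two nodes at a glance, the status response can be reduced to one line per node. This is only a sketch: summarize_status is a hypothetical helper, not part of CMAPI, and it assumes the response has the shape shown above (a top-level "timestamp" string plus one object per node). The sample is an abbreviated copy of that response with pids and timestamps left out.

```python
import json


def summarize_status(status):
    """Return {node: (dbrm_mode, cluster_mode, [service names])}.

    Assumes the cluster/status shape shown above: top-level keys are
    node names, except "timestamp", whose value is a plain string.
    """
    summary = {}
    for node, info in status.items():
        if not isinstance(info, dict):
            continue  # skip the top-level "timestamp" entry
        services = [svc["name"] for svc in info.get("services", [])]
        summary[node] = (info.get("dbrm_mode"), info.get("cluster_mode"), services)
    return summary


# Abbreviated copy of the response above (pids, uptimes, timestamps omitted).
sample = json.loads("""
{
  "timestamp": "2020-07-30 21:52:59.641434",
  "mcs1": {
    "dbrm_mode": "master",
    "cluster_mode": "readwrite",
    "services": [
      {"name": "workernode"}, {"name": "controllernode"},
      {"name": "PrimProc"}, {"name": "ExeMgr"}, {"name": "WriteEngine"},
      {"name": "DDLProc"}, {"name": "DMLProc"}
    ]
  },
  "mcs2": {
    "dbrm_mode": "slave",
    "cluster_mode": "readonly",
    "services": [
      {"name": "workernode"}, {"name": "PrimProc"},
      {"name": "ExeMgr"}, {"name": "WriteEngine"}
    ]
  }
}
""")

for node, (dbrm, mode, services) in summarize_status(sample).items():
    print(f"{node}: {dbrm}/{mode}: {', '.join(services)}")
```

For the response above, this lists mcs2 as slave/readonly with workernode, PrimProc, ExeMgr and WriteEngine running, so the node is at least partly configured even though add-node reported a failure.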
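The meaning of the "(422, None)" error message is not clear.
The syslog contains errors that seem to indicate the problem is DNS-related: on mcs2, put_config fails with "socket.gaierror: [Errno -3] Temporary failure in name resolution", and the primary only sees a 500 Internal Server Error when pushing the config to mcs2.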
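One way to test the DNS hypothesis would be to run a small resolution check on each node. The sketch below is an assumption about how one might do that, not something CMAPI provides: check_hosts is a hypothetical helper, and the resolver parameter exists only so the function can be exercised without real DNS. The logs do not say which name fails to resolve on mcs2, so the names to check are a guess based on what appears in the requests and in the pushed config.

```python
import socket


def check_hosts(hosts, resolver=socket.getaddrinfo):
    """Map each host name to None if it resolves, else to the error text.

    resolver defaults to socket.getaddrinfo; it is injectable so the
    function can be tested without touching real DNS.
    """
    results = {}
    for host in hosts:
        try:
            resolver(host, None)
            results[host] = None
        except socket.gaierror as exc:
            # Same exception type as in the mcs2 traceback.
            results[host] = str(exc)
    return results
```

On each node, calling check_hosts(["mcs1", "mcs2", "mcs-ubuntu2004-1"]) and printing the result would show whether any of the names used in the add-node request or in the IPAddr entries of the config fail to resolve there.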
Here's the syslog from the primary (mcs1):
Jul 30 21:32:40 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020:21:32:40] "GET /cmapi/0.4.0/cluster/status HTTP/1.1" 200 460 "" "curl/7.68.0"
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020 21:32:48] cmapi_server DEBUG put_add_node starts
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020 21:32:48] cmapi_server DEBUG put_begin starts
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020 21:32:48] cmapi_server DEBUG put_begin JSON body {'id': 479222, 'timeout': 299}
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020 21:32:48] cmapi_server DEBUG put_begin returns {'timestamp': '2020-07-30 21:32:48.637693'}
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020:21:32:48] "PUT /cmapi/0.4.0/node/begin HTTP/1.1" 200 43 "" "python-requests/2.23.0"
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020:21:32:48] "PUT /cmapi/0.4.0/node/begin HTTP/1.1" 200 43 "" "python-requests/2.23.0"
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root _add_Module_entries(): using ip address 10.10.10.11 and hostname mcs2
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020 21:32:48] cmapi_server DEBUG put_config starts
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020 21:32:48] cmapi_server DEBUG put_config JSON body {'manager': 'mcs-ubuntu2004-1', 'revision': '1', 'timeout': 300, 'config': '<?xml version="1.0" ?>\n<Columnstore Version="V1.0.0">\n <!--\n\tWARNING: Do not make changes to this file unless directed to do so by\n\tMariaDB service engineers. Incorrect settings can render your system\n\tunusable and will require a service call to correct.\n-->\n <ExeMgr1>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8601</Port>\n <Module>unassigned</Module>\n </ExeMgr1>\n <JobProc>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8602</Port>\n </JobProc>\n <ProcMgr>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8603</Port>\n </ProcMgr>\n <ProcMgr_Alarm>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8606</Port>\n </ProcMgr_Alarm>\n <ProcStatusControl>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8604</Port>\n </ProcStatusControl>\n <ProcStatusControlStandby>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8605</Port>\n </ProcStatusControlStandby>\n <!-- Disabled\n\t<ProcHeartbeatControl>\n\t\t<IPAddr>0.0.0.0</IPAddr>\n\t\t<Port>8605</Port>\n\t</ProcHeartbeatControl>\n\t-->\n <!-- ProcessMonitor Port: 8800 - 8820 is reserved to support External Modules-->\n <localhost_ProcessMonitor>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8800</Port>\n </localhost_ProcessMonitor>\n <dm1_ProcessMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8800</Port>\n </dm1_ProcessMonitor>\n <um1_ProcessMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8800</Port>\n </um1_ProcessMonitor>\n <pm1_ProcessMonitor>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8800</Port>\n </pm1_ProcessMonitor>\n <dm1_ServerMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8622</Port>\n </dm1_ServerMonitor>\n <um1_ServerMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8622</Port>\n </um1_ServerMonitor>\n <pm1_ServerMonitor>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8622</Port>\n </pm1_ServerMonitor>\n <DDLProc>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n 
<Port>8612</Port>\n </DDLProc>\n <DMLProc>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8614</Port>\n </DMLProc>\n <BatchInsert>\n <RowsPerBatch>10000</RowsPerBatch>\n </BatchInsert>\n <PrimitiveServers>\n <Count>2</Count>\n <ConnectionsPerPrimProc>2</ConnectionsPerPrimProc>\n <ProcessorThreshold>128</ProcessorThreshold>\n <ProcessorQueueSize>10K</ProcessorQueueSize>\n <!-- minimum of extent size 8192 -->\n <DebugLevel>0</DebugLevel>\n <LBID_Shift>13</LBID_Shift>\n <ColScanBufferSizeBlocks>512</ColScanBufferSizeBlocks>\n <ColScanReadAheadBlocks>512</ColScanReadAheadBlocks>\n <!-- s/b factor of extent size 8192 -->\n <!-- <BPPCount>16</BPPCount> -->\n <!-- Default num cores * 2. A cap on the number of simultaneous primitives per jobstep -->\n <PrefetchThreshold>1</PrefetchThreshold>\n <PTTrace>0</PTTrace>\n <RotatingDestination>n</RotatingDestination>\n <!-- Iterate thru UM ports; set to \'n\' if UM/PM on same server -->\n <!-- <HighPriorityPercentage>60</HighPriorityPercentage> -->\n <!-- <MediumPriorityPercentage>30</MediumPriorityPercentage> -->\n <!-- <LowPriorityPercentage>10</LowPriorityPercentage> -->\n <DirectIO>y</DirectIO>\n <HighPriorityPercentage/>\n <MediumPriorityPercentage/>\n <LowPriorityPercentage/>\n </PrimitiveServers>\n <SystemConfig>\n <SystemName>columnstore-1</SystemName>\n <ParentOAMModuleName>pm1</ParentOAMModuleName>\n <StandbyOAMModuleName>unassigned</StandbyOAMModuleName>\n <PrimaryUMModuleName>pm1</PrimaryUMModuleName>\n <ModuleHeartbeatPeriod>1</ModuleHeartbeatPeriod>\n <ModuleHeartbeatCount>3</ModuleHeartbeatCount>\n <ModuleProcMonWaitCount>12</ModuleProcMonWaitCount>\n \t// 2.5 minutes\n <!-- Disabled\n\t\t<ProcessHeartbeatPeriod>-1</ProcessHeartbeatPeriod>\n\t\t-->\n <!-- Warning: Do not change this value once database is built -->\n <DBRootCount>2</DBRootCount>\n <DBRoot1>/var/lib/columnstore/data1</DBRoot1>\n <DBRMRoot>/var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves</DBRMRoot>\n 
<TableLockSaveFile>/var/lib/columnstore/data1/systemFiles/dbrm/tablelocks</TableLockSaveFile>\n <DBRMTimeOut>15</DBRMTimeOut>\n <!-- in seconds -->\n <DBRMSnapshotInterval>100000</DBRMSnapshotInterval>\n <ExternalCriticalThreshold>90</ExternalCriticalThreshold>\n <ExternalMajorThreshold>80</ExternalMajorThreshold>\n <ExternalMinorThreshold>70</ExternalMinorThreshold>\n <!-- <TempDiskPath>/tmp</TempDiskPath>\n\t\t<WorkingDir>/tmp</WorkingDir>\n\t\t<TempFileDir>/tmp/columnstore_tmp_files</TempFileDir>\n\t\t-->\n <TransactionArchivePeriod>10</TransactionArchivePeriod>\n <NMSIPAddress>0.0.0.0</NMSIPAddress>\n <TempSaveSize>128M</TempSaveSize>\n <!-- default SWSDL max element save size -->\n <WaitPeriod>10</WaitPeriod>\n <!-- in seconds -->\n <ProcessRestartCount>10</ProcessRestartCount>\n <ProcessRestartPeriod>120</ProcessRestartPeriod>\n <SwapAction>restartSystem</SwapAction>\n <!-- OAM command (or \'none\') to run when swap space exceeds Major Threshold -->\n <ActivePmFailoverDisabled>n</ActivePmFailoverDisabled>\n <MemoryCheckPercent>95</MemoryCheckPercent>\n <!-- Max real memory to limit growth of buffers to -->\n <DataFileLog>OFF</DataFileLog>\n <!-- enable if you want to limit how much memory may be used for hdfs read/write memory buffers.\n \t\t<hdfsRdwrBufferMaxSize>8G</hdfsRdwrBufferMaxSize>\n\t\t-->\n <hdfsRdwrScratch>/rdwrscratch</hdfsRdwrScratch>\n <!-- Do not set to an hdfs file path -->\n <TempFileDir>/columnstore_tmp_files</TempFileDir>\n <SystemTempFileDir>/tmp/columnstore_tmp_files</SystemTempFileDir>\n <DBRoot2>/var/lib/columnstore/data2</DBRoot2>\n </SystemConfig>\n <SystemModuleConfig>\n <ModuleType1>dm</ModuleType1>\n <ModuleDesc1>Director Module</ModuleDesc1>\n <RunType1>SIMPLEX</RunType1>\n <ModuleCount1>0</ModuleCount1>\n <ModuleIPAddr1-1-1>0.0.0.0</ModuleIPAddr1-1-1>\n <ModuleHostName1-1-1>unassigned</ModuleHostName1-1-1>\n <ModuleDisableState1-1>ENABLED</ModuleDisableState1-1>\n <ModuleCPUCriticalThreshold1>0</ModuleCPUCriticalThreshold1>\n 
<ModuleCPUMajorThreshold1>0</ModuleCPUMajorThreshold1>\n <ModuleCPUMinorThreshold1>0</ModuleCPUMinorThreshold1>\n <ModuleCPUMinorClearThreshold1>0</ModuleCPUMinorClearThreshold1>\n <ModuleDiskCriticalThreshold1>90</ModuleDiskCriticalThreshold1>\n <ModuleDiskMajorThreshold1>80</ModuleDiskMajorThreshold1>\n <ModuleDiskMinorThreshold1>70</ModuleDiskMinorThreshold1>\n <ModuleMemCriticalThreshold1>90</ModuleMemCriticalThreshold1>\n <ModuleMemMajorThreshold1>0</ModuleMemMajorThreshold1>\n <ModuleMemMinorThreshold1>0</ModuleMemMinorThreshold1>\n <ModuleSwapCriticalThreshold1>90</ModuleSwapCriticalThreshold1>\n <ModuleSwapMajorThreshold1>80</ModuleSwapMajorThreshold1>\n <ModuleSwapMinorThreshold1>70</ModuleSwapMinorThreshold1>\n <ModuleDiskMonitorFileSystem1-1>/</ModuleDiskMonitorFileSystem1-1>\n <ModuleDBRootCount1-1>unassigned</ModuleDBRootCount1-1>\n <ModuleDBRootID1-1-1>unassigned</ModuleDBRootID1-1-1>\n <ModuleType2>um</ModuleType2>\n <ModuleDesc2>User Module</ModuleDesc2>\n <RunType2>SIMPLEX</RunType2>\n <ModuleCount2>0</ModuleCount2>\n <ModuleIPAddr1-1-2>0.0.0.0</ModuleIPAddr1-1-2>\n <ModuleHostName1-1-2>unassigned</ModuleHostName1-1-2>\n <ModuleDisa
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root -- Matching against ModuleIPAddr1-1-3, which says 10.10.10.10
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root Wrote 'pm1' to /var/lib/columnstore/local/module
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root apply Running stop on mcs-dmlproc. With sudo False.
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root stop running systemctl stop mcs-dmlproc
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: Stopping mcs-dmlproc...
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: mcs-dmlproc.service: Succeeded.
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: Stopped mcs-dmlproc.
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root apply Running stop on mcs-ddlproc. With sudo False.
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root stop running systemctl stop mcs-ddlproc
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: Stopping mcs-ddlproc...
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: mcs-ddlproc.service: Succeeded.
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: Stopped mcs-ddlproc.
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root apply Running stop on mcs-primproc. With sudo False.
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root stop running systemctl stop mcs-primproc
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: Stopping WriteEngineServer...
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: mcs-writeengineserver.service: Succeeded.
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: Stopped WriteEngineServer.
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: Stopping mcs-exemgr...
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: mcs-exemgr.service: Succeeded.
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: Stopped mcs-exemgr.
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: Stopping mcs-primproc...
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: mcs-primproc.service: Succeeded.
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: Stopped mcs-primproc.
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root apply Running stop on mcs-writeengineserver. With sudo False.
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root stop running systemctl stop mcs-writeengineserver
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root apply Running stop on mcs-exemgr. With sudo False.
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root stop running systemctl stop mcs-exemgr
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root apply Running stop on mcs-controllernode. With sudo False.
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root stop running systemctl stop mcs-controllernode
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: Stopping mcs-controllernode...
Jul 30 21:32:48 mcs-ubuntu2004-1 controllernode[9137]: DBRM Controller: Waiting for threads to finish...
Jul 30 21:32:48 mcs-ubuntu2004-1 controllernode[9137]: Exiting...
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: mcs-controllernode.service: Succeeded.
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: Stopped mcs-controllernode.
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root apply Running stop on mcs-workernode. With sudo False.
Jul 30 21:32:48 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:48] root stop running systemctl stop mcs-workernode
Jul 30 21:32:48 mcs-ubuntu2004-1 systemd[1]: Stopping mcs-workernode...
Jul 30 21:32:48 mcs-ubuntu2004-1 save_brm[9329]: Saved to /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves
Jul 30 21:32:49 mcs-ubuntu2004-1 systemd[1]: mcs-workernode.service: Succeeded.
Jul 30 21:32:49 mcs-ubuntu2004-1 systemd[1]: Stopped mcs-workernode.
Jul 30 21:32:49 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:49] root apply Running stop on mcs-storagemanager. With sudo False.
Jul 30 21:32:49 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:49] root stop running systemctl stop mcs-storagemanager
Jul 30 21:32:49 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:49] root apply Running start on mcs-workernode. With sudo False.
Jul 30 21:32:49 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:49] root start running systemctl start mcs-workernode
Jul 30 21:32:49 mcs-ubuntu2004-1 systemd[1]: Starting loadbrm...
Jul 30 21:32:49 mcs-ubuntu2004-1 python3[8333]: 127.0.0.1 - - [30/Jul/2020:21:32:49] "GET /cmapi/0.4.0/node/meta/em HTTP/1.1" 200 3436 "" "python-requests/2.22.0"
Jul 30 21:32:49 mcs-ubuntu2004-1 python3[8333]: 127.0.0.1 - - [30/Jul/2020:21:32:49] "GET /cmapi/0.4.0/node/meta/journal HTTP/1.1" 200 - "" "python-requests/2.22.0"
Jul 30 21:32:49 mcs-ubuntu2004-1 python3[8333]: 127.0.0.1 - - [30/Jul/2020:21:32:49] "GET /cmapi/0.4.0/node/meta/vbbm HTTP/1.1" 200 12 "" "python-requests/2.22.0"
Jul 30 21:32:49 mcs-ubuntu2004-1 python3[8333]: 127.0.0.1 - - [30/Jul/2020:21:32:49] "GET /cmapi/0.4.0/node/meta/vss HTTP/1.1" 200 8 "" "python-requests/2.22.0"
Jul 30 21:32:49 mcs-ubuntu2004-1 env[9353]: OK.
Jul 30 21:32:49 mcs-ubuntu2004-1 env[9353]: Successfully loaded BRM snapshot
Jul 30 21:32:49 mcs-ubuntu2004-1 env[9353]: Successfully replayed 0 BRM transactions
Jul 30 21:32:49 mcs-ubuntu2004-1 env[9350]: Pulling em from the primary node.
Jul 30 21:32:49 mcs-ubuntu2004-1 env[9350]: Saving em to /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em
Jul 30 21:32:49 mcs-ubuntu2004-1 env[9350]: Pulling journal from the primary node.
Jul 30 21:32:49 mcs-ubuntu2004-1 env[9350]: Saving journal to /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_journal
Jul 30 21:32:49 mcs-ubuntu2004-1 env[9350]: Pulling vbbm from the primary node.
Jul 30 21:32:49 mcs-ubuntu2004-1 env[9350]: Saving vbbm to /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm
Jul 30 21:32:49 mcs-ubuntu2004-1 env[9350]: Pulling vss from the primary node.
Jul 30 21:32:49 mcs-ubuntu2004-1 env[9350]: Saving vss to /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vs
Jul 30 21:32:51 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:51] root broadcast_new_config(): got an error pushing config file to mcs2: 500 Server Error: Internal Server Error for url: https://mcs2:8640/cmapi/0.4.0/node/config
Jul 30 21:32:51 mcs-ubuntu2004-1 systemd[1]: mcs-loadbrm.service: Succeeded.
Jul 30 21:32:51 mcs-ubuntu2004-1 systemd[1]: Finished loadbrm.
Jul 30 21:32:51 mcs-ubuntu2004-1 systemd[1]: Started mcs-workernode.
Jul 30 21:32:51 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:51] root apply Waiting for all workernodes to come up before starting controllernode on the primary.
Jul 30 21:32:51 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:51] root apply Trying...
Jul 30 21:32:51 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:51] root apply Trying...
Jul 30 21:32:54 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:54] root broadcast_new_config(): got an error pushing config file to mcs2: 500 Server Error: Internal Server Error for url: https://mcs2:8640/cmapi/0.4.0/node/config
Jul 30 21:32:57 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:57] root broadcast_new_config(): got an error pushing config file to mcs2: 500 Server Error: Internal Server Error for url: https://mcs2:8640/cmapi/0.4.0/node/config
Jul 30 21:32:59 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:32:59] root broadcast_new_config(): got an error pushing config file to mcs2: 500 Server Error: Internal Server Error for url: https://mcs2:8640/cmapi/0.4.0/node/config
Jul 30 21:33:01 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:01] root apply Trying...
Jul 30 21:33:01 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:01] root apply Running start on mcs-controllernode. With sudo False.
Jul 30 21:33:01 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:01] root start running systemctl start mcs-controllernode
Jul 30 21:33:01 mcs-ubuntu2004-1 systemd[1]: Started mcs-controllernode.
Jul 30 21:33:01 mcs-ubuntu2004-1 IDBFile[9378]: 01.594523 |0|0|0| D 35 CAL0002: Failed to open file: /var/lib/columnstore/data1/systemFiles/dbrm/tablelocks, exception: unable to open Buffered file
Jul 30 21:33:01 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:01] root apply Running start on mcs-primproc. With sudo False.
Jul 30 21:33:01 mcs-ubuntu2004-1 controllernode[9378]: 01.595079 |0|0|0| D 29 CAL0000: TableLockServer::load(): could not open the save file/var/lib/columnstore/data1/systemFiles/dbrm/tablelocks
Jul 30 21:33:01 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:01] root start running systemctl start mcs-primproc
Jul 30 21:33:01 mcs-ubuntu2004-1 systemd[1]: Starting mcs-primproc...
Jul 30 21:33:01 mcs-ubuntu2004-1 env[9396]: Starting PrimitiveServer: st = 1, sq = 10, pw = 128, pq = 10240, nb = 62789, nt = 4, nc = 1, ra = 512, db = 128, mb = 512, rd = 0, tr = 0, ss = 67108864, bp = 4
Jul 30 21:33:02 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:02] root broadcast_new_config(): got an error pushing config file to mcs2: 500 Server Error: Internal Server Error for url: https://mcs2:8640/cmapi/0.4.0/node/config
Jul 30 21:33:03 mcs-ubuntu2004-1 systemd[1]: Started mcs-primproc.
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:03] root apply Running start on mcs-exemgr. With sudo False.
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:03] root start running systemctl start mcs-exemgr
Jul 30 21:33:03 mcs-ubuntu2004-1 systemd[1]: Started mcs-exemgr.
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:03] root apply Running start on mcs-writeengineserver. With sudo False.
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:03] root start running systemctl start mcs-writeengineserver
Jul 30 21:33:03 mcs-ubuntu2004-1 systemd[1]: Started WriteEngineServer.
Jul 30 21:33:03 mcs-ubuntu2004-1 env[9431]: Starting ExeMgr: st = 50, qs = 20, mx = 95, cf = /etc/columnstore/Columnstore.xml
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:03] root apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:03] root apply Trying...
Jul 30 21:33:03 mcs-ubuntu2004-1 controllernode[9378]: 03.697082 |0|0|0| C 29 CAL0000: InetStreamSocket::readToMagic(): I/O error1: rc-1; poll signal interrupt ( POLLHUP POLLERR )
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:03] root apply Running start on mcs-ddlproc. With sudo False.
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:03] root start running systemctl start mcs-ddlproc
Jul 30 21:33:03 mcs-ubuntu2004-1 systemd[1]: Started mcs-ddlproc.
Jul 30 21:33:03 mcs-ubuntu2004-1 env[9443]: WriteEngineServer is ready
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:03] root apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:03] root apply Trying...
Jul 30 21:33:03 mcs-ubuntu2004-1 controllernode[9378]: 03.725450 |0|0|0| C 29 CAL0000: InetStreamSocket::readToMagic(): I/O error1: rc-1; poll signal interrupt ( POLLHUP POLLERR )
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:03] root apply Running start on mcs-dmlproc. With sudo False.
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:03] root start running systemctl start mcs-dmlproc
Jul 30 21:33:03 mcs-ubuntu2004-1 systemd[1]: Started mcs-dmlproc.
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020:21:33:03] "PUT /cmapi/0.4.0/node/config HTTP/1.1" 200 43 "" "python-requests/2.23.0"
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:03] root broadcast_new_config(): successfully pushed the new config file to ['mcs1']
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:03] root broadcast_new_config(): failed to push the new config file to ['mcs2']
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020:21:33:03] "PUT /cmapi/0.4.0/node/rollback HTTP/1.1" 200 43 "" "python-requests/2.23.0"
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020 21:33:03] cmapi_server ERROR put_add_node got an error during cluster startup when broadcasting config
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020:21:33:03] "PUT /cmapi/0.4.0/node/rollback HTTP/1.1" 422 38 "" "python-requests/2.23.0"
Jul 30 21:33:03 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:03] root rollback_txn_attempt(): got error during request to mcs1: 422 Client Error: Unprocessable Entity for url: https://mcs1:8640/cmapi/0.4.0/node/rollback
Jul 30 21:33:04 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020:21:33:04] "PUT /cmapi/0.4.0/node/rollback HTTP/1.1" 422 38 "" "python-requests/2.23.0"
Jul 30 21:33:04 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:04] root rollback_txn_attempt(): got error during request to mcs1: 422 Client Error: Unprocessable Entity for url: https://mcs1:8640/cmapi/0.4.0/node/rollback
Jul 30 21:33:05 mcs-ubuntu2004-1 env[9466]: DDLProc is ready...
Jul 30 21:33:05 mcs-ubuntu2004-1 DMLProc[9476]: 05.762290 |0|0|0| I 20 CAL0002: DMLProc starts rollbackAll.
Jul 30 21:33:05 mcs-ubuntu2004-1 DMLProc[9476]: 05.787035 |0|0|0| I 20 CAL0002: DMLProc will rollback 0 tables.
Jul 30 21:33:05 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020:21:33:05] "PUT /cmapi/0.4.0/node/rollback HTTP/1.1" 422 38 "" "python-requests/2.23.0"
Jul 30 21:33:05 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:05] root rollback_txn_attempt(): got error during request to mcs1: 422 Client Error: Unprocessable Entity for url: https://mcs1:8640/cmapi/0.4.0/node/rollback
Jul 30 21:33:05 mcs-ubuntu2004-1 DMLProc[9476]: 05.811843 |0|0|0| I 20 CAL0002: DMLProc finished rollbackAll.
Jul 30 21:33:05 mcs-ubuntu2004-1 env[9476]: DMLProc is ready...
Jul 30 21:33:06 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020:21:33:06] "PUT /cmapi/0.4.0/node/rollback HTTP/1.1" 422 38 "" "python-requests/2.23.0"
Jul 30 21:33:06 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:06] root rollback_txn_attempt(): got error during request to mcs1: 422 Client Error: Unprocessable Entity for url: https://mcs1:8640/cmapi/0.4.0/node/rollback
Jul 30 21:33:07 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020:21:33:07] "PUT /cmapi/0.4.0/node/rollback HTTP/1.1" 422 38 "" "python-requests/2.23.0"
Jul 30 21:33:07 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:07] root rollback_txn_attempt(): got error during request to mcs1: 422 Client Error: Unprocessable Entity for url: https://mcs1:8640/cmapi/0.4.0/node/rollback
Jul 30 21:33:08 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:08] root rollback_txn_attempt(): got error during request to mcs2: 422 Client Error: Unprocessable Entity for url: https://mcs2:8640/cmapi/0.4.0/node/rollback
Jul 30 21:33:09 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:09] root rollback_txn_attempt(): got error during request to mcs2: 422 Client Error: Unprocessable Entity for url: https://mcs2:8640/cmapi/0.4.0/node/rollback
Jul 30 21:33:10 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:10] root rollback_txn_attempt(): got error during request to mcs2: 422 Client Error: Unprocessable Entity for url: https://mcs2:8640/cmapi/0.4.0/node/rollback
Jul 30 21:33:11 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:11] root rollback_txn_attempt(): got error during request to mcs2: 422 Client Error: Unprocessable Entity for url: https://mcs2:8640/cmapi/0.4.0/node/rollback
Jul 30 21:33:12 mcs-ubuntu2004-1 python3[8333]: [30/Jul/2020 21:33:12] root rollback_txn_attempt(): got error during request to mcs2: 422 Client Error: Unprocessable Entity for url: https://mcs2:8640/cmapi/0.4.0/node/rollback
Jul 30 21:33:13 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020 21:33:13] cmapi_server ERROR put_add_node got an error during cluster startup when broadcasting config: (422, None)
Jul 30 21:33:13 mcs-ubuntu2004-1 python3[8333]: 10.10.10.10 - - [30/Jul/2020:21:33:13] "PUT /cmapi/0.4.0/cluster/add-node HTTP/1.1" 422 86 "" "curl/7.68.0"
Here's the syslog from the replica (mcs2):
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root apply Trying...
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020:21:32:54] HTTP
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: Traceback (most recent call last):
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/_cprequest.py", line 638, in respond
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: self._do_respond(path_info)
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/_cprequest.py", line 697, in _do_respond
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: response.body = self.handler()
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/lib/encoding.py", line 219, in __call__
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: self.body = self.oldhandler(*args, **kwargs)
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/lib/jsontools.py", line 59, in json_handler
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: value = cherrypy.serving.request._json_inner_handler(*args, **kwargs)
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/_cpdispatch.py", line 54, in __call__
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: return self.callable(*self.args, **self.kwargs)
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/cmapi_server/controllers/endpoints.py", line 282, in put_config
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: msgs = list(os_operations.apply(actions, **kwargs))
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/mcs_node_control/models/os_operations.py", line 99, in apply
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: int(controllernode['Port'])))
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: socket.gaierror: [Errno -3] Temporary failure in name resolution
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020:21:32:54] HTTP
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: Request Headers:
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: Remote-Addr: 10.10.10.10
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: HOST: mcs2:8640
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: USER-AGENT: python-requests/2.23.0
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: ACCEPT-ENCODING: gzip, deflate
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: ACCEPT: */*
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: CONNECTION: keep-alive
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: X-API-KEY: '93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd'
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: Content-Type: application/json
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: Content-Length: 20080
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020:21:32:54] "PUT /cmapi/0.4.0/node/config HTTP/1.1" 500 513 "" "python-requests/2.23.0"
Jul 30 21:32:54 mcs-ubuntu2004-2 joblist[9350]: 54.325394 |0|0|0| W 05 CAL0000: /home/jenkins/workspace/MariaDBE-Custom-DEB/label/ubuntu-2004/MariaDBEnterprise/storage/columnstore/columnstore/dbcon/joblist/distributedenginecomm.cpp @ 299 Could not connect to PMS3: Connection refused
Jul 30 21:32:54 mcs-ubuntu2004-2 env[9350]: Could not connect to PMS3: Connection refused
Jul 30 21:32:54 mcs-ubuntu2004-2 env[9350]: Starting ExeMgr: st = 50, qs = 20, mx = 95, cf = /etc/columnstore/Columnstore.xml
Jul 30 21:32:54 mcs-ubuntu2004-2 messagequeue[9350]: 54.328247 |0|0|0| E 31 CAL0000: MessageQueueClient::setup(): Temporary failure in name resolution
Jul 30 21:32:54 mcs-ubuntu2004-2 env[9350]: DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Address family not supported by protocol to: InetStreamSocket: sd: 12 inet: 225.127.0.0 port: 45290
Jul 30 21:32:54 mcs-ubuntu2004-2 messagequeue[9350]: 54.329012 |0|0|0| E 31 CAL0000: MessageQueueClient::setup(): Temporary failure in name resolution
Jul 30 21:32:54 mcs-ubuntu2004-2 env[9350]: DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Address family not supported by protocol to: InetStreamSocket: sd: 12 inet: 225.127.0.0 port: 45290
Jul 30 21:32:54 mcs-ubuntu2004-2 controllernode[9350]: 54.329373 |0|0|0| E 29 CAL0000: DBRM: error: SessionManager::setSystemState() failed (network)
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020 21:32:54] cmapi_server DEBUG put_config starts
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020 21:32:54] cmapi_server DEBUG put_config JSON body {'manager': 'mcs-ubuntu2004-1', 'revision': '1', 'timeout': 300, 'config': '<?xml version="1.0" ?>\n<Columnstore Version="V1.0.0">\n <!--\n\tWARNING: Do not make changes to this file unless directed to do so by\n\tMariaDB service engineers. Incorrect settings can render your system\n\tunusable and will require a service call to correct.\n-->\n <ExeMgr1>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8601</Port>\n <Module>unassigned</Module>\n </ExeMgr1>\n <JobProc>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8602</Port>\n </JobProc>\n <ProcMgr>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8603</Port>\n </ProcMgr>\n <ProcMgr_Alarm>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8606</Port>\n </ProcMgr_Alarm>\n <ProcStatusControl>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8604</Port>\n </ProcStatusControl>\n <ProcStatusControlStandby>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8605</Port>\n </ProcStatusControlStandby>\n <!-- Disabled\n\t<ProcHeartbeatControl>\n\t\t<IPAddr>0.0.0.0</IPAddr>\n\t\t<Port>8605</Port>\n\t</ProcHeartbeatControl>\n\t-->\n <!-- ProcessMonitor Port: 8800 - 8820 is reserved to support External Modules-->\n <localhost_ProcessMonitor>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8800</Port>\n </localhost_ProcessMonitor>\n <dm1_ProcessMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8800</Port>\n </dm1_ProcessMonitor>\n <um1_ProcessMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8800</Port>\n </um1_ProcessMonitor>\n <pm1_ProcessMonitor>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8800</Port>\n </pm1_ProcessMonitor>\n <dm1_ServerMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8622</Port>\n </dm1_ServerMonitor>\n <um1_ServerMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8622</Port>\n </um1_ServerMonitor>\n <pm1_ServerMonitor>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8622</Port>\n </pm1_ServerMonitor>\n <DDLProc>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n 
<Port>8612</Port>\n </DDLProc>\n <DMLProc>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8614</Port>\n </DMLProc>\n <BatchInsert>\n <RowsPerBatch>10000</RowsPerBatch>\n </BatchInsert>\n <PrimitiveServers>\n <Count>2</Count>\n <ConnectionsPerPrimProc>2</ConnectionsPerPrimProc>\n <ProcessorThreshold>128</ProcessorThreshold>\n <ProcessorQueueSize>10K</ProcessorQueueSize>\n <!-- minimum of extent size 8192 -->\n <DebugLevel>0</DebugLevel>\n <LBID_Shift>13</LBID_Shift>\n <ColScanBufferSizeBlocks>512</ColScanBufferSizeBlocks>\n <ColScanReadAheadBlocks>512</ColScanReadAheadBlocks>\n <!-- s/b factor of extent size 8192 -->\n <!-- <BPPCount>16</BPPCount> -->\n <!-- Default num cores * 2. A cap on the number of simultaneous primitives per jobstep -->\n <PrefetchThreshold>1</PrefetchThreshold>\n <PTTrace>0</PTTrace>\n <RotatingDestination>n</RotatingDestination>\n <!-- Iterate thru UM ports; set to \'n\' if UM/PM on same server -->\n <!-- <HighPriorityPercentage>60</HighPriorityPercentage> -->\n <!-- <MediumPriorityPercentage>30</MediumPriorityPercentage> -->\n <!-- <LowPriorityPercentage>10</LowPriorityPercentage> -->\n <DirectIO>y</DirectIO>\n <HighPriorityPercentage/>\n <MediumPriorityPercentage/>\n <LowPriorityPercentage/>\n </PrimitiveServers>\n <SystemConfig>\n <SystemName>columnstore-1</SystemName>\n <ParentOAMModuleName>pm1</ParentOAMModuleName>\n <StandbyOAMModuleName>unassigned</StandbyOAMModuleName>\n <PrimaryUMModuleName>pm1</PrimaryUMModuleName>\n <ModuleHeartbeatPeriod>1</ModuleHeartbeatPeriod>\n <ModuleHeartbeatCount>3</ModuleHeartbeatCount>\n <ModuleProcMonWaitCount>12</ModuleProcMonWaitCount>\n \t// 2.5 minutes\n <!-- Disabled\n\t\t<ProcessHeartbeatPeriod>-1</ProcessHeartbeatPeriod>\n\t\t-->\n <!-- Warning: Do not change this value once database is built -->\n <DBRootCount>2</DBRootCount>\n <DBRoot1>/var/lib/columnstore/data1</DBRoot1>\n <DBRMRoot>/var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves</DBRMRoot>\n 
<TableLockSaveFile>/var/lib/columnstore/data1/systemFiles/dbrm/tablelocks</TableLockSaveFile>\n <DBRMTimeOut>15</DBRMTimeOut>\n <!-- in seconds -->\n <DBRMSnapshotInterval>100000</DBRMSnapshotInterval>\n <ExternalCriticalThreshold>90</ExternalCriticalThreshold>\n <ExternalMajorThreshold>80</ExternalMajorThreshold>\n <ExternalMinorThreshold>70</ExternalMinorThreshold>\n <!-- <TempDiskPath>/tmp</TempDiskPath>\n\t\t<WorkingDir>/tmp</WorkingDir>\n\t\t<TempFileDir>/tmp/columnstore_tmp_files</TempFileDir>\n\t\t-->\n <TransactionArchivePeriod>10</TransactionArchivePeriod>\n <NMSIPAddress>0.0.0.0</NMSIPAddress>\n <TempSaveSize>128M</TempSaveSize>\n <!-- default SWSDL max element save size -->\n <WaitPeriod>10</WaitPeriod>\n <!-- in seconds -->\n <ProcessRestartCount>10</ProcessRestartCount>\n <ProcessRestartPeriod>120</ProcessRestartPeriod>\n <SwapAction>restartSystem</SwapAction>\n <!-- OAM command (or \'none\') to run when swap space exceeds Major Threshold -->\n <ActivePmFailoverDisabled>n</ActivePmFailoverDisabled>\n <MemoryCheckPercent>95</MemoryCheckPercent>\n <!-- Max real memory to limit growth of buffers to -->\n <DataFileLog>OFF</DataFileLog>\n <!-- enable if you want to limit how much memory may be used for hdfs read/write memory buffers.\n \t\t<hdfsRdwrBufferMaxSize>8G</hdfsRdwrBufferMaxSize>\n\t\t-->\n <hdfsRdwrScratch>/rdwrscratch</hdfsRdwrScratch>\n <!-- Do not set to an hdfs file path -->\n <TempFileDir>/columnstore_tmp_files</TempFileDir>\n <SystemTempFileDir>/tmp/columnstore_tmp_files</SystemTempFileDir>\n <DBRoot2>/var/lib/columnstore/data2</DBRoot2>\n </SystemConfig>\n <SystemModuleConfig>\n <ModuleType1>dm</ModuleType1>\n <ModuleDesc1>Director Module</ModuleDesc1>\n <RunType1>SIMPLEX</RunType1>\n <ModuleCount1>0</ModuleCount1>\n <ModuleIPAddr1-1-1>0.0.0.0</ModuleIPAddr1-1-1>\n <ModuleHostName1-1-1>unassigned</ModuleHostName1-1-1>\n <ModuleDisableState1-1>ENABLED</ModuleDisableState1-1>\n <ModuleCPUCriticalThreshold1>0</ModuleCPUCriticalThreshold1>\n 
<ModuleCPUMajorThreshold1>0</ModuleCPUMajorThreshold1>\n <ModuleCPUMinorThreshold1>0</ModuleCPUMinorThreshold1>\n <ModuleCPUMinorClearThreshold1>0</ModuleCPUMinorClearThreshold1>\n <ModuleDiskCriticalThreshold1>90</ModuleDiskCriticalThreshold1>\n <ModuleDiskMajorThreshold1>80</ModuleDiskMajorThreshold1>\n <ModuleDiskMinorThreshold1>70</ModuleDiskMinorThreshold1>\n <ModuleMemCriticalThreshold1>90</ModuleMemCriticalThreshold1>\n <ModuleMemMajorThreshold1>0</ModuleMemMajorThreshold1>\n <ModuleMemMinorThreshold1>0</ModuleMemMinorThreshold1>\n <ModuleSwapCriticalThreshold1>90</ModuleSwapCriticalThreshold1>\n <ModuleSwapMajorThreshold1>80</ModuleSwapMajorThreshold1>\n <ModuleSwapMinorThreshold1>70</ModuleSwapMinorThreshold1>\n <ModuleDiskMonitorFileSystem1-1>/</ModuleDiskMonitorFileSystem1-1>\n <ModuleDBRootCount1-1>unassigned</ModuleDBRootCount1-1>\n <ModuleDBRootID1-1-1>unassigned</ModuleDBRootID1-1-1>\n <ModuleType2>um</ModuleType2>\n <ModuleDesc2>User Module</ModuleDesc2>\n <RunType2>SIMPLEX</RunType2>\n <ModuleCount2>0</ModuleCount2>\n <ModuleIPAddr1-1-2>0.0.0.0</ModuleIPAddr1-1-2>\n <ModuleHostName1-1-2>unassigned</ModuleHostName1-1-2>\n <ModuleDisa
Jul 30 21:32:54 mcs-ubuntu2004-2 env[9361]: WriteEngineServer is ready
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root -- Matching against ModuleIPAddr2-1-3, which says 10.10.10.11
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root Wrote 'pm2' to /var/lib/columnstore/local/module
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root apply Running stop on mcs-dmlproc. With sudo False.
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root stop running systemctl stop mcs-dmlproc
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root apply Running stop on mcs-ddlproc. With sudo False.
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root stop running systemctl stop mcs-ddlproc
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root apply Running stop on mcs-primproc. With sudo False.
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root stop running systemctl stop mcs-primproc
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: Stopping WriteEngineServer...
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: mcs-writeengineserver.service: Succeeded.
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: Stopped WriteEngineServer.
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: Stopping mcs-exemgr...
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: mcs-exemgr.service: Succeeded.
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: Stopped mcs-exemgr.
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: Stopping mcs-primproc...
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: mcs-primproc.service: Succeeded.
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: Stopped mcs-primproc.
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root apply Running stop on mcs-writeengineserver. With sudo False.
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root stop running systemctl stop mcs-writeengineserver
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root apply Running stop on mcs-exemgr. With sudo False.
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root stop running systemctl stop mcs-exemgr
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root apply Running stop on mcs-controllernode. With sudo False.
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root stop running systemctl stop mcs-controllernode
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root apply Running stop on mcs-workernode. With sudo False.
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root stop running systemctl stop mcs-workernode
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: Stopping mcs-workernode...
Jul 30 21:32:54 mcs-ubuntu2004-2 controllernode[9417]: 54.468588 |0|0|0| C 29 CAL0000: ExtentMap::save(): got request to save an empty BRM
Jul 30 21:32:54 mcs-ubuntu2004-2 save_brm[9417]: ExtentMap::save(): got request to save an empty BRM
Jul 30 21:32:54 mcs-ubuntu2004-2 save_brm[9417]: Save failed
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: mcs-workernode.service: Succeeded.
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: Stopped mcs-workernode.
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root apply Running stop on mcs-storagemanager. With sudo False.
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root stop running systemctl stop mcs-storagemanager
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root apply Running start on mcs-workernode. With sudo False.
Jul 30 21:32:54 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:54] root start running systemctl start mcs-workernode
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: Starting loadbrm...
Jul 30 21:32:54 mcs-ubuntu2004-2 env[9438]: Failed to load meta data from the primary node mcs-ubuntu2004-1.
Jul 30 21:32:54 mcs-ubuntu2004-2 env[9438]: Pulling em from the primary node.
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: mcs-loadbrm.service: Main process exited, code=exited, status=1/FAILURE
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: mcs-loadbrm.service: Failed with result 'exit-code'.
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: Failed to start loadbrm.
Jul 30 21:32:54 mcs-ubuntu2004-2 systemd[1]: Started mcs-workernode.
Jul 30 21:32:55 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:55] root apply Running start on mcs-primproc. With sudo False.
Jul 30 21:32:55 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:55] root start running systemctl start mcs-primproc
Jul 30 21:32:55 mcs-ubuntu2004-2 systemd[1]: Starting mcs-primproc...
Jul 30 21:32:55 mcs-ubuntu2004-2 env[9451]: Starting PrimitiveServer: st = 1, sq = 10, pw = 128, pq = 10240, nb = 62789, nt = 4, nc = 1, ra = 512, db = 128, mb = 512, rd = 0, tr = 0, ss = 67108864, bp = 4
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: Started mcs-primproc.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root apply Running start on mcs-exemgr. With sudo False.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root start running systemctl start mcs-exemgr
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: Started mcs-exemgr.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root apply Running start on mcs-writeengineserver. With sudo False.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root start running systemctl start mcs-writeengineserver
Jul 30 21:32:57 mcs-ubuntu2004-2 joblist[9483]: 57.098715 |0|0|0| W 05 CAL0000: /home/jenkins/workspace/MariaDBE-Custom-DEB/label/ubuntu-2004/MariaDBEnterprise/storage/columnstore/columnstore/dbcon/joblist/distributedenginecomm.cpp @ 299 Could not connect to PMS1: Connection refused
Jul 30 21:32:57 mcs-ubuntu2004-2 env[9483]: Could not connect to PMS1: Connection refused
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: Started WriteEngineServer.
Jul 30 21:32:57 mcs-ubuntu2004-2 joblist[9483]: 57.110449 |0|0|0| W 05 CAL0000: /home/jenkins/workspace/MariaDBE-Custom-DEB/label/ubuntu-2004/MariaDBEnterprise/storage/columnstore/columnstore/dbcon/joblist/distributedenginecomm.cpp @ 299 Could not connect to PMS3: Connection refused
Jul 30 21:32:57 mcs-ubuntu2004-2 env[9483]: Could not connect to PMS3: Connection refused
Jul 30 21:32:57 mcs-ubuntu2004-2 env[9483]: Starting ExeMgr: st = 50, qs = 20, mx = 95, cf = /etc/columnstore/Columnstore.xml
Jul 30 21:32:57 mcs-ubuntu2004-2 messagequeue[9483]: 57.113722 |0|0|0| E 31 CAL0000: MessageQueueClient::setup(): Temporary failure in name resolution
Jul 30 21:32:57 mcs-ubuntu2004-2 env[9483]: DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Address family not supported by protocol to: InetStreamSocket: sd: 12 inet: 212.127.0.0 port: 20713
Jul 30 21:32:57 mcs-ubuntu2004-2 messagequeue[9483]: 57.115360 |0|0|0| E 31 CAL0000: MessageQueueClient::setup(): Temporary failure in name resolution
Jul 30 21:32:57 mcs-ubuntu2004-2 controllernode[9483]: 57.115474 |0|0|0| E 29 CAL0000: DBRM: error: SessionManager::setSystemState() failed (network)
Jul 30 21:32:57 mcs-ubuntu2004-2 env[9483]: DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Address family not supported by protocol to: InetStreamSocket: sd: 12 inet: 212.127.0.0 port: 20713
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root apply Trying...
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020:21:32:57] HTTP
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: Traceback (most recent call last):
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/_cprequest.py", line 638, in respond
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: self._do_respond(path_info)
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/_cprequest.py", line 697, in _do_respond
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: response.body = self.handler()
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/lib/encoding.py", line 219, in __call__
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: self.body = self.oldhandler(*args, **kwargs)
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/lib/jsontools.py", line 59, in json_handler
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: value = cherrypy.serving.request._json_inner_handler(*args, **kwargs)
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/_cpdispatch.py", line 54, in __call__
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: return self.callable(*self.args, **self.kwargs)
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/cmapi_server/controllers/endpoints.py", line 282, in put_config
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: msgs = list(os_operations.apply(actions, **kwargs))
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/mcs_node_control/models/os_operations.py", line 99, in apply
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: int(controllernode['Port'])))
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: socket.gaierror: [Errno -3] Temporary failure in name resolution
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020:21:32:57] HTTP
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: Request Headers:
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: Remote-Addr: 10.10.10.10
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: HOST: mcs2:8640
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: USER-AGENT: python-requests/2.23.0
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: ACCEPT-ENCODING: gzip, deflate
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: ACCEPT: */*
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: CONNECTION: keep-alive
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: X-API-KEY: '93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd'
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: Content-Type: application/json
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: Content-Length: 20080
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020:21:32:57] "PUT /cmapi/0.4.0/node/config HTTP/1.1" 500 513 "" "python-requests/2.23.0"
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020 21:32:57] cmapi_server DEBUG put_config starts
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020 21:32:57] cmapi_server DEBUG put_config JSON body {'manager': 'mcs-ubuntu2004-1', 'revision': '1', 'timeout': 300, 'config': '<?xml version="1.0" ?>\n<Columnstore Version="V1.0.0">\n <!--\n\tWARNING: Do not make changes to this file unless directed to do so by\n\tMariaDB service engineers. Incorrect settings can render your system\n\tunusable and will require a service call to correct.\n-->\n <ExeMgr1>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8601</Port>\n <Module>unassigned</Module>\n </ExeMgr1>\n <JobProc>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8602</Port>\n </JobProc>\n <ProcMgr>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8603</Port>\n </ProcMgr>\n <ProcMgr_Alarm>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8606</Port>\n </ProcMgr_Alarm>\n <ProcStatusControl>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8604</Port>\n </ProcStatusControl>\n <ProcStatusControlStandby>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8605</Port>\n </ProcStatusControlStandby>\n <!-- Disabled\n\t<ProcHeartbeatControl>\n\t\t<IPAddr>0.0.0.0</IPAddr>\n\t\t<Port>8605</Port>\n\t</ProcHeartbeatControl>\n\t-->\n <!-- ProcessMonitor Port: 8800 - 8820 is reserved to support External Modules-->\n <localhost_ProcessMonitor>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8800</Port>\n </localhost_ProcessMonitor>\n <dm1_ProcessMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8800</Port>\n </dm1_ProcessMonitor>\n <um1_ProcessMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8800</Port>\n </um1_ProcessMonitor>\n <pm1_ProcessMonitor>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8800</Port>\n </pm1_ProcessMonitor>\n <dm1_ServerMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8622</Port>\n </dm1_ServerMonitor>\n <um1_ServerMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8622</Port>\n </um1_ServerMonitor>\n <pm1_ServerMonitor>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8622</Port>\n </pm1_ServerMonitor>\n <DDLProc>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n 
<Port>8612</Port>\n </DDLProc>\n <DMLProc>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8614</Port>\n </DMLProc>\n <BatchInsert>\n <RowsPerBatch>10000</RowsPerBatch>\n </BatchInsert>\n <PrimitiveServers>\n <Count>2</Count>\n <ConnectionsPerPrimProc>2</ConnectionsPerPrimProc>\n <ProcessorThreshold>128</ProcessorThreshold>\n <ProcessorQueueSize>10K</ProcessorQueueSize>\n <!-- minimum of extent size 8192 -->\n <DebugLevel>0</DebugLevel>\n <LBID_Shift>13</LBID_Shift>\n <ColScanBufferSizeBlocks>512</ColScanBufferSizeBlocks>\n <ColScanReadAheadBlocks>512</ColScanReadAheadBlocks>\n <!-- s/b factor of extent size 8192 -->\n <!-- <BPPCount>16</BPPCount> -->\n <!-- Default num cores * 2. A cap on the number of simultaneous primitives per jobstep -->\n <PrefetchThreshold>1</PrefetchThreshold>\n <PTTrace>0</PTTrace>\n <RotatingDestination>n</RotatingDestination>\n <!-- Iterate thru UM ports; set to \'n\' if UM/PM on same server -->\n <!-- <HighPriorityPercentage>60</HighPriorityPercentage> -->\n <!-- <MediumPriorityPercentage>30</MediumPriorityPercentage> -->\n <!-- <LowPriorityPercentage>10</LowPriorityPercentage> -->\n <DirectIO>y</DirectIO>\n <HighPriorityPercentage/>\n <MediumPriorityPercentage/>\n <LowPriorityPercentage/>\n </PrimitiveServers>\n <SystemConfig>\n <SystemName>columnstore-1</SystemName>\n <ParentOAMModuleName>pm1</ParentOAMModuleName>\n <StandbyOAMModuleName>unassigned</StandbyOAMModuleName>\n <PrimaryUMModuleName>pm1</PrimaryUMModuleName>\n <ModuleHeartbeatPeriod>1</ModuleHeartbeatPeriod>\n <ModuleHeartbeatCount>3</ModuleHeartbeatCount>\n <ModuleProcMonWaitCount>12</ModuleProcMonWaitCount>\n \t// 2.5 minutes\n <!-- Disabled\n\t\t<ProcessHeartbeatPeriod>-1</ProcessHeartbeatPeriod>\n\t\t-->\n <!-- Warning: Do not change this value once database is built -->\n <DBRootCount>2</DBRootCount>\n <DBRoot1>/var/lib/columnstore/data1</DBRoot1>\n <DBRMRoot>/var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves</DBRMRoot>\n 
<TableLockSaveFile>/var/lib/columnstore/data1/systemFiles/dbrm/tablelocks</TableLockSaveFile>\n <DBRMTimeOut>15</DBRMTimeOut>\n <!-- in seconds -->\n <DBRMSnapshotInterval>100000</DBRMSnapshotInterval>\n <ExternalCriticalThreshold>90</ExternalCriticalThreshold>\n <ExternalMajorThreshold>80</ExternalMajorThreshold>\n <ExternalMinorThreshold>70</ExternalMinorThreshold>\n <!-- <TempDiskPath>/tmp</TempDiskPath>\n\t\t<WorkingDir>/tmp</WorkingDir>\n\t\t<TempFileDir>/tmp/columnstore_tmp_files</TempFileDir>\n\t\t-->\n <TransactionArchivePeriod>10</TransactionArchivePeriod>\n <NMSIPAddress>0.0.0.0</NMSIPAddress>\n <TempSaveSize>128M</TempSaveSize>\n <!-- default SWSDL max element save size -->\n <WaitPeriod>10</WaitPeriod>\n <!-- in seconds -->\n <ProcessRestartCount>10</ProcessRestartCount>\n <ProcessRestartPeriod>120</ProcessRestartPeriod>\n <SwapAction>restartSystem</SwapAction>\n <!-- OAM command (or \'none\') to run when swap space exceeds Major Threshold -->\n <ActivePmFailoverDisabled>n</ActivePmFailoverDisabled>\n <MemoryCheckPercent>95</MemoryCheckPercent>\n <!-- Max real memory to limit growth of buffers to -->\n <DataFileLog>OFF</DataFileLog>\n <!-- enable if you want to limit how much memory may be used for hdfs read/write memory buffers.\n \t\t<hdfsRdwrBufferMaxSize>8G</hdfsRdwrBufferMaxSize>\n\t\t-->\n <hdfsRdwrScratch>/rdwrscratch</hdfsRdwrScratch>\n <!-- Do not set to an hdfs file path -->\n <TempFileDir>/columnstore_tmp_files</TempFileDir>\n <SystemTempFileDir>/tmp/columnstore_tmp_files</SystemTempFileDir>\n <DBRoot2>/var/lib/columnstore/data2</DBRoot2>\n </SystemConfig>\n <SystemModuleConfig>\n <ModuleType1>dm</ModuleType1>\n <ModuleDesc1>Director Module</ModuleDesc1>\n <RunType1>SIMPLEX</RunType1>\n <ModuleCount1>0</ModuleCount1>\n <ModuleIPAddr1-1-1>0.0.0.0</ModuleIPAddr1-1-1>\n <ModuleHostName1-1-1>unassigned</ModuleHostName1-1-1>\n <ModuleDisableState1-1>ENABLED</ModuleDisableState1-1>\n <ModuleCPUCriticalThreshold1>0</ModuleCPUCriticalThreshold1>\n 
<ModuleCPUMajorThreshold1>0</ModuleCPUMajorThreshold1>\n <ModuleCPUMinorThreshold1>0</ModuleCPUMinorThreshold1>\n <ModuleCPUMinorClearThreshold1>0</ModuleCPUMinorClearThreshold1>\n <ModuleDiskCriticalThreshold1>90</ModuleDiskCriticalThreshold1>\n <ModuleDiskMajorThreshold1>80</ModuleDiskMajorThreshold1>\n <ModuleDiskMinorThreshold1>70</ModuleDiskMinorThreshold1>\n <ModuleMemCriticalThreshold1>90</ModuleMemCriticalThreshold1>\n <ModuleMemMajorThreshold1>0</ModuleMemMajorThreshold1>\n <ModuleMemMinorThreshold1>0</ModuleMemMinorThreshold1>\n <ModuleSwapCriticalThreshold1>90</ModuleSwapCriticalThreshold1>\n <ModuleSwapMajorThreshold1>80</ModuleSwapMajorThreshold1>\n <ModuleSwapMinorThreshold1>70</ModuleSwapMinorThreshold1>\n <ModuleDiskMonitorFileSystem1-1>/</ModuleDiskMonitorFileSystem1-1>\n <ModuleDBRootCount1-1>unassigned</ModuleDBRootCount1-1>\n <ModuleDBRootID1-1-1>unassigned</ModuleDBRootID1-1-1>\n <ModuleType2>um</ModuleType2>\n <ModuleDesc2>User Module</ModuleDesc2>\n <RunType2>SIMPLEX</RunType2>\n <ModuleCount2>0</ModuleCount2>\n <ModuleIPAddr1-1-2>0.0.0.0</ModuleIPAddr1-1-2>\n <ModuleHostName1-1-2>unassigned</ModuleHostName1-1-2>\n <ModuleDisa
Jul 30 21:32:57 mcs-ubuntu2004-2 env[9498]: WriteEngineServer is ready
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root -- Matching against ModuleIPAddr2-1-3, which says 10.10.10.11
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root Wrote 'pm2' to /var/lib/columnstore/local/module
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root apply Running stop on mcs-dmlproc. With sudo False.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root stop running systemctl stop mcs-dmlproc
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root apply Running stop on mcs-ddlproc. With sudo False.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root stop running systemctl stop mcs-ddlproc
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root apply Running stop on mcs-primproc. With sudo False.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root stop running systemctl stop mcs-primproc
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: Stopping WriteEngineServer...
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: mcs-writeengineserver.service: Succeeded.
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: Stopped WriteEngineServer.
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: Stopping mcs-exemgr...
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: mcs-exemgr.service: Succeeded.
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: Stopped mcs-exemgr.
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: Stopping mcs-primproc...
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: mcs-primproc.service: Succeeded.
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: Stopped mcs-primproc.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root apply Running stop on mcs-writeengineserver. With sudo False.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root stop running systemctl stop mcs-writeengineserver
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root apply Running stop on mcs-exemgr. With sudo False.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root stop running systemctl stop mcs-exemgr
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root apply Running stop on mcs-controllernode. With sudo False.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root stop running systemctl stop mcs-controllernode
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root apply Running stop on mcs-workernode. With sudo False.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root stop running systemctl stop mcs-workernode
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: Stopping mcs-workernode...
Jul 30 21:32:57 mcs-ubuntu2004-2 controllernode[9550]: 57.270673 |0|0|0| C 29 CAL0000: ExtentMap::save(): got request to save an empty BRM
Jul 30 21:32:57 mcs-ubuntu2004-2 save_brm[9550]: ExtentMap::save(): got request to save an empty BRM
Jul 30 21:32:57 mcs-ubuntu2004-2 save_brm[9550]: Save failed
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: mcs-workernode.service: Succeeded.
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: Stopped mcs-workernode.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root apply Running stop on mcs-storagemanager. With sudo False.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root stop running systemctl stop mcs-storagemanager
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root apply Running start on mcs-workernode. With sudo False.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root start running systemctl start mcs-workernode
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: Starting loadbrm...
Jul 30 21:32:57 mcs-ubuntu2004-2 env[9571]: Failed to load meta data from the primary node mcs-ubuntu2004-1.
Jul 30 21:32:57 mcs-ubuntu2004-2 env[9571]: Pulling em from the primary node.
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: mcs-loadbrm.service: Main process exited, code=exited, status=1/FAILURE
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: mcs-loadbrm.service: Failed with result 'exit-code'.
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: Failed to start loadbrm.
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: Started mcs-workernode.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root apply Running start on mcs-primproc. With sudo False.
Jul 30 21:32:57 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:57] root start running systemctl start mcs-primproc
Jul 30 21:32:57 mcs-ubuntu2004-2 systemd[1]: Starting mcs-primproc...
Jul 30 21:32:57 mcs-ubuntu2004-2 env[9584]: Starting PrimitiveServer: st = 1, sq = 10, pw = 128, pq = 10240, nb = 62789, nt = 4, nc = 1, ra = 512, db = 128, mb = 512, rd = 0, tr = 0, ss = 67108864, bp = 4
Jul 30 21:32:59 mcs-ubuntu2004-2 systemd[1]: Started mcs-primproc.
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root apply Running start on mcs-exemgr. With sudo False.
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root start running systemctl start mcs-exemgr
Jul 30 21:32:59 mcs-ubuntu2004-2 systemd[1]: Started mcs-exemgr.
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root apply Running start on mcs-writeengineserver. With sudo False.
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root start running systemctl start mcs-writeengineserver
Jul 30 21:32:59 mcs-ubuntu2004-2 joblist[9617]: 59.890644 |0|0|0| W 05 CAL0000: /home/jenkins/workspace/MariaDBE-Custom-DEB/label/ubuntu-2004/MariaDBEnterprise/storage/columnstore/columnstore/dbcon/joblist/distributedenginecomm.cpp @ 299 Could not connect to PMS1: Connection refused
Jul 30 21:32:59 mcs-ubuntu2004-2 env[9617]: Could not connect to PMS1: Connection refused
Jul 30 21:32:59 mcs-ubuntu2004-2 joblist[9617]: 59.891481 |0|0|0| W 05 CAL0000: /home/jenkins/workspace/MariaDBE-Custom-DEB/label/ubuntu-2004/MariaDBEnterprise/storage/columnstore/columnstore/dbcon/joblist/distributedenginecomm.cpp @ 299 Could not connect to PMS3: Connection refused
Jul 30 21:32:59 mcs-ubuntu2004-2 env[9617]: Could not connect to PMS3: Connection refused
Jul 30 21:32:59 mcs-ubuntu2004-2 env[9617]: Starting ExeMgr: st = 50, qs = 20, mx = 95, cf = /etc/columnstore/Columnstore.xml
Jul 30 21:32:59 mcs-ubuntu2004-2 messagequeue[9617]: 59.893018 |0|0|0| E 31 CAL0000: MessageQueueClient::setup(): Temporary failure in name resolution
Jul 30 21:32:59 mcs-ubuntu2004-2 env[9617]: DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Address family not supported by protocol to: InetStreamSocket: sd: 12 inet: 77.127.0.0 port: 53263
Jul 30 21:32:59 mcs-ubuntu2004-2 env[9617]: DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Address family not supported by protocol to: InetStreamSocket: sd: 12 inet: 77.127.0.0 port: 53263
Jul 30 21:32:59 mcs-ubuntu2004-2 systemd[1]: Started WriteEngineServer.
Jul 30 21:32:59 mcs-ubuntu2004-2 messagequeue[9617]: 59.898371 |0|0|0| E 31 CAL0000: MessageQueueClient::setup(): Temporary failure in name resolution
Jul 30 21:32:59 mcs-ubuntu2004-2 controllernode[9617]: 59.898563 |0|0|0| E 29 CAL0000: DBRM: error: SessionManager::setSystemState() failed (network)
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root apply Trying...
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020:21:32:59] HTTP
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: Traceback (most recent call last):
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/_cprequest.py", line 638, in respond
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: self._do_respond(path_info)
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/_cprequest.py", line 697, in _do_respond
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: response.body = self.handler()
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/lib/encoding.py", line 219, in __call__
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: self.body = self.oldhandler(*args, **kwargs)
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/lib/jsontools.py", line 59, in json_handler
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: value = cherrypy.serving.request._json_inner_handler(*args, **kwargs)
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/_cpdispatch.py", line 54, in __call__
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: return self.callable(*self.args, **self.kwargs)
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/cmapi_server/controllers/endpoints.py", line 282, in put_config
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: msgs = list(os_operations.apply(actions, **kwargs))
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/mcs_node_control/models/os_operations.py", line 99, in apply
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: int(controllernode['Port'])))
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: socket.gaierror: [Errno -3] Temporary failure in name resolution
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020:21:32:59] HTTP
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: Request Headers:
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: Remote-Addr: 10.10.10.10
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: HOST: mcs2:8640
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: USER-AGENT: python-requests/2.23.0
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: ACCEPT-ENCODING: gzip, deflate
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: ACCEPT: */*
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: CONNECTION: keep-alive
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: X-API-KEY: '93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd'
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: Content-Type: application/json
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: Content-Length: 20080
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020:21:32:59] "PUT /cmapi/0.4.0/node/config HTTP/1.1" 500 513 "" "python-requests/2.23.0"
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020 21:32:59] cmapi_server DEBUG put_config starts
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020 21:32:59] cmapi_server DEBUG put_config JSON body {'manager': 'mcs-ubuntu2004-1', 'revision': '1', 'timeout': 300, 'config': '<?xml version="1.0" ?>\n<Columnstore Version="V1.0.0">\n <!--\n\tWARNING: Do not make changes to this file unless directed to do so by\n\tMariaDB service engineers. Incorrect settings can render your system\n\tunusable and will require a service call to correct.\n-->\n <ExeMgr1>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8601</Port>\n <Module>unassigned</Module>\n </ExeMgr1>\n <JobProc>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8602</Port>\n </JobProc>\n <ProcMgr>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8603</Port>\n </ProcMgr>\n <ProcMgr_Alarm>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8606</Port>\n </ProcMgr_Alarm>\n <ProcStatusControl>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8604</Port>\n </ProcStatusControl>\n <ProcStatusControlStandby>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8605</Port>\n </ProcStatusControlStandby>\n <!-- Disabled\n\t<ProcHeartbeatControl>\n\t\t<IPAddr>0.0.0.0</IPAddr>\n\t\t<Port>8605</Port>\n\t</ProcHeartbeatControl>\n\t-->\n <!-- ProcessMonitor Port: 8800 - 8820 is reserved to support External Modules-->\n <localhost_ProcessMonitor>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8800</Port>\n </localhost_ProcessMonitor>\n <dm1_ProcessMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8800</Port>\n </dm1_ProcessMonitor>\n <um1_ProcessMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8800</Port>\n </um1_ProcessMonitor>\n <pm1_ProcessMonitor>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8800</Port>\n </pm1_ProcessMonitor>\n <dm1_ServerMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8622</Port>\n </dm1_ServerMonitor>\n <um1_ServerMonitor>\n <IPAddr>0.0.0.0</IPAddr>\n <Port>8622</Port>\n </um1_ServerMonitor>\n <pm1_ServerMonitor>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8622</Port>\n </pm1_ServerMonitor>\n <DDLProc>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n 
<Port>8612</Port>\n </DDLProc>\n <DMLProc>\n <IPAddr>mcs-ubuntu2004-1</IPAddr>\n <Port>8614</Port>\n </DMLProc>\n <BatchInsert>\n <RowsPerBatch>10000</RowsPerBatch>\n </BatchInsert>\n <PrimitiveServers>\n <Count>2</Count>\n <ConnectionsPerPrimProc>2</ConnectionsPerPrimProc>\n <ProcessorThreshold>128</ProcessorThreshold>\n <ProcessorQueueSize>10K</ProcessorQueueSize>\n <!-- minimum of extent size 8192 -->\n <DebugLevel>0</DebugLevel>\n <LBID_Shift>13</LBID_Shift>\n <ColScanBufferSizeBlocks>512</ColScanBufferSizeBlocks>\n <ColScanReadAheadBlocks>512</ColScanReadAheadBlocks>\n <!-- s/b factor of extent size 8192 -->\n <!-- <BPPCount>16</BPPCount> -->\n <!-- Default num cores * 2. A cap on the number of simultaneous primitives per jobstep -->\n <PrefetchThreshold>1</PrefetchThreshold>\n <PTTrace>0</PTTrace>\n <RotatingDestination>n</RotatingDestination>\n <!-- Iterate thru UM ports; set to \'n\' if UM/PM on same server -->\n <!-- <HighPriorityPercentage>60</HighPriorityPercentage> -->\n <!-- <MediumPriorityPercentage>30</MediumPriorityPercentage> -->\n <!-- <LowPriorityPercentage>10</LowPriorityPercentage> -->\n <DirectIO>y</DirectIO>\n <HighPriorityPercentage/>\n <MediumPriorityPercentage/>\n <LowPriorityPercentage/>\n </PrimitiveServers>\n <SystemConfig>\n <SystemName>columnstore-1</SystemName>\n <ParentOAMModuleName>pm1</ParentOAMModuleName>\n <StandbyOAMModuleName>unassigned</StandbyOAMModuleName>\n <PrimaryUMModuleName>pm1</PrimaryUMModuleName>\n <ModuleHeartbeatPeriod>1</ModuleHeartbeatPeriod>\n <ModuleHeartbeatCount>3</ModuleHeartbeatCount>\n <ModuleProcMonWaitCount>12</ModuleProcMonWaitCount>\n \t// 2.5 minutes\n <!-- Disabled\n\t\t<ProcessHeartbeatPeriod>-1</ProcessHeartbeatPeriod>\n\t\t-->\n <!-- Warning: Do not change this value once database is built -->\n <DBRootCount>2</DBRootCount>\n <DBRoot1>/var/lib/columnstore/data1</DBRoot1>\n <DBRMRoot>/var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves</DBRMRoot>\n 
<TableLockSaveFile>/var/lib/columnstore/data1/systemFiles/dbrm/tablelocks</TableLockSaveFile>\n <DBRMTimeOut>15</DBRMTimeOut>\n <!-- in seconds -->\n <DBRMSnapshotInterval>100000</DBRMSnapshotInterval>\n <ExternalCriticalThreshold>90</ExternalCriticalThreshold>\n <ExternalMajorThreshold>80</ExternalMajorThreshold>\n <ExternalMinorThreshold>70</ExternalMinorThreshold>\n <!-- <TempDiskPath>/tmp</TempDiskPath>\n\t\t<WorkingDir>/tmp</WorkingDir>\n\t\t<TempFileDir>/tmp/columnstore_tmp_files</TempFileDir>\n\t\t-->\n <TransactionArchivePeriod>10</TransactionArchivePeriod>\n <NMSIPAddress>0.0.0.0</NMSIPAddress>\n <TempSaveSize>128M</TempSaveSize>\n <!-- default SWSDL max element save size -->\n <WaitPeriod>10</WaitPeriod>\n <!-- in seconds -->\n <ProcessRestartCount>10</ProcessRestartCount>\n <ProcessRestartPeriod>120</ProcessRestartPeriod>\n <SwapAction>restartSystem</SwapAction>\n <!-- OAM command (or \'none\') to run when swap space exceeds Major Threshold -->\n <ActivePmFailoverDisabled>n</ActivePmFailoverDisabled>\n <MemoryCheckPercent>95</MemoryCheckPercent>\n <!-- Max real memory to limit growth of buffers to -->\n <DataFileLog>OFF</DataFileLog>\n <!-- enable if you want to limit how much memory may be used for hdfs read/write memory buffers.\n \t\t<hdfsRdwrBufferMaxSize>8G</hdfsRdwrBufferMaxSize>\n\t\t-->\n <hdfsRdwrScratch>/rdwrscratch</hdfsRdwrScratch>\n <!-- Do not set to an hdfs file path -->\n <TempFileDir>/columnstore_tmp_files</TempFileDir>\n <SystemTempFileDir>/tmp/columnstore_tmp_files</SystemTempFileDir>\n <DBRoot2>/var/lib/columnstore/data2</DBRoot2>\n </SystemConfig>\n <SystemModuleConfig>\n <ModuleType1>dm</ModuleType1>\n <ModuleDesc1>Director Module</ModuleDesc1>\n <RunType1>SIMPLEX</RunType1>\n <ModuleCount1>0</ModuleCount1>\n <ModuleIPAddr1-1-1>0.0.0.0</ModuleIPAddr1-1-1>\n <ModuleHostName1-1-1>unassigned</ModuleHostName1-1-1>\n <ModuleDisableState1-1>ENABLED</ModuleDisableState1-1>\n <ModuleCPUCriticalThreshold1>0</ModuleCPUCriticalThreshold1>\n 
<ModuleCPUMajorThreshold1>0</ModuleCPUMajorThreshold1>\n <ModuleCPUMinorThreshold1>0</ModuleCPUMinorThreshold1>\n <ModuleCPUMinorClearThreshold1>0</ModuleCPUMinorClearThreshold1>\n <ModuleDiskCriticalThreshold1>90</ModuleDiskCriticalThreshold1>\n <ModuleDiskMajorThreshold1>80</ModuleDiskMajorThreshold1>\n <ModuleDiskMinorThreshold1>70</ModuleDiskMinorThreshold1>\n <ModuleMemCriticalThreshold1>90</ModuleMemCriticalThreshold1>\n <ModuleMemMajorThreshold1>0</ModuleMemMajorThreshold1>\n <ModuleMemMinorThreshold1>0</ModuleMemMinorThreshold1>\n <ModuleSwapCriticalThreshold1>90</ModuleSwapCriticalThreshold1>\n <ModuleSwapMajorThreshold1>80</ModuleSwapMajorThreshold1>\n <ModuleSwapMinorThreshold1>70</ModuleSwapMinorThreshold1>\n <ModuleDiskMonitorFileSystem1-1>/</ModuleDiskMonitorFileSystem1-1>\n <ModuleDBRootCount1-1>unassigned</ModuleDBRootCount1-1>\n <ModuleDBRootID1-1-1>unassigned</ModuleDBRootID1-1-1>\n <ModuleType2>um</ModuleType2>\n <ModuleDesc2>User Module</ModuleDesc2>\n <RunType2>SIMPLEX</RunType2>\n <ModuleCount2>0</ModuleCount2>\n <ModuleIPAddr1-1-2>0.0.0.0</ModuleIPAddr1-1-2>\n <ModuleHostName1-1-2>unassigned</ModuleHostName1-1-2>\n <ModuleDisa
|
Jul 30 21:32:59 mcs-ubuntu2004-2 env[9635]: WriteEngineServer is ready
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root -- Matching against ModuleIPAddr2-1-3, which says 10.10.10.11
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root Wrote 'pm2' to /var/lib/columnstore/local/module
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root apply Running stop on mcs-dmlproc. With sudo False.
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root stop running systemctl stop mcs-dmlproc
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root apply Running stop on mcs-ddlproc. With sudo False.
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root stop running systemctl stop mcs-ddlproc
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root apply Running stop on mcs-primproc. With sudo False.
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root stop running systemctl stop mcs-primproc
|
Jul 30 21:32:59 mcs-ubuntu2004-2 systemd[1]: Stopping WriteEngineServer...
|
Jul 30 21:32:59 mcs-ubuntu2004-2 systemd[1]: mcs-writeengineserver.service: Succeeded.
|
Jul 30 21:32:59 mcs-ubuntu2004-2 systemd[1]: Stopped WriteEngineServer.
|
Jul 30 21:32:59 mcs-ubuntu2004-2 systemd[1]: Stopping mcs-exemgr...
|
Jul 30 21:32:59 mcs-ubuntu2004-2 systemd[1]: mcs-exemgr.service: Succeeded.
|
Jul 30 21:32:59 mcs-ubuntu2004-2 systemd[1]: Stopped mcs-exemgr.
|
Jul 30 21:32:59 mcs-ubuntu2004-2 systemd[1]: Stopping mcs-primproc...
|
Jul 30 21:32:59 mcs-ubuntu2004-2 systemd[1]: mcs-primproc.service: Succeeded.
|
Jul 30 21:32:59 mcs-ubuntu2004-2 systemd[1]: Stopped mcs-primproc.
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root apply Running stop on mcs-writeengineserver. With sudo False.
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root stop running systemctl stop mcs-writeengineserver
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root apply Running stop on mcs-exemgr. With sudo False.
|
Jul 30 21:32:59 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:32:59] root stop running systemctl stop mcs-exemgr
|
Jul 30 21:33:00 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:00] root apply Running stop on mcs-controllernode. With sudo False.
|
Jul 30 21:33:00 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:00] root stop running systemctl stop mcs-controllernode
|
Jul 30 21:33:00 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:00] root apply Running stop on mcs-workernode. With sudo False.
|
Jul 30 21:33:00 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:00] root stop running systemctl stop mcs-workernode
|
Jul 30 21:33:00 mcs-ubuntu2004-2 systemd[1]: Stopping mcs-workernode...
|
Jul 30 21:33:00 mcs-ubuntu2004-2 controllernode[9684]: 00.055913 |0|0|0| C 29 CAL0000: ExtentMap::save(): got request to save an empty BRM
|
Jul 30 21:33:00 mcs-ubuntu2004-2 save_brm[9684]: ExtentMap::save(): got request to save an empty BRM
|
Jul 30 21:33:00 mcs-ubuntu2004-2 save_brm[9684]: Save failed
|
Jul 30 21:33:00 mcs-ubuntu2004-2 systemd[1]: mcs-workernode.service: Succeeded.
|
Jul 30 21:33:00 mcs-ubuntu2004-2 systemd[1]: Stopped mcs-workernode.
|
Jul 30 21:33:00 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:00] root apply Running stop on mcs-storagemanager. With sudo False.
|
Jul 30 21:33:00 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:00] root stop running systemctl stop mcs-storagemanager
|
Jul 30 21:33:00 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:00] root apply Running start on mcs-workernode. With sudo False.
|
Jul 30 21:33:00 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:00] root start running systemctl start mcs-workernode
|
Jul 30 21:33:00 mcs-ubuntu2004-2 systemd[1]: Starting loadbrm...
|
Jul 30 21:33:00 mcs-ubuntu2004-2 env[9705]: Failed to load meta data from the primary node mcs-ubuntu2004-1.
|
Jul 30 21:33:00 mcs-ubuntu2004-2 env[9705]: Pulling em from the primary node.
|
Jul 30 21:33:00 mcs-ubuntu2004-2 systemd[1]: mcs-loadbrm.service: Main process exited, code=exited, status=1/FAILURE
|
Jul 30 21:33:00 mcs-ubuntu2004-2 systemd[1]: mcs-loadbrm.service: Failed with result 'exit-code'.
|
Jul 30 21:33:00 mcs-ubuntu2004-2 systemd[1]: Failed to start loadbrm.
|
Jul 30 21:33:00 mcs-ubuntu2004-2 systemd[1]: Started mcs-workernode.
|
Jul 30 21:33:00 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:00] root apply Running start on mcs-primproc. With sudo False.
|
Jul 30 21:33:00 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:00] root start running systemctl start mcs-primproc
|
Jul 30 21:33:00 mcs-ubuntu2004-2 systemd[1]: Starting mcs-primproc...
|
Jul 30 21:33:00 mcs-ubuntu2004-2 env[9718]: Starting PrimitiveServer: st = 1, sq = 10, pw = 128, pq = 10240, nb = 62789, nt = 4, nc = 1, ra = 512, db = 128, mb = 512, rd = 0, tr = 0, ss = 67108864, bp = 4
|
Jul 30 21:33:02 mcs-ubuntu2004-2 systemd[1]: Started mcs-primproc.
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:02] root apply Running start on mcs-exemgr. With sudo False.
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:02] root start running systemctl start mcs-exemgr
|
Jul 30 21:33:02 mcs-ubuntu2004-2 systemd[1]: Started mcs-exemgr.
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:02] root apply Running start on mcs-writeengineserver. With sudo False.
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:02] root start running systemctl start mcs-writeengineserver
|
Jul 30 21:33:02 mcs-ubuntu2004-2 systemd[1]: Started WriteEngineServer.
|
Jul 30 21:33:02 mcs-ubuntu2004-2 env[9750]: Starting ExeMgr: st = 50, qs = 20, mx = 95, cf = /etc/columnstore/Columnstore.xml
|
Jul 30 21:33:02 mcs-ubuntu2004-2 env[9750]: DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Address family not supported by protocol to: InetStreamSocket: sd: 14 inet: 143.127.0.0 port: 53426
|
Jul 30 21:33:02 mcs-ubuntu2004-2 messagequeue[9750]: 02.662094 |0|0|0| E 31 CAL0000: MessageQueueClient::setup(): Temporary failure in name resolution
|
Jul 30 21:33:02 mcs-ubuntu2004-2 env[9750]: DBRM::send_recv caught: InetStreamSocket::connect: connect() error: Address family not supported by protocol to: InetStreamSocket: sd: 14 inet: 143.127.0.0 port: 53426
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:02] root apply Waiting for controllernode to come up before starting ddlproc/dmlproc on non-primary nodes.
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020 21:33:02] root apply Trying...
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020:21:33:02] HTTP
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: Traceback (most recent call last):
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/_cprequest.py", line 638, in respond
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: self._do_respond(path_info)
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/_cprequest.py", line 697, in _do_respond
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: response.body = self.handler()
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/lib/encoding.py", line 219, in __call__
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: self.body = self.oldhandler(*args, **kwargs)
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/lib/jsontools.py", line 59, in json_handler
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: value = cherrypy.serving.request._json_inner_handler(*args, **kwargs)
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/deps/cherrypy/_cpdispatch.py", line 54, in __call__
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: return self.callable(*self.args, **self.kwargs)
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/cmapi_server/controllers/endpoints.py", line 282, in put_config
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: msgs = list(os_operations.apply(actions, **kwargs))
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: File "/opt/cmapi/mcs_node_control/models/os_operations.py", line 99, in apply
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: int(controllernode['Port'])))
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: socket.gaierror: [Errno -3] Temporary failure in name resolution
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: [30/Jul/2020:21:33:02] HTTP
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: Request Headers:
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: Remote-Addr: 10.10.10.10
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: HOST: mcs2:8640
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: USER-AGENT: python-requests/2.23.0
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: ACCEPT-ENCODING: gzip, deflate
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: ACCEPT: */*
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: CONNECTION: keep-alive
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: X-API-KEY: '93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd'
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: Content-Type: application/json
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: Content-Length: 20080
|
Jul 30 21:33:02 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020:21:33:02] "PUT /cmapi/0.4.0/node/config HTTP/1.1" 500 513 "" "python-requests/2.23.0"
|
Jul 30 21:33:02 mcs-ubuntu2004-2 messagequeue[9750]: 02.662373 |0|0|0| E 31 CAL0000: MessageQueueClient::setup(): Temporary failure in name resolution
|
Jul 30 21:33:02 mcs-ubuntu2004-2 controllernode[9750]: 02.662423 |0|0|0| E 29 CAL0000: DBRM: error: SessionManager::setSystemState() failed (network)
|
Jul 30 21:33:02 mcs-ubuntu2004-2 env[9762]: WriteEngineServer is ready
|
Jul 30 21:33:03 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020:21:33:03] "PUT /cmapi/0.4.0/node/rollback HTTP/1.1" 200 43 "" "python-requests/2.23.0"
|
Jul 30 21:33:08 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020:21:33:08] "PUT /cmapi/0.4.0/node/rollback HTTP/1.1" 422 38 "" "python-requests/2.23.0"
|
Jul 30 21:33:09 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020:21:33:09] "PUT /cmapi/0.4.0/node/rollback HTTP/1.1" 422 38 "" "python-requests/2.23.0"
|
Jul 30 21:33:10 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020:21:33:10] "PUT /cmapi/0.4.0/node/rollback HTTP/1.1" 422 38 "" "python-requests/2.23.0"
|
Jul 30 21:33:11 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020:21:33:11] "PUT /cmapi/0.4.0/node/rollback HTTP/1.1" 422 38 "" "python-requests/2.23.0"
|
Jul 30 21:33:12 mcs-ubuntu2004-2 python3[7938]: 10.10.10.10 - - [30/Jul/2020:21:33:12] "PUT /cmapi/0.4.0/node/rollback HTTP/1.1" 422 38 "" "python-requests/2.23.0"
|
The CMAPI Server is apparently not able to handle systems that have multiple host names. Each of these servers has two host names.
For the primary:
- The name in /etc/hostname is mcs-ubuntu2004-1
- The name associated with the IP address in /etc/hosts is mcs1.
- The other nodes can only resolve the mcs1 host name.
- This node was added to the cluster using the mcs1 host name.
For the replica:
- The name in /etc/hostname is mcs-ubuntu2004-2
- The name associated with the IP address in /etc/hosts is mcs2.
- The other nodes can only resolve the mcs2 host name.
- This node was added to the cluster using the mcs2 host name.
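A quick way to confirm which of these names a given node can actually resolve is a small check like the one below. This is only a sketch: the `unresolvable` helper and its `resolve` parameter are names I made up for illustration, and it relies on nothing beyond the standard `socket.getaddrinfo` call, which raises the same `socket.gaierror` seen in the traceback above.

```python
import socket


def unresolvable(names, resolve=socket.getaddrinfo):
    """Return the host names that the local resolver cannot resolve.

    `resolve` is injectable (hypothetical parameter) so the check can be
    exercised without touching real DNS or /etc/hosts.
    """
    failed = []
    for name in names:
        try:
            # Port is irrelevant for a pure name lookup, so pass None.
            resolve(name, None)
        except socket.gaierror:
            failed.append(name)
    return failed
```

Running `unresolvable(["mcs1", "mcs2", "mcs-ubuntu2004-1", "mcs-ubuntu2004-2"])` on each node should list the names that node cannot reach. In my setup I would expect mcs2 to report mcs-ubuntu2004-1 as unresolvable.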
Since the primary node was added to the cluster using the mcs1 host name, I would expect the node's Columnstore.xml to use that host name. It does not. Instead, Columnstore.xml contains the mcs-ubuntu2004-1 host name (the one in /etc/hostname), which the other nodes cannot resolve. For example:
<ExeMgr1>
  <IPAddr>mcs-ubuntu2004-1</IPAddr>
  <Port>8601</Port>
  <Module>unassigned</Module>
</ExeMgr1>
<JobProc>
  <IPAddr>0.0.0.0</IPAddr>
  <Port>8602</Port>
</JobProc>
<ProcMgr>
  <IPAddr>mcs-ubuntu2004-1</IPAddr>
  <Port>8603</Port>
</ProcMgr>
<ProcMgr_Alarm>
  <IPAddr>mcs-ubuntu2004-1</IPAddr>
  <Port>8606</Port>
</ProcMgr_Alarm>
<ProcStatusControl>
  <IPAddr>mcs-ubuntu2004-1</IPAddr>
  <Port>8604</Port>
</ProcStatusControl>
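To see at a glance which host names ended up in the configuration, the `<IPAddr>` values can be pulled out of the XML with a few lines of Python. This is a minimal sketch using the standard `xml.etree.ElementTree` module. The `hosts_in_config` function and the `SAMPLE` string are illustrative names of mine, and it only looks at `<IPAddr>` elements directly under each top-level section, not at the `ModuleHostName...` or `ModuleIPAddr...` entries.

```python
import xml.etree.ElementTree as ET


def hosts_in_config(xml_text):
    """Map each top-level section tag to its <IPAddr> value.

    Sections without an <IPAddr>, or with the 0.0.0.0 placeholder,
    are skipped.
    """
    root = ET.fromstring(xml_text)
    hosts = {}
    for section in root:
        addr = section.findtext("IPAddr")
        if addr and addr != "0.0.0.0":
            hosts[section.tag] = addr
    return hosts


# Trimmed-down stand-in for /etc/columnstore/Columnstore.xml.
SAMPLE = """<Columnstore Version="V1.0.0">
  <ExeMgr1><IPAddr>mcs-ubuntu2004-1</IPAddr><Port>8601</Port></ExeMgr1>
  <JobProc><IPAddr>0.0.0.0</IPAddr><Port>8602</Port></JobProc>
  <ProcMgr><IPAddr>mcs-ubuntu2004-1</IPAddr><Port>8603</Port></ProcMgr>
</Columnstore>"""

print(hosts_in_config(SAMPLE))
```

Against the real file, the text would come from reading /etc/columnstore/Columnstore.xml. Any value that is not the name used in the add-node call is a candidate for the resolution failures shown in the syslog.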
I worked around this as follows:
1.) On each node, I executed the following commands to clear the bad configuration:
$ sudo systemctl stop mariadb-columnstore
$ sudo systemctl stop mariadb-columnstore-cmapi
$ sudo cp /etc/columnstore/Columnstore.xml.columnstoreSave /etc/columnstore/Columnstore.xml
2.) On each node, I changed the host name in /etc/hostname to the one that every other node can resolve.
3.) I rebooted each node.
4.) I re-added each node to the cluster.
This seemed to stop the DNS-related errors.
The CMAPI server should probably be smarter about how it handles nodes with multiple host names. If a node is added to the cluster using a specific host name, and the CMAPI server notices that the node also has a different host name, then it should write only the host name that the user specified in the add-node command to Columnstore.xml.
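To make the request concrete, here is a rough sketch of the behavior I have in mind. It is not how CMAPI is implemented. `pin_host_name`, `local_names`, and `requested_name` are hypothetical names, and the sketch only rewrites `<IPAddr>` elements, whereas a real fix would presumably also need to cover entries such as `ModuleHostName...`.

```python
import xml.etree.ElementTree as ET


def pin_host_name(xml_text, local_names, requested_name):
    """Replace any of the node's own names in <IPAddr> with requested_name.

    local_names:    names the node knows itself by (e.g. from /etc/hostname).
    requested_name: the name the user passed to add-node.
    Placeholder values such as 0.0.0.0 are left untouched.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter("IPAddr"):
        if elem.text in local_names:
            elem.text = requested_name
    return ET.tostring(root, encoding="unicode")


# Trimmed-down stand-in for the generated configuration.
BEFORE = """<Columnstore Version="V1.0.0">
  <ExeMgr1><IPAddr>mcs-ubuntu2004-1</IPAddr><Port>8601</Port></ExeMgr1>
  <JobProc><IPAddr>0.0.0.0</IPAddr><Port>8602</Port></JobProc>
</Columnstore>"""

print(pin_host_name(BEFORE, {"mcs-ubuntu2004-1"}, "mcs1"))
```

With this approach the configuration broadcast to the other nodes would only ever contain the name from the add-node call, which is the one name the user has effectively promised is resolvable cluster-wide.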
Issue Links
- relates to MCOL-4224 Improve error message: "Failed to open file: /var/lib/columnstore/data1/systemFiles/dbrm/tablelocks" (Closed)