[MCOL-5058] CMAPI and local smcat runs can access Storage Manager too early causing assertion in SM runtime Created: 2022-04-18  Updated: 2022-06-27

Status: Open
Project: MariaDB ColumnStore
Component/s: cmapi, Storage Manager
Affects Version/s: 5.6.5, 6.2.3
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Roman Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None


 Description   

Consider a part of the startup procedure for an S3-based installation [1]. CMAPI, upon receiving a cluster/start REST call, initiates node/start calls on all nodes. node/start in turn starts mcs-workernode@1 and mcs-workernode@2. These two units initiate the mcs-loadbrm systemd unit, which in turn starts the local SM by running systemctl start mcs-storagemanager. While SM bootstraps itself, there is a period during which it has not yet filled its internal prefix cache [2]. If an SM request [4] arrives before the prefix cache is populated, SM throws an assertion [3]. This failure causes the mcs-workernode@{1,2} units to fail [5]. The most severe consequence is that non-primary nodes may appear healthy while holding reduced and corrupted extent maps in /dev/shm, so any extent map write operation distributed by the controllernode will put the cluster into read-only mode.

Together with Alan, we introduced explicit delays between SM startup and the actual extent map image load at the customer's site. However, this workaround cannot serve as an appropriate long-term solution. IMHO there are two long-term solution options:

  • SM shouldn't assert at this point but should return an error, so that the layers above (e.g. smcat, or CMAPI calling smcat) are notified and can retry.
  • CMAPI must wait until the primary node's workernode is available.

The second approach doesn't solve the issue with ahead-of-time local smcat runs, though, so the first one looks more appropriate.
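As a rough sketch of the first option (names here are hypothetical stand-ins; the real change would live in storagemanager::Cache::getPCache() in Cache.cpp), the assert on a prefix-cache miss could become a nullable lookup, letting the caller translate the miss into a bounded retry instead of aborting the whole mcs-storagemanager process:

```cpp
#include <map>
#include <string>

// Minimal stand-in for storagemanager::PrefixCache.
struct PrefixCache { std::string prefix; };

class Cache {
 public:
  // Called as SM bootstraps and discovers the prefixes it owns.
  void addPrefix(const std::string& p) { prefixCaches[p] = PrefixCache{p}; }

  // Instead of `assert(it != prefixCaches.end())`, return nullptr on a miss
  // so the caller can report a retryable error.
  PrefixCache* tryGetPCache(const std::string& prefix) {
    auto it = prefixCaches.find(prefix);
    return it == prefixCaches.end() ? nullptr : &it->second;
  }

 private:
  std::map<std::string, PrefixCache> prefixCaches;
};

// Caller-side policy: poll until the prefix cache is populated, bounded by
// maxAttempts. attemptsUsed reports how many lookups were performed.
bool waitForPrefix(Cache& cache, const std::string& prefix,
                   int maxAttempts, int& attemptsUsed) {
  for (attemptsUsed = 1; attemptsUsed <= maxAttempts; ++attemptsUsed) {
    if (cache.tryGetPCache(prefix) != nullptr) return true;
    // A real implementation would sleep/back off between attempts here.
  }
  attemptsUsed = maxAttempts;
  return false;
}
```

With something like this in place, smcat could exit non-zero on a miss and CMAPI's node/start could retry, instead of the current behavior where the assertion kills mcs-storagemanager with SIGABRT.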

1. Here I consider systemd startup; however, the logic is the same for non-systemd container startup.
2. The prefix cache is the set of directory paths, e.g. data1/systemFiles/dbrm/, that SM owns and processes requests for.
3. Apr 13 16:58:42 nvmesh-target-c env[3222951]: StorageManager: /home/jenkins/workspace/Build-Package/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX_ON_ES_BACKUP_DEBUGSOURCE/storage/columnstore/columnstore/storage-manager/src/Cache.cpp:300: storagemanager::PrefixCache& storagemanager::Cache::getPCache(const boost::filesystem::path&): Assertion `it != prefixCaches.end()' failed.
Apr 13 16:58:42 nvmesh-target-c systemd[1]: mcs-storagemanager.service: Main process exited, code=killed, status=6/ABRT
Apr 13 16:58:46 nvmesh-target-c systemd[1]: mcs-storagemanager.service: Failed with result 'signal'.
4. Local node smcat runs, or remote nodes that query the meta/{em, vbbm, vss, journal} CMAPI REST endpoints.
5. Apr 13 16:17:16 nvmesh-target-c workernode[3195838]: SocketPool::getSocket() failed to connect; got 'Connection refused'
Apr 13 16:17:16 nvmesh-target-c workernode[3195838]: configcpp[3195851]: 16.594847 |0|0|0| E 12 SocketPool::getSocket() failed to connect; got 'Connection refused'
Apr 13 16:17:16 nvmesh-target-c configcpp[3195851]: 16.594847 |0|0|0| E 12 SocketPool::getSocket() failed to connect; got 'Connection refused'


Generated at Thu Feb 08 02:55:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.