[MCOL-5338] BRM_savesB_journal not found -timeout ? Created: 2022-12-07  Updated: 2024-02-05

Status: Stalled
Project: MariaDB ColumnStore
Component/s: cmapi
Affects Version/s: 6.1.1
Fix Version/s: 23.10

Type: Bug Priority: Blocker
Reporter: Allen Herrera Assignee: Alan Mologorsky
Resolution: Unresolved Votes: 0
Labels: mcs_trg, rm_stability, triage

Sprint: 2023-12

 Description   

ColumnStore stores internal metadata deltas in a file called BRM_saves_journal.
By default, a new BRM_saves version is saved every 100K BRM changes, starting with BRM_savesA and then BRM_savesB. The current version is tracked in BRM_saves_current.

The problem occurs when BRM_saves_current=BRM_savesB and the system is shut down improperly or the shutdown cannot complete. On startup, the system complains:

Dec  6 19:44:02 ip-10-111-0-54 python3: Error reading data1/systemFiles/dbrm/BRM_savesB_journal: No such file or directoryError reading
Dec  6 19:44:02 ip-10-111-0-54 python3: data1/systemFiles/dbrm/BRM_savesB_journal: No such file or directory
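
The condition behind this error can be checked before startup. The following is a minimal sketch, not part of ColumnStore or cmapi: the function name missing_journal is hypothetical, and it assumes that BRM_saves_current holds the active save prefix (either a bare name such as BRM_savesB or a full path ending in it) and that the journal is expected at <prefix>_journal in the same dbrm directory, as the log message suggests.

```python
from pathlib import Path


def missing_journal(dbrm_dir):
    """Return the expected journal path if it is missing, else None.

    Hypothetical helper: the file layout is inferred from the error
    message in this ticket, not from the ColumnStore sources.
    """
    dbrm = Path(dbrm_dir)
    current_file = dbrm / "BRM_saves_current"
    if not current_file.is_file():
        return None  # no current-version marker, nothing to check

    # Assumption: the marker file contains the active prefix, given
    # either as a bare name or as a full path; keep only the last part.
    prefix = Path(current_file.read_text().strip()).name
    if not prefix:
        return None

    journal = dbrm / f"{prefix}_journal"
    return None if journal.exists() else journal
```

Calling missing_journal on the dbrm directory (data1/systemFiles/dbrm in the log above) returns the path that startup would fail to read, or None if the journal is present.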



 Comments   
Comment by Daniel Lee (Inactive) [ 2022-12-08 ]

Assigning it to David.Hall for engineering assignment and scheduling.

Comment by Alan Mologorsky [ 2022-12-09 ]

alexey.vorovich
Here is the guide to manually increase the timeout:

  1. stop the cluster if running
  2. change the TimeoutStopSec value to 900 in /usr/lib/systemd/system/mcs-workernode@.service
  3. run sudo systemctl daemon-reload --all
  4. check that the timeout changed by running sudo systemctl show mcs-workernode@1.service -p TimeoutStopUSec
  5. start the cluster
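
Step 2 of the guide above can be scripted. This is only a sketch under stated assumptions: set_timeout_stop_sec is a hypothetical helper, it works on the unit file's text rather than on the file itself, and it assumes the unit has a [Service] section. It replaces an existing TimeoutStopSec line or adds one if none exists.

```python
import re


def set_timeout_stop_sec(unit_text, seconds=900):
    """Return unit_text with TimeoutStopSec set to `seconds`.

    Hypothetical helper for illustration; it does not touch the
    filesystem or call systemctl.
    """
    new_line = f"TimeoutStopSec={seconds}"

    # Replace the first existing TimeoutStopSec setting, if any.
    existing = re.compile(r"^TimeoutStopSec[ \t]*=.*$", re.MULTILINE)
    if existing.search(unit_text):
        return existing.sub(new_line, unit_text, count=1)

    # Otherwise add the setting directly under the [Service] header.
    header = re.compile(r"^\[Service\][ \t]*$", re.MULTILINE)
    if not header.search(unit_text):
        raise ValueError("unit file has no [Service] section")
    return header.sub(lambda m: m.group(0) + "\n" + new_line,
                      unit_text, count=1)
```

The result would be written back to /usr/lib/systemd/system/mcs-workernode@.service with root privileges while the cluster is stopped, followed by steps 3 to 5 as listed.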
Generated at Thu Feb 08 02:57:07 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.