Future improvement possibilities for storage-manager (MCOL-3449)

[MCOL-3459] think about data integrity (partial data problem) Created: 2019-08-26  Updated: 2023-10-25  Resolved: 2023-10-25

Status: Closed
Project: MariaDB ColumnStore
Component/s: ?
Affects Version/s: None
Fix Version/s: N/A

Type: Sub-Task
Priority: Major
Reporter: Patrick LeBlanc (Inactive)
Assignee: Leonid Fedorov
Resolution: Won't Fix
Votes: 1
Labels: None

Issue Links:
PartOf
includes MCOL-4021 rollback causing storagemanager to crash Closed
Relates
relates to MCOL-3711 Columnstore System Status remains Act... Closed

Description

Data is currently moved around as-is. It might be a good idea to verify the integrity of uploads and downloads. IIRC, S3 returns the MD5 of an object on a HEAD request. Verification would likely slow things down a little, so it should probably be an option rather than a requirement.
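As a rough illustration of the optional verification idea, the sketch below hashes a local file and compares it with a digest reported by the object store. It is not storage-manager code: `file_md5`, `verify_transfer`, the `remote_md5` argument, and the `enabled` flag are hypothetical names. It also assumes the store's reported value really is a plain MD5 of the object; for S3 that holds for the ETag of simple uploads, but not for multipart or KMS-encrypted objects.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(local_path, remote_md5, enabled=True):
    """Check a local file against the digest reported by the object store.

    remote_md5 is a hypothetical input: the hex digest the store reported
    (an S3 ETag is usually wrapped in double quotes, so they are stripped).
    Returns True without hashing when verification is disabled.
    """
    if not enabled:
        return True
    return file_md5(local_path) == remote_md5.strip('"').lower()
```

Making the check a flag keeps the extra read and hash off the hot path for deployments that prefer throughput over end-to-end verification.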

Update: we ran into a problem with data integrity after an SM crash; see MCOL-3711. Two tasks come from this: 1) SM needs to be robust against crashes during a write, and 2) it has to do the right thing when reading corrupted data.

For 1, we can write to tmp files, then move them to the right location once done, so that another SM instance would only see completed writes. For 2, we can add a checksum to the metadata entries and journal entries, and verify it on read. These are only initial thoughts; there may be better options.
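Both ideas can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the storage-manager implementation: `atomic_write`, `encode_entry`, and `decode_entry` are hypothetical helpers, the entry layout (a 4-byte CRC32 followed by a JSON payload) is invented for the example, and the rename is only atomic when the temp file lives on the same filesystem as the target.

```python
import json
import os
import tempfile
import zlib


def atomic_write(path, data):
    """Write bytes to a temp file beside the target, then rename into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before the rename
        # Readers see either the old file or the complete new one.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, remove the partial temp file and re-raise.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def encode_entry(entry):
    """Serialize an entry as a 4-byte big-endian CRC32 plus a JSON payload."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def decode_entry(blob):
    """Verify the stored CRC32 and return the entry; raise if corrupted."""
    if len(blob) < 4:
        raise ValueError("entry too short to contain a checksum")
    stored = int.from_bytes(blob[:4], "big")
    payload = blob[4:]
    if zlib.crc32(payload) != stored:
        raise ValueError("checksum mismatch: entry is corrupted")
    return json.loads(payload.decode("utf-8"))
```

A writer would call `atomic_write(path, encode_entry(entry))`, and a reader would pass the file contents to `decode_entry` and treat a `ValueError` as corrupted data. CRC32 catches accidental damage such as truncated or torn writes; it is not a defense against deliberate tampering.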



Comments
Comment by Patrick LeBlanc (Inactive) [ 2020-05-21 ]

I am fixing things related to this, which were the cause of a crash in the sky environment. It won't completely solve the general problem, but it's a related step forward.

Comment by Patrick LeBlanc (Inactive) [ 2020-05-21 ]

I re-read the description here, and I do have a better plan in mind that doesn't involve scanning and checksumming, and should guarantee that the data visible to reads was written correctly. It won't be resistant to storage-level corruption, hacking, etc., but that is a whole other level of data integrity, and probably not in our scope of responsibility (yet).
