Future improvement possibilities for storage-manager
(MCOL-3449)
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | ? |
| Affects Version/s: | None |
| Fix Version/s: | N/A |
| Type: | Sub-Task | Priority: | Major |
| Reporter: | Patrick LeBlanc (Inactive) | Assignee: | Leonid Fedorov |
| Resolution: | Won't Fix | Votes: | 1 |
| Labels: | None |
| Description |
|
Data is currently moved around as-is, with no integrity checks. It might be a good idea to verify the integrity of uploads and downloads. IIRC, S3 returns the MD5 of an object on a HEAD op. Verification would likely slow things down a little, so maybe make it an option rather than a requirement. Update: we ran into a problem with data integrity after an SM crash; see For 1, we can write to tmp files, then move them to the right location once done, so that another SM instance would only see completed writes. For 2, we can add a checksum to the metadata entries and journal entries, and verify it on read. These are only initial thoughts; there may be better options. |
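To make the two ideas in the description concrete, here is a minimal sketch in Python (storage-manager itself is not written in Python; this is illustration only). It assumes a local POSIX filesystem where a rename within one directory is atomic. The names `atomic_write`, `encode_entry`, and `decode_entry`, the JSON wrapper layout, and the choice of SHA-256 are all hypothetical and do not reflect storage-manager's actual metadata or journal formats.

```python
import hashlib
import json
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write to a temp file in the target directory, then rename into place.

    Readers never observe a partially written file at `path`: they see either
    the old contents or the complete new contents.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target,
    # otherwise the final rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before publishing
        os.replace(tmp_path, path)  # atomic rename on POSIX
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def encode_entry(payload: dict) -> bytes:
    """Serialize an entry together with a checksum of its body (assumed layout)."""
    body = json.dumps(payload, sort_keys=True)
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return json.dumps({"sha256": digest, "body": body}).encode("utf-8")


def decode_entry(raw: bytes) -> dict:
    """Parse an entry and verify its checksum; raise ValueError on mismatch."""
    wrapper = json.loads(raw)  # truncated input raises JSONDecodeError (a ValueError)
    body = wrapper["body"]
    if hashlib.sha256(body.encode("utf-8")).hexdigest() != wrapper["sha256"]:
        raise ValueError("checksum mismatch: entry is corrupt")
    return json.loads(body)
```

A writer would call `atomic_write(path, encode_entry(entry))`, and a reader would call `decode_entry` on the bytes it reads back. The temp-file step addresses partially written files being visible to another instance; the checksum step catches entries that were damaged after being written. Neither protects against deliberate tampering, since anyone who can rewrite the entry can also rewrite its checksum.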
| Comments |
| Comment by Patrick LeBlanc (Inactive) [ 2020-05-21 ] |
|
I am fixing things related to this, which were the cause of a crash in the sky environment. It won't completely solve the general problem, but it's a related step forward. |
| Comment by Patrick LeBlanc (Inactive) [ 2020-05-21 ] |
|
I re-read the description here, and I do have a better plan in mind that doesn't involve scanning and checksumming, and should guarantee that the data visible to reads was written correctly. It won't be resistant to storage-level corruption, hacking, etc., but that is a whole other level of data integrity, and probably not in our scope of responsibility (yet). |