Details
Type: New Feature
Status: Closed
Priority: Major
Resolution: Won't Fix
Description
The Extent Map is currently a linear structure, which does not look appropriate for large data sets.
The suggested design is to break the single linear array of EMEntries into 5 segments that correspond to widths of 1, 2, 4, 8 and 16 bytes.
ExtentMap will have two additional structures (presumably hash maps) that will be used to pick the appropriate linear array to work with. The first will map OID+partition+segment to width, and the second will map LBID to width. Most of the current EM methods must become dispatchers that look up the width and call the appropriate templated method specialization, which in turn will contain the logic from the current methods.
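The dispatcher idea can be illustrated with a minimal sketch. Everything here is hypothetical: `ExtentMapSketch`, `WidthEntry`, `addExtent` and `findExtent` are names made up for the illustration, the entry is reduced to a few fields, and only the LBID-to-width map is shown (an OID+partition+segment map would presumably be used the same way).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <unordered_map>
#include <vector>

// Hypothetical, heavily simplified entry: only what the sketch needs.
template <std::size_t Width>
struct WidthEntry
{
  int64_t firstLbid;
  std::array<uint8_t, Width> min;
  std::array<uint8_t, Width> max;
};

class ExtentMapSketch
{
 public:
  // Dispatcher: forwards to the width-specific template, then records the width.
  void addExtent(int64_t lbid, std::size_t width)
  {
    switch (width)
    {
      case 1: addImpl(seg1_, lbid); break;
      case 2: addImpl(seg2_, lbid); break;
      case 4: addImpl(seg4_, lbid); break;
      case 8: addImpl(seg8_, lbid); break;
      case 16: addImpl(seg16_, lbid); break;
      default: throw std::invalid_argument("unsupported width");
    }
    lbidToWidth_[lbid] = width;
  }

  // Dispatcher: looks the width up by LBID and searches only that segment.
  // Returns the index inside the width-specific array, or -1 if the LBID is unknown.
  long findExtent(int64_t lbid) const
  {
    const auto it = lbidToWidth_.find(lbid);
    if (it == lbidToWidth_.end()) return -1;
    switch (it->second)
    {
      case 1: return findImpl(seg1_, lbid);
      case 2: return findImpl(seg2_, lbid);
      case 4: return findImpl(seg4_, lbid);
      case 8: return findImpl(seg8_, lbid);
      case 16: return findImpl(seg16_, lbid);
      default: assert(false && "width map holds an unsupported width"); return -1;
    }
  }

  std::size_t widthOf(int64_t lbid) const { return lbidToWidth_.at(lbid); }

 private:
  // The templated methods would carry the logic of the current EM methods.
  template <std::size_t W>
  static void addImpl(std::vector<WidthEntry<W>>& seg, int64_t lbid)
  {
    seg.push_back(WidthEntry<W>{lbid, {}, {}});
  }

  template <std::size_t W>
  static long findImpl(const std::vector<WidthEntry<W>>& seg, int64_t lbid)
  {
    for (std::size_t i = 0; i < seg.size(); ++i)
      if (seg[i].firstLbid == lbid) return static_cast<long>(i);
    return -1;
  }

  std::vector<WidthEntry<1>> seg1_;
  std::vector<WidthEntry<2>> seg2_;
  std::vector<WidthEntry<4>> seg4_;
  std::vector<WidthEntry<8>> seg8_;
  std::vector<WidthEntry<16>> seg16_;
  std::unordered_map<int64_t, std::size_t> lbidToWidth_;  // LBID -> width
};
```

With this shape, a lookup touches one hash map and one width-specific array instead of scanning the whole linear structure.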
The patch should take care of:
- updating existing on-disk extent map layout on the first load
- storing an updated ExtentMap on disk
- changing the existing methods
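The first item, upgrading the layout on first load, could look roughly like the sketch below. It assumes each legacy entry records its own width. `LegacyEntry`, `colWidth`, `isSupportedWidth` and `splitByWidth` are hypothetical names for the illustration, not the actual EMEntry definition.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical legacy entry: only the fields the sketch needs.
struct LegacyEntry
{
  int64_t firstLbid;
  uint32_t colWidth;  // assumed to be 1, 2, 4, 8 or 16
};

inline bool isSupportedWidth(uint32_t w)
{
  return w == 1 || w == 2 || w == 4 || w == 8 || w == 16;
}

// Splits the legacy linear array into one array per width, preserving the
// relative order of the entries within each width.
inline std::map<uint32_t, std::vector<LegacyEntry>> splitByWidth(const std::vector<LegacyEntry>& legacy)
{
  std::map<uint32_t, std::vector<LegacyEntry>> segments;
  for (const LegacyEntry& e : legacy)
  {
    assert(isSupportedWidth(e.colWidth));
    segments[e.colWidth].push_back(e);
  }
  return segments;
}
```

The resulting per-width arrays would then be written back in the new layout, so the conversion runs only once.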
The disk layout of ExtentMap will change. Here is the current layout:
{
  "header":{
    "version":"int",
    "numberOfEMEntries":"int",
    "useless":"int"
  },
  "ExtentMapEntries":[
    {
      "EMEntry1":"string"
    },
    {
      "EMEntry2":"string"
    },
    {
      "EMEntry3":"string"
    }
  ]
}
The suggested layout will be:
{
  "header":{
    "version":"int",
    "offsets":[
      {
        "width_offset_type":"int",
        "width":"int",
        "offset":"int"
      },
      {
        "width_offset_type":"int",
        "width":"int",
        "offset":"int"
      },
      {
        "width_offset_type":"int",
        "width":"int",
        "offset":"int"
      },
      {
        "width_offset_type":"int",
        "width":"int",
        "offset":"int"
      }
    ]
  },
  "data":[
    {
      "WidthSpecificExtentMapEntries":[
        {
          "EMEntry1":"binary"
        },
        {
          "EMEntry2":"binary"
        },
        {
          "EMEntry3":"binary"
        }
      ]
    },
    {
      "WidthSpecificExtentMapEntries":[
        {
          "EMEntry1":"binary"
        },
        {
          "EMEntry2":"binary"
        },
        {
          "EMEntry3":"binary"
        }
      ]
    },
    {
      "WidthSpecificExtentMapEntries":[
        {
          "EMEntry1":"binary"
        },
        {
          "EMEntry2":"binary"
        },
        {
          "EMEntry3":"binary"
        }
      ]
    }
  ]
}
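One way to fill the "offsets" table is sketched below. It assumes that "offset" means the byte offset of a width-specific segment from the start of the data area and that entries within a segment have a fixed size. Neither is stated above, and `SegmentInfo`, `SegmentOffset` and `buildOffsets` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one width-specific segment before it is written.
struct SegmentInfo
{
  uint32_t width;       // 1, 2, 4, 8 or 16
  uint64_t entryCount;  // number of entries in the segment
  uint64_t entrySize;   // serialized size of one entry in bytes (assumed fixed)
};

// Mirrors the "width"/"offset" pair of the suggested header.
struct SegmentOffset
{
  uint32_t width;
  uint64_t offset;  // assumption: bytes from the start of the data area
};

// Lays the segments out back to back and records where each one starts.
inline std::vector<SegmentOffset> buildOffsets(const std::vector<SegmentInfo>& segments)
{
  std::vector<SegmentOffset> offsets;
  uint64_t cursor = 0;
  for (const SegmentInfo& s : segments)
  {
    assert(s.entrySize > 0);
    offsets.push_back({s.width, cursor});
    cursor += s.entryCount * s.entrySize;
  }
  return offsets;
}
```

A reader could then seek straight to the segment for a given width without touching the others.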
The suggested change to the Casual Partitioning structure is:
struct EMCasualPartition_struct
{
  int32_t sequenceNum;
  char isValid;      // CP_INVALID: no min/max and no DML in progress. CP_UPDATING: update in progress. CP_VALID: min/max is valid.
  uint8_t keyLen;    // up to 255 bytes
  uint8_t minMax[];  // memcmp-friendly (big-endian for integers) encoding of the min and max prefixes.
                     // The array is 2 * keyLen bytes long: min first, then max.

  // Methods to compute the record size, get the min/max keys, etc. are needed here.
};
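The helper methods mentioned in the last comment might look like the following sketch. It is an illustration under stated assumptions, not the proposed implementation: `CPRecord`, `encodeBigEndian`, `makeRecord` and `mayContain` are hypothetical names, a `std::vector` stands in for the flexible array, only unsigned integers of up to 8 bytes are handled, and the record size is computed as the packed size without padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Writes the low keyLen bytes of v in big-endian order, so that memcmp order
// equals numeric order for unsigned values that fit in keyLen bytes.
inline void encodeBigEndian(uint64_t v, uint8_t* out, uint8_t keyLen)
{
  assert(keyLen <= 8);
  for (uint8_t i = 0; i < keyLen; ++i)
    out[keyLen - 1 - i] = static_cast<uint8_t>(v >> (8 * i));
}

// Hypothetical stand-in for the suggested structure.
struct CPRecord
{
  int32_t sequenceNum;
  char isValid;
  uint8_t keyLen;
  std::vector<uint8_t> minMax;  // 2 * keyLen bytes: min first, then max

  // Packed size of the record, ignoring any padding (assumption).
  std::size_t recordSize() const
  {
    return sizeof(int32_t) + sizeof(char) + sizeof(uint8_t) + minMax.size();
  }
  const uint8_t* min() const { return minMax.data(); }
  const uint8_t* max() const { return minMax.data() + keyLen; }
};

inline CPRecord makeRecord(int32_t sequenceNum, char isValid, uint64_t minV, uint64_t maxV, uint8_t keyLen)
{
  CPRecord r{sequenceNum, isValid, keyLen, std::vector<uint8_t>(2u * keyLen)};
  encodeBigEndian(minV, r.minMax.data(), keyLen);
  encodeBigEndian(maxV, r.minMax.data() + keyLen, keyLen);
  return r;
}

// Range check done purely with memcmp; value must fit in keyLen bytes.
inline bool mayContain(const CPRecord& r, uint64_t value)
{
  uint8_t key[8];
  encodeBigEndian(value, key, r.keyLen);
  return std::memcmp(r.min(), key, r.keyLen) <= 0 && std::memcmp(key, r.max(), r.keyLen) <= 0;
}
```

For example, with a 2-byte key, 255 encodes as `00 FF` and 256 as `01 00`, so `memcmp` orders them correctly. A little-endian encoding (`FF 00` versus `00 01`) would not. Signed integers would additionally need the sign bit flipped before encoding to keep the byte order consistent with the numeric order.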
Issue Links
- is part of MCOL-4343 umbrella for tech debt issues (Open)