[MCOL-3572] Extent Map must have separate linear arrays for different column widths - Jira

Details

Type: New Feature
Status: Closed (View Workflow)
Priority: Major
Resolution: Won't Fix
Affects Version/s: None
Fix Version/s: Icebox
Component/s: PrimProc
Labels:
None

Description

The Extent Map is currently a single linear structure, which does not look appropriate for large data sets.
The suggested design is to break the single linear array of EMEntries into 5 segments that correspond to column widths of 1, 2, 4, 8, and 16 bytes.
ExtentMap will have two additional structures (presumably hashmaps) that will be used to pick the appropriate linear array to work with. The first will map OID+partition+segment to width and the second will map LBID to width. Most of the current EM methods must become dispatchers that find out the width and call the appropriate templated method specialization, which in turn will contain the logic from the current methods.
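
The dispatcher idea described above could look roughly like the following. This is only a minimal sketch under stated assumptions, not the actual ExtentMap code: EMEntrySketch, ExtentMapSketch, addExtent, widthOf and findByLbid are hypothetical names, the entry is cut down to four fields, the LBID lookup matches only the first LBID of an extent rather than a range, and a std::map stands in for the OID+partition+segment hashmap to avoid writing a tuple hash.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>
#include <unordered_map>
#include <vector>

// Hypothetical, cut-down entry; the real EMEntry carries more fields.
struct EMEntrySketch
{
    int64_t firstLbid;
    uint32_t oid;
    uint32_t partition;
    uint16_t segment;
};

class ExtentMapSketch
{
public:
    // Stores the entry in the array for its width and records the width in
    // both lookup structures. Rejects widths other than 1, 2, 4, 8, 16.
    bool addExtent(const EMEntrySketch& e, uint8_t width)
    {
        const int idx = widthIndex(width);
        if (idx < 0)
            return false;
        arrays_[idx].push_back(e);
        widthByColumn_[std::make_tuple(e.oid, e.partition, e.segment)] = width;
        widthByLbid_[e.firstLbid] = width;
        return true;
    }

    // First lookup structure: OID+partition+segment -> width (0 if unknown).
    uint8_t widthOf(uint32_t oid, uint32_t partition, uint16_t segment) const
    {
        auto it = widthByColumn_.find(std::make_tuple(oid, partition, segment));
        return it == widthByColumn_.end() ? 0 : it->second;
    }

    // Dispatcher: finds the width via the LBID map, then calls the templated
    // implementation for that width. The returned pointer is invalidated by
    // later addExtent calls (simplification of this sketch).
    const EMEntrySketch* findByLbid(int64_t lbid) const
    {
        auto it = widthByLbid_.find(lbid);
        if (it == widthByLbid_.end())
            return nullptr;
        switch (it->second)
        {
            case 1: return findByLbidImpl<1>(lbid);
            case 2: return findByLbidImpl<2>(lbid);
            case 4: return findByLbidImpl<4>(lbid);
            case 8: return findByLbidImpl<8>(lbid);
            case 16: return findByLbidImpl<16>(lbid);
            default: return nullptr;
        }
    }

private:
    using ColumnKey = std::tuple<uint32_t, uint32_t, uint16_t>;

    // Maps a width in bytes to the index of its linear array, or -1.
    static int widthIndex(uint8_t width)
    {
        switch (width)
        {
            case 1: return 0;
            case 2: return 1;
            case 4: return 2;
            case 8: return 3;
            case 16: return 4;
            default: return -1;
        }
    }

    // Width-specific logic; only scans the array for width W.
    template <uint8_t W>
    const EMEntrySketch* findByLbidImpl(int64_t lbid) const
    {
        assert(widthIndex(W) >= 0);
        for (const EMEntrySketch& e : arrays_[widthIndex(W)])
            if (e.firstLbid == lbid)
                return &e;
        return nullptr;
    }

    std::vector<EMEntrySketch> arrays_[5];              // one array per width
    std::map<ColumnKey, uint8_t> widthByColumn_;        // OID+partition+segment -> width
    std::unordered_map<int64_t, uint8_t> widthByLbid_;  // LBID -> width
};
```

With this shape, a lookup touches only the array for one width instead of scanning every entry, which is the point of the split.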

The patch should take care of:

updating existing on-disk extent map layout on the first load
storing an updated ExtentMap on disk
changes to existing methods

The disk layout of ExtentMap will change. Here is the current layout:

{
   "header":{
      "version":"int",
      "numberOfEMEntries":"int",
      "useless":"int"
   },
   "ExtentMapEntries":[
      {
         "EMEntry1":"string"
      },
      {
         "EMEntry2":"string"
      },
      {
         "EMEntry3":"string"
      }
   ]
}

The suggested layout will be:

{
   "header":{
      "version":"int",
      "offsets":[
         {
            "width_offset_type":"int",
            "width":"int",
            "offset":"int"
         },
         {
            "width_offset_type":"int",
            "width":"int",
            "offset":"int"
         },
         {
            "width_offset_type":"int",
            "width":"int",
            "offset":"int"
         },
         {
            "width_offset_type":"int",
            "width":"int",
            "offset":"int"
         }
      ],
      "data":[
         {
            "WidthSpecificExtentMapEntries":[
               {
                  "EMEntry1":"binary"
               },
               {
                  "EMEntry2":"binary"
               },
               {
                  "EMEntry3":"binary"
               }
            ]
         },
         {
            "WidthSpecificExtentMapEntries":[
               {
                  "EMEntry1":"binary"
               },
               {
                  "EMEntry2":"binary"
               },
               {
                  "EMEntry3":"binary"
               }
            ]
         },
         {
            "WidthSpecificExtentMapEntries":[
               {
                  "EMEntry1":"binary"
               },
               {
                  "EMEntry2":"binary"
               },
               {
                  "EMEntry3":"binary"
               }
            ]
         }
      ]
   }
}
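
As an illustration of how the "offsets" table in the suggested layout might be filled in when the map is stored, here is a minimal sketch. Everything beyond the three field names is an assumption: the ticket does not define "width_offset_type" (it is left as 0 here), "offset" is assumed to be a byte offset from the start of the "data" area, and the per-width record sizes in SegmentInfoSketch are made-up inputs. WidthOffsetSketch, SegmentInfoSketch and buildOffsets are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One entry of the "offsets" table from the suggested layout.
struct WidthOffsetSketch
{
    int32_t widthOffsetType;  // not defined in the ticket; placeholder value 0
    int32_t width;            // column width in bytes
    int32_t offset;           // assumed: byte offset of this width's array in "data"
};

// Hypothetical description of one width-specific array before it is written.
struct SegmentInfoSketch
{
    int32_t width;       // 1, 2, 4, 8 or 16
    int32_t entryCount;  // number of entries in the array
    int32_t recordSize;  // assumed fixed on-disk size of one entry, in bytes
};

// Lays the width-specific arrays out back to back and records where each
// one starts. An empty array gets the same offset as the array after it.
inline std::vector<WidthOffsetSketch> buildOffsets(const std::vector<SegmentInfoSketch>& segs)
{
    std::vector<WidthOffsetSketch> table;
    int32_t offset = 0;
    for (const SegmentInfoSketch& s : segs)
    {
        assert(s.entryCount >= 0 && s.recordSize > 0);
        table.push_back(WidthOffsetSketch{0, s.width, offset});
        offset += s.entryCount * s.recordSize;  // no overflow check in this sketch
    }
    return table;
}
```

A loader could then seek directly to the array for a given width instead of reading every entry in order.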

The suggested change to the Casual Partitioning structure is:

struct EMCasualPartition_struct
{
    int32_t sequenceNum;
    char isValid; // CP_INVALID - no min/max and no DML in progress. CP_UPDATING - update in progress. CP_VALID - min/max is valid.
    uint8_t keyLen; // up to 255 bytes (the maximum value of uint8_t)
    uint8_t minMax[]; // memcmp-friendly (big-endian for integers) encoding of min and max prefixes.
                      // The array is 2 * keyLen bytes long: min comes first, then max.
    // here we need methods to compute record size, get the min/max keys, etc.
};
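
To show why a big-endian encoding is "memcmp-friendly", here is a minimal sketch for unsigned 32-bit values. It assumes keyLen = 4 and uses a std::vector in place of the flexible minMax[] member; encodeBigEndian, makeMinMax and inRange are hypothetical helpers, not existing ColumnStore functions. Signed integers would additionally need the sign bit flipped before encoding so that negative values sort first.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Writes v most significant byte first, so that byte-wise comparison of two
// encoded values gives the same order as comparing the numbers themselves.
inline void encodeBigEndian(uint32_t v, uint8_t* out)
{
    out[0] = static_cast<uint8_t>(v >> 24);
    out[1] = static_cast<uint8_t>(v >> 16);
    out[2] = static_cast<uint8_t>(v >> 8);
    out[3] = static_cast<uint8_t>(v);
}

// Builds a buffer of 2 * keyLen bytes (keyLen = 4 here): min first, then max.
inline std::vector<uint8_t> makeMinMax(uint32_t minValue, uint32_t maxValue)
{
    std::vector<uint8_t> buf(8);
    encodeBigEndian(minValue, buf.data());
    encodeBigEndian(maxValue, buf.data() + 4);
    return buf;
}

// Returns true if the encoded key lies within [min, max], using memcmp only.
inline bool inRange(const std::vector<uint8_t>& minMax, uint8_t keyLen, const uint8_t* key)
{
    assert(minMax.size() == 2u * keyLen);
    const uint8_t* minKey = minMax.data();
    const uint8_t* maxKey = minKey + keyLen;
    return std::memcmp(minKey, key, keyLen) <= 0 && std::memcmp(key, maxKey, keyLen) <= 0;
}
```

With a little-endian encoding the same memcmp test would misorder values such as 255 and 256, because the least significant byte would be compared first.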

Issue Links

is part of

MCOL-4343 umbrella for tech debt issues

Open

People

Assignee: Todd Stoffel (Inactive)

Reporter: Andrew Hutchings (Inactive)

Votes: 0

Watchers: 1

Dates

Created: 2019-10-24 12:41

Updated: 2022-11-18 14:33

Resolved: 2022-11-18 14:33
