  1. MariaDB ColumnStore
  2. MCOL-4791

Fix ColumnCommand fudged data type format to clearly identify CHAR vs VARCHAR


    Details

    • Type: Task
    • Status: Stalled
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 6.1.1
    • Fix Version/s: 6.3.1
    • Component/s: ExeMgr, PrimProc
    • Labels:
      None

      Description

      Under the terms of MCOL-4691, we're going to replace the 0-terminated representation of short VARCHAR columns in the RowGroup format with:

      • One byte length
      • Followed by the actual string data

      This will remove a lot of strnlen() calls used e.g. in the row aggregation code.
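
To illustrate why the length prefix helps, here is a minimal C++ sketch that reads a short string from a fixed-width slot in both layouts. The helper names (readZeroTerminated, readLengthPrefixed, writeLengthPrefixed) and the slot-buffer interface are assumptions made for this example, not actual ColumnStore code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string.h>  // strnlen (POSIX), memcpy
#include <string>

// Hypothetical helpers for illustration only; not ColumnStore code.

// Current layout: the value is 0-terminated inside a fixed-width slot,
// so every read has to scan for the terminator.
inline std::string readZeroTerminated(const uint8_t* slot, size_t slotWidth)
{
  const char* p = reinterpret_cast<const char*>(slot);
  return std::string(p, strnlen(p, slotWidth));
}

// Proposed layout: one length byte, followed by the actual string data.
inline void writeLengthPrefixed(uint8_t* slot, size_t slotWidth, const std::string& value)
{
  assert(value.size() <= 255 && value.size() + 1 <= slotWidth);
  slot[0] = static_cast<uint8_t>(value.size());
  memcpy(slot + 1, value.data(), value.size());
}

// Reading the proposed layout needs no scan: the length is already known.
inline std::string readLengthPrefixed(const uint8_t* slot, size_t slotWidth)
{
  const size_t len = slot[0];
  assert(len + 1 <= slotWidth);
  return std::string(reinterpret_cast<const char*>(slot + 1), len);
}
```

In the second reader the length is available in constant time, which is the property that should make the strnlen() calls unnecessary.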

      To make the format change easier, we need to clearly distinguish CHAR from VARCHAR on the PrimProc side.

      Currently this is not possible, because ExeMgr sends the data type in ColumnCommand in a "fudged" format as follows:

      ExeMgr Real Type PrimProc Fudged Type  PrimProc isDict
      ---------------- --------------------  ---------------
      VARCHAR(1)       VARCHAR(2)            false
      VARCHAR(2)       VARCHAR(4)            false
      VARCHAR(3)       VARCHAR(4)            false
      VARCHAR(4)       CHAR(8)               false
      VARCHAR(5)       CHAR(8)               false
      VARCHAR(6)       CHAR(8)               false
      VARCHAR(7)       CHAR(8)               false
      VARCHAR(8)       VARCHAR(8)            true
      VARCHAR(9)       VARCHAR(8)            true
      VARCHAR(255)     VARCHAR(8)            true
      VARCHAR(8000)    VARCHAR(8)            true
       
      CHAR(1)          CHAR(1)               false
      CHAR(2)          CHAR(2)               false
      CHAR(3)          CHAR(4)               false
      CHAR(4)          CHAR(4)               false
      CHAR(5)          CHAR(8)               false
      CHAR(6)          CHAR(8)               false
      CHAR(7)          CHAR(8)               false
      CHAR(8)          CHAR(8)               false
      CHAR(9)          VARCHAR(8)            true
      CHAR(255)        VARCHAR(8)            true
      

      The current notation uses VARCHAR(8) to mean "a CHAR or VARCHAR dictionary column", no matter what the original data type is (CHAR or VARCHAR).
      Additionally, VARCHAR(4)..VARCHAR(7) are tweaked when sent: PrimProc sees them as CHAR(8).
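
The current mapping can be written out as a small C++ function, which makes the ambiguity easy to see. This is a sketch reconstructed from the table above only: StrKind, FudgedType, currentFudge and currentIsDict are hypothetical names, and extending the listed sample widths to whole ranges is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration; not the ColumnStore definitions.
enum class StrKind { Char, Varchar };

struct FudgedType
{
  StrKind kind;
  uint32_t width;
  bool isDict;
};

// Current ExeMgr -> PrimProc mapping, as read off the table.
// Widths between the listed sample rows are assumed to follow the same ranges.
inline FudgedType currentFudge(StrKind kind, uint32_t width)
{
  assert(width >= 1);
  if (kind == StrKind::Varchar)
  {
    if (width >= 8) return {StrKind::Varchar, 8, true};   // dictionary column
    if (width >= 4) return {StrKind::Char, 8, false};     // VARCHAR(4)..(7) turn into CHAR(8)
    if (width >= 2) return {StrKind::Varchar, 4, false};
    return {StrKind::Varchar, 2, false};
  }
  if (width >= 9) return {StrKind::Varchar, 8, true};     // CHAR dictionary also turns into VARCHAR(8)
  if (width >= 5) return {StrKind::Char, 8, false};
  if (width >= 3) return {StrKind::Char, 4, false};
  return {StrKind::Char, width, false};
}

// In the current notation VARCHAR(8) is what marks a dictionary column.
inline bool currentIsDict(const FudgedType& t)
{
  return t.kind == StrKind::Varchar && t.width == 8;
}
```

With this mapping, VARCHAR(5) and CHAR(5) both arrive as CHAR(8), and VARCHAR(255) and CHAR(255) both arrive as VARCHAR(8), so the original type cannot be recovered on the receiving side.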

      Under the terms of this task, we'll change the code as follows:

      • PrimProc will see the exact ExeMgr-side data type: true CHAR or true VARCHAR.
      • isDict will be serialized and deserialized (currently it's detected on the PrimProc side by testing the data type against VARCHAR(8)).
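
A minimal sketch of what sending isDict explicitly could look like is shown below. It uses a plain std::vector<uint8_t> as a stand-in for the message buffer; ColumnTypeMsg, serialize, deserialize and the byte layout are assumptions made for this example and do not reflect the real ColumnCommand wire format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical message layout for illustration; not the real wire format.
enum class StrKind : uint8_t { Char = 0, Varchar = 1 };

struct ColumnTypeMsg
{
  StrKind kind;
  uint32_t width;
  bool isDict;
};

// Sender side: write kind, width (little-endian) and an explicit isDict flag.
inline void serialize(const ColumnTypeMsg& m, std::vector<uint8_t>& out)
{
  out.push_back(static_cast<uint8_t>(m.kind));
  for (int shift = 0; shift < 32; shift += 8)
    out.push_back(static_cast<uint8_t>((m.width >> shift) & 0xFF));
  out.push_back(m.isDict ? 1 : 0);
}

// Receiver side: read the flag back instead of inferring it from the type.
inline ColumnTypeMsg deserialize(const std::vector<uint8_t>& in, size_t& pos)
{
  assert(pos + 6 <= in.size());
  ColumnTypeMsg m;
  m.kind = static_cast<StrKind>(in[pos++]);
  m.width = 0;
  for (int shift = 0; shift < 32; shift += 8)
    m.width |= static_cast<uint32_t>(in[pos++]) << shift;
  m.isDict = in[pos++] != 0;
  return m;
}
```

Because the flag travels with the message, a combination such as CHAR(8) with isDict set can be expressed, which the VARCHAR(8) test alone cannot represent.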

      The new fudged data type mapping will look as follows:

      ExeMgr Real Type PrimProc Fudged Type  PrimProc isDict
      ---------------- --------------------  ---------------
      VARCHAR(1)       VARCHAR(2)            false
      VARCHAR(2)       VARCHAR(4)            false
      VARCHAR(3)       VARCHAR(4)            false
      VARCHAR(4)       VARCHAR(8)            false
      VARCHAR(5)       VARCHAR(8)            false
      VARCHAR(6)       VARCHAR(8)            false
      VARCHAR(7)       VARCHAR(8)            false
      VARCHAR(8)       VARCHAR(8)            true
      VARCHAR(9)       VARCHAR(8)            true
      VARCHAR(255)     VARCHAR(8)            true
      VARCHAR(8000)    VARCHAR(8)            true
       
      CHAR(1)          CHAR(1)               false
      CHAR(2)          CHAR(2)               false
      CHAR(3)          CHAR(4)               false
      CHAR(4)          CHAR(4)               false
      CHAR(5)          CHAR(8)               false
      CHAR(6)          CHAR(8)               false
      CHAR(7)          CHAR(8)               false
      CHAR(8)          CHAR(8)               false
      CHAR(9)          CHAR(8)               true
      CHAR(255)        CHAR(8)               true
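
The new mapping can be sketched in C++ in the same way. The names StrKind, FudgedType and newFudge are hypothetical, and treating the listed sample widths as ranges is an assumption taken from the table, not from the actual patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration; not the ColumnStore definitions.
enum class StrKind { Char, Varchar };

struct FudgedType
{
  StrKind kind;
  uint32_t width;
  bool isDict;
};

// New ExeMgr -> PrimProc mapping, as read off the table: the width is still
// rounded, but the CHAR/VARCHAR kind is always preserved.
inline FudgedType newFudge(StrKind kind, uint32_t width)
{
  assert(width >= 1);
  if (kind == StrKind::Varchar)
  {
    if (width >= 8) return {kind, 8, true};   // dictionary column
    if (width >= 4) return {kind, 8, false};  // stays VARCHAR, no longer CHAR(8)
    if (width >= 2) return {kind, 4, false};
    return {kind, 2, false};
  }
  if (width >= 9) return {kind, 8, true};     // CHAR dictionary stays CHAR
  if (width >= 5) return {kind, 8, false};
  if (width >= 3) return {kind, 4, false};
  return {kind, width, false};
}
```

Note that VARCHAR(5) and VARCHAR(255) now both map to VARCHAR(8) and differ only in isDict, so the flag can no longer be derived from the type and has to be sent explicitly.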
      


              People

              Assignee:
              bar Alexander Barkov
              Reporter:
              bar Alexander Barkov