  1. MariaDB ColumnStore
  2. MCOL-4753

Performance problem in Typeless join

Details

    • Task
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 6.1.1
    • 6.1.1
    • PrimProc
    • None

    Description

      Joins involving string data types are implemented using a so-called Typeless join.

      The idea is the following:

      • 1. ExeMgr iterates over the rows in the small side RowGroup and packs them into a "Typeless" representation (i.e. the column values are written into a single byte array).
      • 2. ExeMgr sends the Typeless row representations over the network to PrimProc.
      • 3. PrimProc receives the small side Typeless row representations and feeds them into a hash table.
      • 4. PrimProc iterates through the large side RowGroup, converts every row into a Typeless representation again, and looks this Typeless row representation up in the hash table. A large side row is included in the join result set if it is found in the hash table populated by the small side rows.
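
      The four steps above can be sketched roughly as follows. This is a minimal, single-process illustration and not the ColumnStore code: `Row`, `packTypeless`, `buildSmallSide` and `probeLargeSide` are hypothetical names, a row is modelled as a vector of strings, the length-prefixed byte layout is an assumption, and the network transfer of step 2 is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a RowGroup row: one string value per column.
using Row = std::vector<std::string>;

// Pack the join key columns of a row into a single byte array.
// Each value is length-prefixed so that ("ab","c") and ("a","bc") differ.
// (Assumed layout, chosen only for this illustration.)
std::string packTypeless(const Row& row, const std::vector<size_t>& keyCols)
{
    std::string out;
    for (size_t col : keyCols)
    {
        const std::string& v = row[col];
        uint32_t len = static_cast<uint32_t>(v.size());
        out.append(reinterpret_cast<const char*>(&len), sizeof(len));
        out.append(v);
    }
    return out;
}

// Steps 1-3: pack every small side row and feed it into a hash table.
std::unordered_set<std::string> buildSmallSide(const std::vector<Row>& small,
                                               const std::vector<size_t>& keyCols)
{
    std::unordered_set<std::string> table;
    for (const Row& r : small)
        table.insert(packTypeless(r, keyCols));
    return table;
}

// Step 4: every large side row is packed again only to probe the table.
// This per-row conversion is the cost the issue describes.
std::vector<Row> probeLargeSide(const std::unordered_set<std::string>& table,
                                const std::vector<Row>& large,
                                const std::vector<size_t>& keyCols)
{
    std::vector<Row> result;
    for (const Row& r : large)
        if (table.count(packTypeless(r, keyCols)) != 0)
            result.push_back(r);
    return result;
}
```

      In this sketch `probeLargeSide` allocates and fills a temporary byte array for every large side row, which is typically the far larger input.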

      The 4th step is the problem. There is no need to convert the large side from RowGroup/Row format into Typeless format: the Row representation can be used directly.

      The underlying hash and comparison routines should be extended to understand both the Typeless and Row formats, so that PrimProc can:

      • Calculate the hash of the large side row directly on its Row representation
      • Compare the small side Typeless representation directly to the large side Row representation
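
      One way to picture these two routines is the sketch below. It is an illustration under stated assumptions, not the actual PrimProc change: `Row`, `packTypeless`, `hashTypeless`, `hashRow`, `equalTypelessRow` and `TypelessTable` are hypothetical names, the length-prefixed byte layout is assumed, FNV-1a is swapped in as a simple streaming hash, and collation-aware string comparison is ignored. The key point is that `hashRow` feeds the hash the same byte sequence that packing would produce, so both formats land in the same bucket without building a temporary buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a RowGroup row: one string value per column.
using Row = std::vector<std::string>;

const uint64_t kFnvSeed = 14695981039346656037ULL;

// Streaming FNV-1a: hashing chunks one after another gives the same
// result as hashing their concatenation.
inline uint64_t fnv1a(uint64_t h, const void* data, size_t n)
{
    const unsigned char* p = static_cast<const unsigned char*>(data);
    for (size_t i = 0; i < n; ++i)
    {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

// Small side only: pack key columns as length-prefixed values (assumed layout).
std::string packTypeless(const Row& row, const std::vector<size_t>& keyCols)
{
    std::string out;
    for (size_t col : keyCols)
    {
        uint32_t len = static_cast<uint32_t>(row[col].size());
        out.append(reinterpret_cast<const char*>(&len), sizeof(len));
        out.append(row[col]);
    }
    return out;
}

uint64_t hashTypeless(const std::string& key)
{
    return fnv1a(kFnvSeed, key.data(), key.size());
}

// Hash a large side row directly on its Row representation.
uint64_t hashRow(const Row& row, const std::vector<size_t>& keyCols)
{
    uint64_t h = kFnvSeed;
    for (size_t col : keyCols)
    {
        uint32_t len = static_cast<uint32_t>(row[col].size());
        h = fnv1a(h, &len, sizeof(len));
        h = fnv1a(h, row[col].data(), len);
    }
    return h;
}

// Compare a small side Typeless key directly to a large side Row.
bool equalTypelessRow(const std::string& key, const Row& row,
                      const std::vector<size_t>& keyCols)
{
    size_t pos = 0;
    for (size_t col : keyCols)
    {
        const std::string& v = row[col];
        uint32_t len = 0;
        if (pos + sizeof(len) > key.size()) return false;
        std::memcpy(&len, key.data() + pos, sizeof(len));
        pos += sizeof(len);
        if (len != v.size() || pos + len > key.size()) return false;
        if (std::memcmp(key.data() + pos, v.data(), len) != 0) return false;
        pos += len;
    }
    return pos == key.size();
}

// Minimal chained hash table holding Typeless keys, probed with Rows.
struct TypelessTable
{
    std::vector<std::vector<std::string> > buckets;
    explicit TypelessTable(size_t n) : buckets(n) {}

    void insert(const std::string& key)
    {
        buckets[hashTypeless(key) % buckets.size()].push_back(key);
    }

    bool containsRow(const Row& row, const std::vector<size_t>& keyCols) const
    {
        const std::vector<std::string>& b = buckets[hashRow(row, keyCols) % buckets.size()];
        for (size_t i = 0; i < b.size(); ++i)
            if (equalTypelessRow(b[i], row, keyCols)) return true;
        return false;
    }
};
```

      With this shape the small side is still packed once per row (as ExeMgr already does), while the probe loop over the large side calls only `containsRow` and never materialises a Typeless byte array.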

      This change will be made on the PrimProc side. ExeMgr most likely won't change its behaviour in any way.

            People

              Assignee: Unassigned
              Reporter: Alexander Barkov (bar)