[MCOL-4753] Performance problem in Typeless join

Details

Type: Task
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 6.1.1
Fix Version/s: 6.1.1
Component/s: PrimProc
Labels: None

Description

Joins involving string data types are implemented using a so-called Typeless join.

The idea is the following:

1. ExeMgr iterates over the rows in the small side RowGroup and packs each of them into a "Typeless" representation (i.e. the column values written into a single byte array).
2. ExeMgr sends the Typeless row representations over the network to PrimProc.
3. PrimProc receives the small side Typeless row representations and inserts them into a hash table.
4. PrimProc iterates through the large side RowGroup, converts every row into the Typeless representation as well, and looks this Typeless row representation up in the hash table. A large side row is therefore included in the join result set if it is found in the hash table populated by the small side rows.
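
The steps above can be sketched roughly as follows. This is a simplified illustration and not the actual ColumnStore code: `Row`, `packTypeless`, `buildHashTable` and `probe` are hypothetical names, a row is reduced to its string key columns, the network transfer of step 2 is left out, and the length-prefixed layout of the byte array is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a row: only the string join-key columns.
using Row = std::vector<std::string>;

// Steps 1 and 4: pack the key columns into a single byte array.
// Each value is prefixed with its 32-bit length (an assumed layout),
// so that ("ab", "c") and ("a", "bc") produce different keys.
std::string packTypeless(const Row& row)
{
    std::string out;
    for (const std::string& col : row)
    {
        assert(col.size() <= UINT32_MAX);
        const std::uint32_t len = static_cast<std::uint32_t>(col.size());
        out.append(reinterpret_cast<const char*>(&len), sizeof(len));
        out.append(col);
    }
    return out;
}

// Step 3: insert the small side Typeless keys into a hash table.
std::unordered_set<std::string> buildHashTable(const std::vector<Row>& smallSide)
{
    std::unordered_set<std::string> table;
    for (const Row& row : smallSide)
        table.insert(packTypeless(row));
    return table;
}

// Step 4: probe with the large side. Returns the indexes of matching rows.
std::vector<std::size_t> probe(const std::unordered_set<std::string>& table,
                               const std::vector<Row>& largeSide)
{
    std::vector<std::size_t> matches;
    for (std::size_t i = 0; i < largeSide.size(); ++i)
    {
        // Every large side row is copied into a temporary byte array
        // only to be hashed and compared once.
        if (table.count(packTypeless(largeSide[i])) != 0)
            matches.push_back(i);
    }
    return matches;
}
```

In this sketch the call to `packTypeless` inside `probe` allocates and fills a temporary buffer for each large side row, which is the per-row conversion that step 4 describes.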

The 4th step is the problem. There is no need to convert the large side from the RowGroup/Row format into the Typeless format: the conversion costs extra work for every large side row, and the Row representation can be used directly.

The underlying hash and comparison routines should be extended to understand both Typeless and Row formats. So PrimProc can:

1. Calculate the hash of the large side row directly on its Row representation.
2. Compare the small side Typeless representation directly to the large side Row representation.
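
A minimal sketch of what such format-aware hash and comparison routines could look like is given below. It is an illustration under stated assumptions, not the PrimProc implementation: all names (`Row`, `TypelessKey`, `hashRow`, `equalsTypelessRow`, `JoinTable`, `matches`) are hypothetical, the Typeless layout is assumed to be length-prefixed column values, and FNV-1a is used only because it can be fed byte by byte; the ticket does not say which hash function ColumnStore uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <unordered_map>
#include <vector>

using Row = std::vector<std::string>;  // hypothetical: string key columns only
using TypelessKey = std::string;       // assumed layout: [u32 length][bytes] per column

const std::uint64_t kFnvOffset = 14695981039346656037ULL;

// FNV-1a, fed incrementally so the bytes may come from several buffers.
std::uint64_t fnv1a(std::uint64_t h, const void* data, std::size_t n)
{
    const unsigned char* p = static_cast<const unsigned char*>(data);
    for (std::size_t i = 0; i < n; ++i)
    {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

// Stand-in for the packing done for the small side (ExeMgr in the ticket).
TypelessKey packTypeless(const Row& row)
{
    TypelessKey out;
    for (const std::string& col : row)
    {
        assert(col.size() <= UINT32_MAX);
        const std::uint32_t len = static_cast<std::uint32_t>(col.size());
        out.append(reinterpret_cast<const char*>(&len), sizeof(len));
        out.append(col);
    }
    return out;
}

std::uint64_t hashTypeless(const TypelessKey& key)
{
    return fnv1a(kFnvOffset, key.data(), key.size());
}

// Hash computed directly on the Row: it feeds the same byte sequence that
// packTypeless would produce, but never builds the byte array.
std::uint64_t hashRow(const Row& row)
{
    std::uint64_t h = kFnvOffset;
    for (const std::string& col : row)
    {
        assert(col.size() <= UINT32_MAX);
        const std::uint32_t len = static_cast<std::uint32_t>(col.size());
        h = fnv1a(h, &len, sizeof(len));
        h = fnv1a(h, col.data(), col.size());
    }
    return h;
}

// Mixed comparison: walk the Typeless bytes and the Row columns in step.
bool equalsTypelessRow(const TypelessKey& key, const Row& row)
{
    std::size_t pos = 0;
    for (const std::string& col : row)
    {
        std::uint32_t len = 0;
        if (key.size() - pos < sizeof(len))
            return false;
        std::memcpy(&len, key.data() + pos, sizeof(len));
        pos += sizeof(len);
        if (len != col.size() || key.size() - pos < len)
            return false;
        if (std::memcmp(key.data() + pos, col.data(), len) != 0)
            return false;
        pos += len;
    }
    return pos == key.size();
}

// Hash table keyed by the precomputed hash; values are small side keys.
using JoinTable = std::unordered_multimap<std::uint64_t, TypelessKey>;

JoinTable buildHashTable(const std::vector<TypelessKey>& smallSide)
{
    JoinTable table;
    for (const TypelessKey& key : smallSide)
        table.emplace(hashTypeless(key), key);
    return table;
}

// Probe with a large side Row without converting it to Typeless.
bool matches(const JoinTable& table, const Row& largeRow)
{
    const auto range = table.equal_range(hashRow(largeRow));
    for (auto it = range.first; it != range.second; ++it)
        if (equalsTypelessRow(it->second, largeRow))
            return true;
    return false;
}
```

The key property in this sketch is that `hashRow(r)` equals `hashTypeless(packTypeless(r))` for any row `r`, so small side keys and large side rows land in the same bucket, while `equalsTypelessRow` resolves collisions without materializing a Typeless copy of the large side row.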

This change will be made on the PrimProc side. ExeMgr most likely won't change its behaviour in any way.

Issue Links

relates to

MCOL-4173 Support JOINs wide-DECIMAL keys. (Closed)

MCOL-4755 Allow joins on all numeric data type pairs (Open)

People

Assignee: Unassigned

Reporter: Alexander Barkov

Votes: 0

Watchers: 1

Dates

Created: 2021-06-08 18:13

Updated: 2021-06-10 15:40

Resolved: 2021-06-10 15:40
