[MCOL-4753] Performance problem in Typeless join Created: 2021-06-08 Updated: 2021-06-10 Resolved: 2021-06-10 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | PrimProc |
| Affects Version/s: | 6.1.1 |
| Fix Version/s: | 6.1.1 |
| Type: | Task | Priority: | Major |
| Reporter: | Alexander Barkov | Assignee: | Unassigned |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Description |
|
Joins involving string data types are implemented using a so-called Typeless join. The idea is the following:
The 4th step is the problem. There is no point in converting the large side from RowGroup/Row format into Typeless format, because the Row representation can be used directly. The underlying hash and comparison routines should be extended to understand both the Typeless and Row formats. So PrimProc can:
This change will be made on the PrimProc side. ExeMgr most likely won't change its behaviour in any way. |
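A minimal sketch of the idea of hashing and comparing both formats without converting the large side. It is not the ColumnStore implementation: `Row`, `serializeKey`, `hashTypeless`, `hashRow`, `equalsTypelessRow` and `TypelessHashTable` are hypothetical names, rows are modelled as vectors of strings, the Typeless key is assumed to be the key columns concatenated with a `'\0'` after each one (so strings are assumed to contain no embedded NUL), and FNV-1a stands in for whatever hash PrimProc actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a RowGroup row: one string per column.
using Row = std::vector<std::string>;

// Small side only: serialize the key columns once into a flat "Typeless" buffer.
// Assumed layout: each key column followed by a '\0' terminator.
std::string serializeKey(const Row& row, const std::vector<std::size_t>& keyCols) {
  std::string out;
  for (std::size_t c : keyCols) {
    assert(c < row.size());
    out.append(row[c]);
    out.push_back('\0');
  }
  return out;
}

// FNV-1a, 64-bit; used here only as a simple byte-wise hash.
const std::uint64_t kFnvOffset = 14695981039346656037ULL;
const std::uint64_t kFnvPrime = 1099511628211ULL;
inline std::uint64_t fnvStep(std::uint64_t h, unsigned char b) { return (h ^ b) * kFnvPrime; }

// Hash of an already serialized Typeless key.
std::uint64_t hashTypeless(const std::string& key) {
  std::uint64_t h = kFnvOffset;
  for (unsigned char b : key) h = fnvStep(h, b);
  return h;
}

// Hash of a row's key columns, computed in place. It feeds the same byte
// sequence as serializeKey() would produce, so both hashes agree.
std::uint64_t hashRow(const Row& row, const std::vector<std::size_t>& keyCols) {
  std::uint64_t h = kFnvOffset;
  for (std::size_t c : keyCols) {
    assert(c < row.size());
    for (unsigned char b : row[c]) h = fnvStep(h, b);
    h = fnvStep(h, 0);  // the '\0' terminator
  }
  return h;
}

// Mixed-format comparison: a serialized key against a row, with no temporary buffer.
bool equalsTypelessRow(const std::string& key, const Row& row,
                       const std::vector<std::size_t>& keyCols) {
  std::size_t pos = 0;
  for (std::size_t c : keyCols) {
    assert(c < row.size());
    const std::string& col = row[c];
    if (key.size() - pos < col.size() + 1) return false;
    if (key.compare(pos, col.size(), col) != 0) return false;
    if (key[pos + col.size()] != '\0') return false;
    pos += col.size() + 1;
  }
  return pos == key.size();
}

// Toy chained hash table: stores serialized small-side keys, probes with large-side rows.
class TypelessHashTable {
 public:
  explicit TypelessHashTable(std::size_t bucketCount) : buckets_(bucketCount) {
    assert(bucketCount > 0);
  }

  void insertSmallSideRow(const Row& row, const std::vector<std::size_t>& keyCols) {
    std::string key = serializeKey(row, keyCols);
    std::size_t b = hashTypeless(key) % buckets_.size();
    buckets_[b].push_back(std::move(key));
  }

  // Counts small-side keys equal to the large-side row's key columns.
  std::size_t countMatches(const Row& largeRow, const std::vector<std::size_t>& keyCols) const {
    const std::vector<std::string>& bucket = buckets_[hashRow(largeRow, keyCols) % buckets_.size()];
    std::size_t n = 0;
    for (const std::string& key : bucket)
      if (equalsTypelessRow(key, largeRow, keyCols)) ++n;
    return n;
  }

 private:
  std::vector<std::vector<std::string>> buckets_;
};
```

In this sketch only the small side pays for serialization; each large-side row is hashed and compared in place, which is the saving the description is after.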
| Comments |
| Comment by Alexander Barkov [ 2021-06-10 ] |
|
A script to measure the performance:
|
| Comment by Alexander Barkov [ 2021-06-10 ] |
|
The patch in PR#1983 demonstrates the following performance improvement:
The SELECT query was tested using sysbench with this command:
|