Details
-
Bug
-
Status: Closed (View Workflow)
-
Minor
-
Resolution: Won't Do
-
None
Description
CS uses Volcano processing approach working on one value at a time. This is very inefficient way for analytics workload that usually uses lots of aggregation functions in projections, filtering or sorting.
We are interested in using JIT for basic aggregation functions: sum, avg, count, min, max. The patch must compile and run a program that processes and returns the aggregation function result. We were written this description having LLVM in mind as it is widespread and has a lots of examples in the wild.
I suggest to start looking at RowAggregation::addRowGroup() from ./utils/rowgroup/rowaggregation.cpp to get what it takmakees to get avg() function result.
Here is the link on how to build fast a CS developer environment.