
Type: Task

Status: Closed (View Workflow)

Priority: Major

Resolution: Fixed

Fix Version/s: 10.2.2

Component/s: Optimizer  Window functions

Labels:None

Sprint:10.2.05, 10.2.06, 10.2.07, 10.2.08, 10.2.21, 10.2.22, 10.2.23, 10.2.24
There is a class of window functions that can be computed on the fly, after ordering. These functions are:
rank(DONE)dense_rank(DONE)row_number(DONE)first_value (this is framebased)(DONE)
These functions can be computed directly. In order to do this we must:
 Sort the rows.
 Detect partition boundaries (on the fly as well)
 Given partition boundaries, compute the corresponding function value, for each row.
Twopass window functions:
percent_rank (test implementation needs review)cume_dist (seeMDEV9746)ntilenth_value (this is framebased)(DONE)last_value (this is framebased)(DONE)
these require two passes over partition to compute. The extra information that we require is the number of rows in the partition. In order to find the number of rows, we must first detect partition boundaries and then we can compute the number of rows per partition.
Twocursor window functions:
lag(DONE)lead(DONE)
these require an additional cursor that is traveling n rows ahead/behind the current_row cursor.
Key pieces for implementing this:
 Make use of the filesort interface, since we do not need temporary tables for these functions.
 It is a very similar use case to the GROUP BY statement. An important task is figuring out partition boundaries. The classes used for computing GROUP BY, might prove useful.
 relates to

MDEV6115 window functions as in the SQL standard
 Closed