[MDEV-8091] Simple window functions Created: 2015-05-02  Updated: 2017-04-06  Resolved: 2016-09-24

Status: Closed
Project: MariaDB Server
Component/s: Optimizer - Window functions
Fix Version/s: 10.2.2

Type: Task Priority: Major
Reporter: Vicențiu Ciorbaru Assignee: Vicențiu Ciorbaru
Resolution: Fixed Votes: 9
Labels: None

Issue Links:
Relates
relates to MDEV-6115 window functions as in the SQL standard Closed
Sub-Tasks:
Key
Summary
Type
Status
Assignee
MDEV-9695 Wrong window frame when using RANGE B... Technical task Closed Sergei Petrunia  
Sprint: 10.2.0-5, 10.2.0-6, 10.2.0-7, 10.2.0-8, 10.2.2-1, 10.2.2-2, 10.2.2-3, 10.2.2-4

 Description   

There is a class of window functions that can be computed on the fly, after ordering. These functions are:

  • rank (DONE)
  • dense_rank (DONE)
  • row_number (DONE)
  • first_value (this is frame-based) (DONE)

These functions can be computed directly. In order to do this we must:

  1. Sort the rows.
  2. Detect partition boundaries (on the fly as well)
  3. Given partition boundaries, compute the corresponding function value, for each row.

Two-pass window functions:

  • percent_rank (test implementation needs review)
  • cume_dist (see MDEV-9746)
  • ntile
  • nth_value (this is frame-based) (DONE)
  • last_value (this is frame-based) (DONE)
    these require two passes over partition to compute. The extra information that we require is the number of rows in the partition. In order to find the number of rows, we must first detect partition boundaries and then we can compute the number of rows per partition.

Two-cursor window functions:

  • lag (DONE)
  • lead (DONE)
    these require an additional cursor that is traveling n rows ahead/behind the current_row cursor.

Key pieces for implementing this:

  • Make use of the filesort interface, since we do not need temporary tables for these functions.
  • It is a very similar use case to the GROUP BY statement. An important task is figuring out partition boundaries. The classes used for computing GROUP BY, might prove useful.


 Comments   
Comment by Vicențiu Ciorbaru [ 2015-05-02 ]

psergeyigorsanja

Comment by Sergei Petrunia [ 2015-11-26 ]

Overview of the solution sketch that was pushed into 10.1-window branch:

JOIN::exec has a piece of code that detects that

  • the select uses one table
  • all windowing functions have the same ORDER BY clause
  • all windowing functions allow for streaming computation

if this is the case

  • it runs filesort() to sort the source table in the required ordering
  • then, end_send() has a code that calls func->advance_window() for
    all window function items

then

+void Item_window_func::advance_window() {
+  int changed = test_if_group_changed(partition_fields);
+
+  if (changed > -1) {
+    window_func->clear();
+  }
+  window_func->add();
+}

and this computes the window function. It is done on the fly.

Comment by Juan Telleria [ 2017-04-06 ]

¿Is possible to use Window Functions over columns which contain Aggregate Functions (For example: count(Column_Name)?

SELECT
count(Column_Name) AS MyCount,
PERCENT_RANK() OVER (MyCount)
FROM
myTable;

Generated at Thu Feb 08 07:24:34 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.