Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-9676

RANGE-type frames for window functions



    • Sprint:


      This is about supporting RANGE-type frames for window functions.

      By support, I mean being able to provide Frame_xxxx range bound classes which are able to "follow" the current row. When the current row moves, range bounds move accordingly, calling sum_item->add() or sum_item->remove() for the rows that go in or out of the frame.

      Bound types

      • UNBOUNDED FOLLOWING and UNBOUNDED PRECEDING have the same meaning as with ROWS-type frames.
      • CURRENT ROW has different meaning
      • n PRECEDING|FOLLOWING have different meanings


      Location of the frame bound depends on the ordering parameter value of the current row.
      We need to be able to do addition/subtraction on the current row value.

      The declared type of SK shall be numeric, datetime, or interval.
      The declared type of UVS shall be numeric if the declared type of SK is numeric;
      otherwise, it shall be an interval type that may be added to or subtracted from the declared
      type of SK according to the Syntax Rules of Subclause 6.30, “<datetime value expression>”,
      and Subclause 6.32, “<interval value expression>”...

      Q: is numeric just INT_RESULT in MySQL terms, or REAL_RESULT and DECIMAL_RESULT are allowed also?
      A: yes, DECIMAL and DOUBLE are allowed. Anything that allows addition.

      Q: what should we use for addition? (Items or add manually?)
      A: We must use a different addition operation, depending on the datatypes of constants involved. So, it's better to use Items.

      Q: what is "interval type" in MySQL codebase? Am I correct that date addition/subtraction is only used inside DATE_ADD() ?
      A: Yes. DATE_ADD and DATE_SUB have date[time] INTERVALs as parameters. Intervals are not supported outside these functions.

      RANGE ... CURRENT ROW bound

      The standard draft says for bound #1:

      If WFB1 specifies CURRENT ROW, then remove from WF all rows that are not peers of the current row and that precede the current row in the window ordering defined by WD.

      That is, the cursor should stop as soon as it reaches the first peer of the current row (the first peer itself is not included).

      For bound #2:

      If WFB2 specifies CURRENT ROW, then remove from WF all rows following the current row in the ordering defined by WD that are not peers of the current row.

      That is, the cursor should be ahead of the current row, and stop as soon as it reaches the first non-peer of the current row.

      "Current row" problem

      (this is orthoginal to CURRENT ROW frame bound) Consider


      Suppose, we're moving to compute the sum function value for the next row.

      1. We get the rowid for the next current_row
      2. We need to read current_row
      2.1 We compute top_bound_value = (current_row.col1-10) and this tells us where the first range bound should be.
      2.2 We move the top bound to be at the last row that still has
      row.col1 < top_bound_value, calling SUM->remove() for rows that go out of the frame.
      2.3 We do the same for bottom_bound.
      2.4 Now, we want to update the value for the current_row

      The problem with operation 2.4 is that actions made in steps 2.2 and 2.3 have touched temp.table's handler object.
      If we want to update the current row, we need to read it again.
      We can do that by calling temp_table->file->rnd_pos() but that's one extra row read per each row.


          Issue Links



              psergey Sergei Petrunia
              psergey Sergei Petrunia
              0 Vote for this issue
              2 Start watching this issue