Details
-
Task
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
None
-
10.2.0-9
Description
CUME_DIST is a bit special. It needs to either
1. remember a value for each peer group somewhere,
2. three cursors.
Details:
create table t1 (
|
pk int,
|
a int
|
);
|
|
insert into t1 values
|
(1, 20),
|
(2, 20),
|
(3, 20),
|
(4, 20),
|
(5, 30),
|
(6, 30),
|
(7, 30),
|
(8, 40);
|
select pk,a,cume_dist() over (order by a) from t1;
|
+------+------+
|
| pk | a |
|
+------+------+
|
| 1 | 20 | 0.5
|
| 2 | 20 | 0.5
|
| 3 | 20 | 0.5
|
| 4 | 20 | 0.5
|
| 5 | 30 | 0.875
|
| 6 | 30 | 0.875
|
| 7 | 30 | 0.875
|
| 8 | 40 | 1
|
+------+------+
|
How to compute these?
1. The first pass needs to find how many rows are in the partition.
(Q: BTW, is it true that "any two-pass window function needs the first pass
only to find #rows in the partition"?)
2. On the second pass, we need
2.1 find how many peers are in the peer group (5 rows have a=20)
2.2 compute the value and set it for every peer member.
Possible ways to do step#2:
A: make it in two passes:
- first, let the first row of the group store #values in the group
- second, propagate this value to all group members.
B: Use two cursors.
- The front cursor reaches the end of the peer group, remembers how many values
it saw and then stays there. - The back cursor walks through the peer group and updates the values.
Attachments
Issue Links
- is part of
-
MDEV-6115 window functions as in the SQL standard
- Closed