Details
-
Task
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
10.3.1-1, 10.3.1-2, 10.3.3-1
Description
With CREATE FUNCTION one can create functions in SQL, but this syntax doesn't allow one to create an aggregate function (like SUM, AVG, etc). This task is to add support for aggregate stored functions.
Syntax
is not decided yet. SQL Standard 2003 doesn't support aggregate stored functions (may be the newer standard does). Oracle, PostgreSQL, HSQL (may be more) all implement aggregate stored functions using their own incompatible syntax extensions. SQL Server and DB2 do not support creating new aggregate stored functions in SQL.
The syntax should at least allow for
- the code to return the function value
- the code to be called per row in a group, it accumulates data, does not return a value
- storage that is preserved between calls within a group
and possibly
- code to invoke between groups to reset the storage
- code to remove data from the group (useful for window functions)
Attachments
Issue Links
- relates to
-
MDEV-14520 Custom aggregate functions work incorrectly with WITH ROLLUP clause
-
- Closed
-
-
MDEV-24580 Aggregate stored function return always NULL in case of empty input set
-
- Open
-
-
MCOL-523 support user defined aggregate functions
-
- Closed
-
-
MDEV-11263 Aggregate Window Functions
-
- Stalled
-
-
MDEV-11301 SUGGESTION: Increase the number of Aggregate Functions
-
- Open
-
-
MDEV-14564 support FOR loop in stored aggregate functions
-
- Stalled
-
-
MDEV-28849 I_S tables don't tell that the stored function is AGGREGATE
-
- Open
-
I haven't got anything of significance in a round of tests on bb-10.3-varun 7d8c5ee4039.
Testing of correctness of results is limited due to numerous existing problems in this area, unrelated to the new functionality. I haven't got any functionality-specific crashes.
After fixing
MDEV-14520and addressing all reviews, please go ahead and push to 10.3. If reviews require significant changes in the code, please notify so that I re-run the tests.Please make sure the functionality is documented in the KB, if not done yet.
Among other things, I think it should be specifically mentioned in the documentation that unlike built-in functions, the custom ones don't provide automatic type recognition and act according to the types specified in the function itself. For a simple example, consider the following
delimiter $
begin
loop
begin
loop
delimiter ;
Here, column `i` is integer, column `ci` is text which contains integer values, and column `c` is text.
The in-built function MAX will treat each of them accordingly – the result will be based on integer comparison for `i`, on text comparison for `ci` and `c`.
| 10 | 2 | foo |
Definitions of functions agg_max1 and agg_max2 are identical, except that agg_max1 works with integers and agg_max2 with blobs.
On the same data, agg_max2 will do text comparison for all columns, regardless their initial type
| agg_max2(i) | agg_max2(ci) | agg_max2(c) |
| 2 | 2 | foo |
Function agg_max1 will do integer comparison for anything that looks like integer – in this case, for `i` and for `ci`:
| agg_max1(i) | agg_max1(ci) |
| 10 | 10 |
And for something that doesn't look like integer, it doesn't work at all:
| agg_max1(c) |
| 0 |
Given that functions must define explicit types, I don't see a bug here, but it might be something users don't expect, so better to document it.