Details

    • 10.3.1-1, 10.3.1-2, 10.3.3-1

    Description

      With CREATE FUNCTION one can create functions in SQL, but this syntax doesn't allow one to create an aggregate function (like SUM, AVG, etc). This task is to add support for aggregate stored functions.

      Syntax

      is not decided yet. SQL Standard 2003 doesn't support aggregate stored functions (may be the newer standard does). Oracle, PostgreSQL, HSQL (may be more) all implement aggregate stored functions using their own incompatible syntax extensions. SQL Server and DB2 do not support creating new aggregate stored functions in SQL.

      The syntax should at least allow for

      • the code to return the function value
      • the code to be called per row in a group, it accumulates data, does not return a value
      • storage that is preserved between calls within a group

      and possibly

      • code to invoke between groups to reset the storage
      • code to remove data from the group (useful for window functions)

      Attachments

        Issue Links

          Activity

            bar Alexander Barkov added a comment - - edited

            Good idea to build CREATE TYPE inside CREATE [AGGREGATE] FUNCTION.
            We should probably try to define method bodies inline (without having separate CREATE METHOD statements).
            This would solve the problem with concurrent connections.

            bar Alexander Barkov added a comment - - edited Good idea to build CREATE TYPE inside CREATE [AGGREGATE] FUNCTION. We should probably try to define method bodies inline (without having separate CREATE METHOD statements). This would solve the problem with concurrent connections.

            The current test case seems to corrupt the procs table:

            delimiter |;
             
            create aggregate function agg_sum(x INT) returns double
            begin
            declare z double default 0;
            declare continue handler for not found return z;
             
            loop
            fetch group next row;
            set z = z + x;
            end loop;
            end|
             
            delimiter ;|
             
            create table t1 (id int, salary int);
             
            INSERT INTO t1 VALUES (1, 100), (2, 40), (3, 6);
             
            show create function agg_sum;
            alter function agg_sum aggregate none;
            show create function agg_sum; 
            

            Final query fails with:

            # This fails with 'show create function agg_sum' failed: 1457: Failed to load routine test.agg_sum. The table mysql.proc is missing, corrupt, or contains bad data (internal code -6)
            

            cvicentiu VicenČ›iu Ciorbaru added a comment - The current test case seems to corrupt the procs table: delimiter |;   create aggregate function agg_sum(x INT ) returns double begin declare z double default 0; declare continue handler for not found return z;   loop fetch group next row; set z = z + x; end loop; end |   delimiter ;|   create table t1 (id int , salary int );   INSERT INTO t1 VALUES (1, 100), (2, 40), (3, 6);   show create function agg_sum; alter function agg_sum aggregate none; show create function agg_sum; Final query fails with: # This fails with 'show create function agg_sum' failed: 1457: Failed to load routine test.agg_sum. The table mysql.proc is missing, corrupt, or contains bad data (internal code -6)

            Just a few minor dead-code removal and history cleanup, then we can push.

            cvicentiu Vicențiu Ciorbaru added a comment - Just a few minor dead-code removal and history cleanup, then we can push.

            A final rebase was required, resolved the conflicts and have sent it to cvicentiu for review.
            Have pushed it to bb-10.3-varun for final testing

            varun Varun Gupta (Inactive) added a comment - A final rebase was required, resolved the conflicts and have sent it to cvicentiu for review. Have pushed it to bb-10.3-varun for final testing

            I haven't got anything of significance in a round of tests on bb-10.3-varun 7d8c5ee4039.
            Testing of correctness of results is limited due to numerous existing problems in this area, unrelated to the new functionality. I haven't got any functionality-specific crashes.
            After fixing MDEV-14520 and addressing all reviews, please go ahead and push to 10.3. If reviews require significant changes in the code, please notify so that I re-run the tests.

            Please make sure the functionality is documented in the KB, if not done yet.

            Among other things, I think it should be specifically mentioned in the documentation that unlike built-in functions, the custom ones don't provide automatic type recognition and act according to the types specified in the function itself. For a simple example, consider the following

            create or replace table t1 (i int, c char(8), ci char(8));
            insert into t1 values (10,'foo','10'),(2,'bar','2');
             
            delimiter $
            create or replace aggregate function agg_max1(x INT) returns INT
            begin
              declare m INT default NULL;
              declare continue handler for not found return m;
              loop
               fetch group next row;
               if (x is not null) and (m is null or m < x) then
                  set m= x;
               end if;
              end loop;
            end $
             
            create or replace aggregate function agg_max2(x BLOB) returns BLOB
            begin
              declare m BLOB default NULL;
              declare continue handler for not found return m;
              loop
               fetch group next row;
               if (x is not null) and (m is null or m < x) then
                  set m= x;
               end if;
              end loop;
            end $
            delimiter ;
            

            Here, column `i` is integer, column `ci` is text which contains integer values, and column `c` is text.
            The in-built function MAX will treat each of them accordingly – the result will be based on integer comparison for `i`, on text comparison for `ci` and `c`.

            MariaDB [test]> select max(i), max(ci), max(c) from t1;
            +--------+---------+--------+
            | max(i) | max(ci) | max(c) |
            +--------+---------+--------+
            |     10 | 2       | foo    |
            +--------+---------+--------+
            1 row in set (0.00 sec)
            

            Definitions of functions agg_max1 and agg_max2 are identical, except that agg_max1 works with integers and agg_max2 with blobs.

            On the same data, agg_max2 will do text comparison for all columns, regardless their initial type

            MariaDB [test]> select agg_max2(i), agg_max2(ci), agg_max2(c) from t1;
            +-------------+--------------+-------------+
            | agg_max2(i) | agg_max2(ci) | agg_max2(c) |
            +-------------+--------------+-------------+
            | 2           | 2            | foo         |
            +-------------+--------------+-------------+
            1 row in set (0.00 sec)
            

            Function agg_max1 will do integer comparison for anything that looks like integer – in this case, for `i` and for `ci`:

            MariaDB [test]> select agg_max1(i), agg_max1(ci) from t1;
            +-------------+--------------+
            | agg_max1(i) | agg_max1(ci) |
            +-------------+--------------+
            |          10 |           10 |
            +-------------+--------------+
            1 row in set (0.00 sec)
            

            And for something that doesn't look like integer, it doesn't work at all:

            MariaDB [test]> set sql_mode= 'STRICT_ALL_TABLES';
            Query OK, 0 rows affected (0.02 sec)
             
            MariaDB [test]> select agg_max1(c) from t1;
            ERROR 1366 (22007): Incorrect integer value: 'foo' for column 'x' at row 1
            MariaDB [test]> set sql_mode= '';
            Query OK, 0 rows affected (0.00 sec)
             
            MariaDB [test]> select agg_max1(c) from t1;
            +-------------+
            | agg_max1(c) |
            +-------------+
            |           0 |
            +-------------+
            1 row in set, 3 warnings (0.00 sec)
            

            Given that functions must define explicit types, I don't see a bug here, but it might be something users don't expect, so better to document it.

            elenst Elena Stepanova added a comment - I haven't got anything of significance in a round of tests on bb-10.3-varun 7d8c5ee4039. Testing of correctness of results is limited due to numerous existing problems in this area, unrelated to the new functionality. I haven't got any functionality-specific crashes. After fixing MDEV-14520 and addressing all reviews, please go ahead and push to 10.3. If reviews require significant changes in the code, please notify so that I re-run the tests. Please make sure the functionality is documented in the KB, if not done yet. Among other things, I think it should be specifically mentioned in the documentation that unlike built-in functions, the custom ones don't provide automatic type recognition and act according to the types specified in the function itself. For a simple example, consider the following create or replace table t1 (i int , c char (8), ci char (8)); insert into t1 values (10, 'foo' , '10' ),(2, 'bar' , '2' );   delimiter $ create or replace aggregate function agg_max1(x INT ) returns INT begin declare m INT default NULL ; declare continue handler for not found return m; loop fetch group next row; if (x is not null ) and (m is null or m < x) then set m= x; end if ; end loop; end $   create or replace aggregate function agg_max2(x BLOB) returns BLOB begin declare m BLOB default NULL ; declare continue handler for not found return m; loop fetch group next row; if (x is not null ) and (m is null or m < x) then set m= x; end if ; end loop; end $ delimiter ; Here, column `i` is integer, column `ci` is text which contains integer values, and column `c` is text. The in-built function MAX will treat each of them accordingly – the result will be based on integer comparison for `i` , on text comparison for `ci` and `c` . MariaDB [test]> select max (i), max (ci), max (c) from t1; + --------+---------+--------+ | max (i) | max (ci) | max (c) | + --------+---------+--------+ | 10 | 2 | foo | + --------+---------+--------+ 1 row in set (0.00 sec) Definitions of functions agg_max1 and agg_max2 are identical, except that agg_max1 works with integers and agg_max2 with blobs. On the same data, agg_max2 will do text comparison for all columns, regardless their initial type MariaDB [test]> select agg_max2(i), agg_max2(ci), agg_max2(c) from t1; + -------------+--------------+-------------+ | agg_max2(i) | agg_max2(ci) | agg_max2(c) | + -------------+--------------+-------------+ | 2 | 2 | foo | + -------------+--------------+-------------+ 1 row in set (0.00 sec) Function agg_max1 will do integer comparison for anything that looks like integer – in this case, for `i` and for `ci` : MariaDB [test]> select agg_max1(i), agg_max1(ci) from t1; + -------------+--------------+ | agg_max1(i) | agg_max1(ci) | + -------------+--------------+ | 10 | 10 | + -------------+--------------+ 1 row in set (0.00 sec) And for something that doesn't look like integer, it doesn't work at all: MariaDB [test]> set sql_mode= 'STRICT_ALL_TABLES' ; Query OK, 0 rows affected (0.02 sec)   MariaDB [test]> select agg_max1(c) from t1; ERROR 1366 (22007): Incorrect integer value: 'foo' for column 'x' at row 1 MariaDB [test]> set sql_mode= '' ; Query OK, 0 rows affected (0.00 sec)   MariaDB [test]> select agg_max1(c) from t1; + -------------+ | agg_max1(c) | + -------------+ | 0 | + -------------+ 1 row in set , 3 warnings (0.00 sec) Given that functions must define explicit types, I don't see a bug here, but it might be something users don't expect, so better to document it.

            People

              varun Varun Gupta (Inactive)
              serg Sergei Golubchik
              Votes:
              3 Vote for this issue
              Watchers:
              14 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.