[MDEV-8100] Table functions (aka SQL functions returning tables) Created: 2015-05-04  Updated: 2023-10-20

Status: Stalled
Project: MariaDB Server
Component/s: Data Definition - Procedure
Fix Version/s: None

Type: Task Priority: Major
Reporter: Sergei Petrunia Assignee: Unassigned
Resolution: Unresolved Votes: 3
Labels: gsoc15, gsoc16

Issue Links:
Duplicate
duplicates MDEV-10115 Processing Data Sets Within A stored ... Closed
duplicates MDEV-15077 Insert from 'call proc', returning rows Closed
PartOf
includes MDEV-8342 Support SELECT ... FROM table_func(..... Closed
includes MDEV-8343 Support table function declarations i... Closed
includes MDEV-8344 Table functions: make name resolution... Closed
includes MDEV-8356 Table functions: Make sure CREATE FUN... Closed
includes MDEV-8389 Table functions: create the result ta... Closed
includes MDEV-8649 Table functions: SHOW FUNCTION STATUS... Closed
includes MDEV-8650 Table function: DROP FUNCTION doesn't... Closed
includes MDEV-15457 Pipe function technique Closed
is part of MDEV-10137 Providing compatibility to other data... Open
is part of MDEV-11070 Providing compatibility to other data... Open
Relates
relates to MDEV-5199 Table functions (a.k.a UDF returning ... Open
relates to MDEV-7535 query the metadata of a dynamic column Open

 Description   

This is a part of MDEV-5199 (Table functions) which addresses table functions that are defined in SQL (as opposed to table functions that are implemented in C++ and loaded as .so/.dlls).

This task is a GSOC '15 project of dj not anymore

Proposed syntax

(note: the following differs from description in MDEV-5199, and this is the latest text).

We will use SQL Server's variant of syntax, where CREATE FUNCTION explicitly mentions function name. To define a table function one will use:

CREATE FUNCTION func_name (args)  RETURNS TABLE table_name (column type [, column type ...])

Inside function body, table table_name is visible as a temporary table. The table is initially empty. The code of the function is expected to insert rows into table_name. When the function finishes, the table will be returned to outside as the function's result.

Implementation overview

Table function will create and populate a temporary table (either HEAP or Aria/MyISAM, just like it is done for other kinds of temporary tables). Once the table function has finished populating the temporary table, it will be returned to the statement that invoked the table function. The caller statement will then read the table.

In other words, we will not support any kind of "streaming" for results of table function.

A possible optimization

Trivial functions (with the body of a single INSERT ... SELECT) can be treated as a VIEW. It is not clear whether this should be implemented in the scope of this WL entry.

Other details

Constant parameters

Table functions must have constant parameters. One can't use things like

select * from tbl1,  table_func(tbl1.column) where ...

This is because our implementation will pre-populate table function tables once per query execution. MySQL/MariaDB optimizer doesn't support tables that may change their contents during query execution.

Parser

The parser should support all proposed syntax.
note: the table can only be addressed through an alias. This syntax is not
allowed:

SELECT table_func(1,2).column FROM table_func(1,2)

one must use

SELECT T_ALIAS.column FROM table_func(1,2) as T_ALIAS

Name resolution/preparation in the upper query

When we find a table reference in the FROM clause, we should make
a lookup in mysql.proc, make appropriate checks, then open and parse
the function.
We should be able to take the "RETURNS ...." statement and produce a TABLE
object from it. This is needed for doing name resolution on the other parts
of the "caller" query .

Name resolution/preparation in the Stored Function

When one defines a stored function:

CREATE FUNCTION func_name (args)  RETURNS TABLE table_name (column type [, column type ...])
BEGIN
   stmt1;
   stmt2;
END

All statements inside the body of the function (i.e. stmt1, stmt2, etc) must use a modified name resolution process where table table_name resolves to the table that will be returned by the function. This may shadow the real table with name table_name.

mysql.proc contents

How does table functions should be stored in mysql.proc table. Are they a special kind of

Execution

  • "SELECT sp_func()" should return error for table functions


 Comments   
Comment by Sergei Golubchik [ 2015-05-04 ]

What statements are allowed for a result table_name inside a table function? I would think that SELECT/INSERT/UPDATE/DELETE should certainly be allowed. DROP (and, thus, CREATE OR REPLACE) should not be. TRUNCATE could. CREATE TABLE LIKE could. ALTER TABLE could not. In short, it seems that no DDL statements should be allowed and DML statements should be.

Can this table have indexes? I don't see why not. They serve two purposes — one might need a unique constraint, and the optimizer might find these indexes useful.

Comment by Sergei Petrunia [ 2015-06-19 ]

https://lists.launchpad.net/maria-developers/msg08731.html

Comment by diwas joshi [ 2015-06-21 ]

Making progress on this project we have successfully parsed the input string for both create and select queries. There were some issues relating to shift/reduce conflicts coming up while parsing select queries, which were removed by sergei and me after discussing on it. As part of this both tasks https://mariadb.atlassian.net/browse/MDEV-8342 and https://mariadb.atlassian.net/browse/MDEV-8343 have been completed. We'll be continuing with next set of tasks.

Comment by Sergei Petrunia [ 2015-06-22 ]

The syntax in the task description says:

 CREATE FUNCTION ... RETURNS TABLE table_name (column type [, column type ...])

Actually it is

 CREATE FUNCTION func_name (args)  RETURNS TABLE table_name (column type [, column type ...])

Comment by Sergei Petrunia [ 2015-06-22 ]

Updated the description

Comment by Sergei Petrunia [ 2015-06-27 ]

The tree for this project is here: https://github.com/djdij123/server.git

Comment by Sergei Petrunia [ 2015-11-26 ]

Closed the subtasks as nobody is working on the patch currently. This task itself remains open in case somebody wants to pick it up.

Comment by Alexander Barkov [ 2017-03-14 ]

Please consider implementing tables as a new data type, so one can declare variables of the new TABLE data type and pass those variables as arguments to stored routines. This is needed for the Oracle compatibility project.

Something like this would be nice:

DELIMITER $$
CREATE PROCEDURE p1(t1 TABLE(a INT, b VARCHAR(32))
BEGIN
  SELECT * FROM t1;
  INSERT INTO t2 SELECT * FROM t1;
END;
$$
CREATE PROCEDURE p2
BEGIN
  DECLARE t1 TABLE(a INT, b VARCHAR(32));
  INSERT INTO t1 VALUES (1,'b1');
  CALL p1(t1);
END;
$$
CALL p2();

Comment by Alexander Barkov [ 2017-03-14 ]

Note, the bb-10.2-compatibility tree has a support for the ROW data type. The ROW data type can be used for local variables and parameter.

See "MDEV-10914 ROW data type for stored routine variables" for details, as well as tests in mysql-test/t/sp-row.test.

Adding the TABLE data type can reuse ideas from the patch implementing MDEV-10914.

Note, returning ROW variables from functions is not implemented yet. But this is mostly because of the parser related challenges, as as soon as we return ROW from a function, we need to support this:

SELECT f1().x FROM DUAL;

Otherwise adding ROW as a function return data type is easy. So there should not be any problems to implement TABLE as a function return value as soon as you have TABLE as a variable/parameter data type.

Generated at Thu Feb 08 07:24:38 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.