[MDEV-23886] Reusing CTE inside a function fails with table doesn't exist - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.5.5, 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5
Fix Version/s: 10.6.2, 10.2.39, 10.3.30, 10.4.20, 10.5.11
Component/s: Optimizer - CTE
Labels:
None

Description

When I try to use a single CTE multiple times inside a stored function, the function fails with
ERROR 1146 (42S02): Table 'test.testtab' doesn't exist

DROP TABLE IF EXISTS testtab;

CREATE TABLE testtab (c1 int(11) );

INSERT INTO testtab VALUES (1),(2),(6);

DROP FUNCTION IF EXISTS TESTFUN;

DELIMITER |

CREATE FUNCTION TESTFUN() RETURNS INT DETERMINISTIC BEGIN

RETURN (

    WITH

        cte1 as (select c1 from testtab)

-- here

    SELECT SUM(c1) FROM ( SELECT * FROM cte1 UNION ALL SELECT * FROM cte1 ) as u1

);

END|

DELIMITER ;

SELECT TESTFUN();

Returns -> ERROR 1146 (42S02): Table 'test.testtab' doesn't exist
Expected -> 18

Running the same query without wrapping it in a function works fine:

    WITH

        cte1 as (select c1 from testtab)

-- here

    SELECT SUM(c1) FROM ( SELECT * FROM cte1 UNION ALL SELECT * FROM cte1 ) as u1;

Returns -> 18

The same error occurs when:

CTE is self-joined in a query - inside a stored function
multiple branches of a CASE ... END use it in tests: WHEN EXISTS (SELECT NULL FROM CTE WHERE X=Y) THEN "result 1" WHEN EXISTS ( SELECT NULL FROM CTE ) THEN "result 2" END - inside a stored function

Attachments

Issue Links

blocks

MDEV-24488 Cannot use CTE in trigger definition

Confirmed

is duplicated by

MDEV-25464 Joining CTEs inside a function fails with error code 1146 despite the tables existing

Closed

relates to

MDEV-24019 Assertion is hit for query using recursive CTE with no default DB

Closed

MDEV-25601 MariaDB 10.5 crash when execute query with SET

Closed

MDEV-26202 Unexpected failure with query using indirectly a recursive CTE twice

Closed

Activity

Ascending order - Click to sort in descending order

Szatox created issue - 2020-10-04 15:42

Szatox made changes - 2020-10-04 15:57

Field	Original Value	New Value
Description	When I try to use a single CTE multiple times inside a stored function, the function fails with ERROR 1146 (42S02): Table 'test.testtab' doesn't exist {code}DROP TABLE IF EXISTS testtab; CREATE TABLE testtab (c1 int(11) ); INSERT INTO testtab VALUES (1),(2),(6); DROP FUNCTION IF EXISTS TESTFUN; DELIMITER \| CREATE FUNCTION TESTFUN() RETURNS INT DETERMINISTIC BEGIN RETURN ( WITH cte1 as (select c1 from testtab) -- here SELECT SUM(c1) FROM ( SELECT * FROM cte1 UNION ALL SELECT * FROM cte1 ) as u1 ); END\| DELIMITER ; SELECT TESTFUN();{code} Returns -> ERROR 1146 (42S02): Table 'test.testtab' doesn't exist Expected -> 18 Running the same query _without_ wrapping it in a function works fine: {code} WITH cte1 as (select c1 from testtab) -- here SELECT SUM(c1) FROM ( SELECT * FROM cte1 UNION ALL SELECT * FROM cte1 ) as u1; {code} Returns -> 18	When I try to use a single CTE multiple times inside a stored function, the function fails with ERROR 1146 (42S02): Table 'test.testtab' doesn't exist {code}DROP TABLE IF EXISTS testtab; CREATE TABLE testtab (c1 int(11) ); INSERT INTO testtab VALUES (1),(2),(6); DROP FUNCTION IF EXISTS TESTFUN; DELIMITER \| CREATE FUNCTION TESTFUN() RETURNS INT DETERMINISTIC BEGIN RETURN ( WITH cte1 as (select c1 from testtab) -- here SELECT SUM(c1) FROM ( SELECT * FROM cte1 UNION ALL SELECT * FROM cte1 ) as u1 ); END\| DELIMITER ; SELECT TESTFUN();{code} Returns -> ERROR 1146 (42S02): Table 'test.testtab' doesn't exist Expected -> 18 Running the same query _without_ wrapping it in a function works fine: {code} WITH cte1 as (select c1 from testtab) -- here SELECT SUM(c1) FROM ( SELECT * FROM cte1 UNION ALL SELECT * FROM cte1 ) as u1; {code} Returns -> 18 The same error occurs when: * CTE is self-joined in a query - inside a stored function * multiple branches of a CASE ... END use it in tests: WHEN EXISTS (SELECT NULL FROM CTE WHERE X=Y) THEN "result 1" WHEN EXISTS ( SELECT NULL FROM CTE ) THEN "result 2" END - inside a stored function

Alice Sherepa made changes - 2020-10-05 08:14

Component/s

Optimizer - CTE [ 13513 ]

Alice Sherepa made changes - 2020-10-05 08:41

Affects Version/s		10.2 [ 14601 ]
Affects Version/s		10.3 [ 22126 ]
Affects Version/s		10.4 [ 22408 ]
Affects Version/s		10.5 [ 23123 ]

Alice Sherepa made changes - 2020-10-05 08:41

Fix Version/s		10.2 [ 14601 ]
Fix Version/s		10.3 [ 22126 ]
Fix Version/s		10.4 [ 22408 ]
Fix Version/s		10.5 [ 23123 ]

Alice Sherepa made changes - 2020-10-05 08:41

Status

Open [ 1 ]

Confirmed [ 10101 ]

Alice Sherepa made changes - 2020-10-05 08:41

Assignee

Igor Babaev [ igor ]

Alice Sherepa added a comment - 2020-10-05 08:45

Thanks a lot! Reproducible on 10.2-10.5:

create table t1 (c1 int);

insert into t1 values (1),(2),(6);

create function f1() returns int return

(with cte1 as (select c1 from t1) select sum(c1) from

( select * from cte1 union all select * from cte1) u1);

select f1();

main.1_my                                [ fail ]

        Test ended at 2020-10-05 10:44:13

CURRENT_TEST: main.1_my

mysqltest: At line 9: query 'select f1()' failed: 1146: Table 'test.t1' doesn't exist

Alice Sherepa added a comment - 2020-10-05 08:45 Thanks a lot! Reproducible on 10.2-10.5: create table t1 (c1 int); insert into t1 values (1),(2),(6); create function f1() returns int return (with cte1 as (select c1 from t1) select sum(c1) from ( select * from cte1 union all select * from cte1) u1); select f1(); main.1_my [ fail ] Test ended at 2020-10-05 10:44:13 CURRENT_TEST: main.1_my mysqltest: At line 9: query 'select f1()' failed: 1146: Table 'test.t1' doesn't exist

Igor Babaev (Inactive) made changes - 2020-10-07 04:20

Status

Confirmed [ 10101 ]

In Progress [ 3 ]

Igor Babaev (Inactive) added a comment - 2020-10-08 17:42

This bug practically prevents using queries with more than one reference to the same CTE in stored procedures / stored functions. That's why I marked it as 'Critical'.
The problem is with a special optimization if prelocking mechanism used for SP/SF. This optimization makes impossible to resolve CTE when opening the tables. One of the solution would be elimination of all references to CTE at the very end of the parsing phase for each query rather than in open_and_process_table() as it's done now. This would be a good architectural solution as well. In this case the references to CTE would be substituted for CTE specifications when PREPARE command was executed. Yet this would require a recursive call of the parser when parsing SF/SP.

Igor Babaev (Inactive) added a comment - 2020-10-08 17:42 This bug practically prevents using queries with more than one reference to the same CTE in stored procedures / stored functions. That's why I marked it as 'Critical'. The problem is with a special optimization if prelocking mechanism used for SP/SF. This optimization makes impossible to resolve CTE when opening the tables. One of the solution would be elimination of all references to CTE at the very end of the parsing phase for each query rather than in open_and_process_table() as it's done now. This would be a good architectural solution as well. In this case the references to CTE would be substituted for CTE specifications when PREPARE command was executed. Yet this would require a recursive call of the parser when parsing SF/SP.

Igor Babaev (Inactive) made changes - 2020-10-29 18:26

Priority

Major [ 3 ]

Critical [ 2 ]

Roel Van de Paar made changes - 2020-11-10 22:09

Link

This issue relates to ~~MDEV-24019~~ [ ~~MDEV-24019~~ ]

Alice Sherepa made changes - 2021-04-20 15:39

Link

This issue is duplicated by ~~MDEV-25464~~ [ ~~MDEV-25464~~ ]

Alice Sherepa added a comment - 2021-04-20 15:52

one more test case (~~MDEV-25464~~)

create table t1 (id1 int);

insert into t1 values (1), (2);

create function f1() returns int return

(with cte1 as (select t1.id1 from t1), cte2 as (select 1 from cte1)

	select 1 from cte1 limit 1);

select f1();

Alice Sherepa added a comment - 2021-04-20 15:52 one more test case ( MDEV-25464 ) create table t1 (id1 int); insert into t1 values (1), (2); create function f1() returns int return (with cte1 as (select t1.id1 from t1), cte2 as (select 1 from cte1) select 1 from cte1 limit 1); select f1();

Igor Babaev (Inactive) made changes - 2021-05-05 23:26

Link

This issue blocks MDEV-24488 [ MDEV-24488 ]

Igor Babaev (Inactive) added a comment - 2021-05-06 00:46 - edited

In the current code binding of a table reference to the specification of the corresponding CTE happens in open_and_process_table(). If the table reference is not the first in the query the specification is cloned in the same way as it is cloned for any view reference. This works fine for standalone queries, but does not work for stored procedures / functions for the following reason.
When the first call of a stored procedure/ function SP is processed the body of SP is parsed. When a query of SP is parsed the info on each encountered table reference is put into a TABLE_LIST object linked into a global chain associated with the query. When parsing of the query is finished the basic info on the table references from this chain except table references to derived tables and information schema tables is put in one hash table associated with SP.
When parsing of the body of SP is finished this hash table is used to construct TABLE_LIST objects for all table references mentioned in the table and link them into the list of such objects passed to a pre-locking process that calls open_and_process_table() for each table from the list.
When a TABLE_LIST for a view is encountered the view is opened and its specification is parsed. For any table reference occurred in the specification a new TABLE_LIST object is created to be included into the list for pre-locking. After all objects in the pre-locking have been looked through the tables mentioned in the list are locked. Note that the objects referenced CTEs are just skipped here as it is impossible to resolve these references without any info on the context where they occur. Now the statements from the body of SP are executed one by one that. At the very beginning of the execution of a query the tables used in the query are opened and open_and_process_table() now is called for each table reference mentioned in the list of TABLE_LIST objects associated with the query that was built when the query was parsed.
For each table reference first it is checked against CTEs definitions in whose scope it occurred. If such definition is found the reference is considered resolved and if this is not the first reference to the found CTE the the specification of the CTE is re-parsed and the result of the parsing is added to the parsing tree of the query as a sub-tree. If this sub-tree contains table references to other tables they are added the list of TABLE_LIST objects associated with the query in order the referenced tables to be opened. When the procedure that opens the tables comes to the TABLE_LIST object created for a non-first reference to a CTE it discovers that the referenced table instance is not locked and reports an error.
Thus processing non-first table references to a CTE similar to how references to view are processed does not work for queries used in stored procedures / functions. And the main problem is that the current pre-locking mechanism employed for stored procedures / functions does not allow to save the context in which a CTE reference occur.
It's not trivial to save the info about the context where a CTE reference occurs while the resolution of the table reference cannot be done without this context and consequentially the specification for the table reference cannot be determined.

Igor Babaev (Inactive) added a comment - 2021-05-06 00:46 - edited In the current code binding of a table reference to the specification of the corresponding CTE happens in open_and_process_table(). If the table reference is not the first in the query the specification is cloned in the same way as it is cloned for any view reference. This works fine for standalone queries, but does not work for stored procedures / functions for the following reason. When the first call of a stored procedure/ function SP is processed the body of SP is parsed. When a query of SP is parsed the info on each encountered table reference is put into a TABLE_LIST object linked into a global chain associated with the query. When parsing of the query is finished the basic info on the table references from this chain except table references to derived tables and information schema tables is put in one hash table associated with SP. When parsing of the body of SP is finished this hash table is used to construct TABLE_LIST objects for all table references mentioned in the table and link them into the list of such objects passed to a pre-locking process that calls open_and_process_table() for each table from the list. When a TABLE_LIST for a view is encountered the view is opened and its specification is parsed. For any table reference occurred in the specification a new TABLE_LIST object is created to be included into the list for pre-locking. After all objects in the pre-locking have been looked through the tables mentioned in the list are locked. Note that the objects referenced CTEs are just skipped here as it is impossible to resolve these references without any info on the context where they occur. Now the statements from the body of SP are executed one by one that. At the very beginning of the execution of a query the tables used in the query are opened and open_and_process_table() now is called for each table reference mentioned in the list of TABLE_LIST objects associated with the query that was built when the query was parsed. For each table reference first it is checked against CTEs definitions in whose scope it occurred. If such definition is found the reference is considered resolved and if this is not the first reference to the found CTE the the specification of the CTE is re-parsed and the result of the parsing is added to the parsing tree of the query as a sub-tree. If this sub-tree contains table references to other tables they are added the list of TABLE_LIST objects associated with the query in order the referenced tables to be opened. When the procedure that opens the tables comes to the TABLE_LIST object created for a non-first reference to a CTE it discovers that the referenced table instance is not locked and reports an error. Thus processing non-first table references to a CTE similar to how references to view are processed does not work for queries used in stored procedures / functions. And the main problem is that the current pre-locking mechanism employed for stored procedures / functions does not allow to save the context in which a CTE reference occur. It's not trivial to save the info about the context where a CTE reference occurs while the resolution of the table reference cannot be done without this context and consequentially the specification for the table reference cannot be determined.

Igor Babaev (Inactive) added a comment - 2021-05-10 17:42

Let's see how exactly we come to a bogus error message for a call of the function:

create function f1() returns int return

( with cte1 as (select c1 from t1)

select sum(c1) from

(select * from cte1 as s union all select * from cte1 as t) dt

);

The function contains 4 table references with aliases t1, s,t, dt.
When the query

select f1();

is processed for the first time the function open_and_process_routine() is called that in its turn calls sp_cache_routine() starting parsing of the body of the function f1. The only query of the body is parsed and the chain of 4 TABLE_LIST objects for table references t1,s,t,dt is attached to the query. At the very end of the parsing process for the query the function sp_head::merge_table_list looks through this chain and puts some basic info about the references to t1,s,t into a hash table. When parsing of f1 is finished the control returns to open_and_process_routine() and the function DML_prelocking_strategy::handle_routine is called. The latter invokes the function sp_head::add_used_tables_to_table_list() that looks through the hash table mentioned above and using the info from the elements in table construct a chain of TABLE_LIST objects for t,s,t1 These objects are qualified as placeholders and contain only basic info for locking t,s,t1. Then pre-locking for t,s,t1 is performed. The process tries to open t and fails. Also it fails with opening s. Finally it succeeds with opening t1. Then the control comes to the lock_tables(). The function forms an array of pointers to the TABLE structures for t,s,t1. Only the pointer to t1 is not NULL. After this the function mysql_lock_tables() is called. Now we come to the point where the only query of f1 is processed. The function open_tables() is called for the list of TABLE_LIST objects created for t1,s,t,dt that was attached to the query at the parsing stage. For each TABLE_LIST object the function open_and_process_table() is called. When it's called the TABLE_LIST object created for t the function finds out that t refers to cte1 and it's the second reference to cte1. The parser is called the specification of cte1 that inserts afrer TABLE_LIST object for t a new TABLE_LIST object for the second instance of t1. After this open_and_process_table() is called for this instance. The function sees that the instance has not been locked and an error message is reported.

Igor Babaev (Inactive) added a comment - 2021-05-10 17:42 Let's see how exactly we come to a bogus error message for a call of the function: create function f1() returns int return ( with cte1 as ( select c1 from t1) select sum (c1) from ( select * from cte1 as s union all select * from cte1 as t) dt ); The function contains 4 table references with aliases t1, s,t, dt. When the query select f1(); is processed for the first time the function open_and_process_routine() is called that in its turn calls sp_cache_routine() starting parsing of the body of the function f1. The only query of the body is parsed and the chain of 4 TABLE_LIST objects for table references t1,s,t,dt is attached to the query. At the very end of the parsing process for the query the function sp_head::merge_table_list looks through this chain and puts some basic info about the references to t1,s,t into a hash table. When parsing of f1 is finished the control returns to open_and_process_routine() and the function DML_prelocking_strategy::handle_routine is called. The latter invokes the function sp_head::add_used_tables_to_table_list() that looks through the hash table mentioned above and using the info from the elements in table construct a chain of TABLE_LIST objects for t,s,t1 These objects are qualified as placeholders and contain only basic info for locking t,s,t1. Then pre-locking for t,s,t1 is performed. The process tries to open t and fails. Also it fails with opening s. Finally it succeeds with opening t1. Then the control comes to the lock_tables(). The function forms an array of pointers to the TABLE structures for t,s,t1. Only the pointer to t1 is not NULL. After this the function mysql_lock_tables() is called. Now we come to the point where the only query of f1 is processed. The function open_tables() is called for the list of TABLE_LIST objects created for t1,s,t,dt that was attached to the query at the parsing stage. For each TABLE_LIST object the function open_and_process_table() is called. When it's called the TABLE_LIST object created for t the function finds out that t refers to cte1 and it's the second reference to cte1. The parser is called the specification of cte1 that inserts afrer TABLE_LIST object for t a new TABLE_LIST object for the second instance of t1. After this open_and_process_table() is called for this instance. The function sees that the instance has not been locked and an error message is reported.

Igor Babaev (Inactive) made changes - 2021-05-11 06:11

Assignee	Igor Babaev [ igor ]	Oleksandr Byelkin [ sanja ]
Status	In Progress [ 3 ]	In Review [ 10002 ]

Oleksandr Byelkin added a comment - 2021-05-21 09:14

OK to push

Oleksandr Byelkin added a comment - 2021-05-21 09:14 OK to push

Oleksandr Byelkin made changes - 2021-05-21 09:14

Assignee	Oleksandr Byelkin [ sanja ]	Igor Babaev [ igor ]
Status	In Review [ 10002 ]	Stalled [ 10000 ]

Marko Mäkelä added a comment - 2021-05-24 08:21

igor and sanja, this was apparently pushed to 10.2.

I failed to find a 10.3 or 10.4 version of this. On merge to 10.3 I had to resolve numerous conflicts, but the tests passed for me. On an attempted merge to 10.4, I was less lucky, and 3 tests fail. Please fix.

Marko Mäkelä added a comment - 2021-05-24 08:21 igor and sanja , this was apparently pushed to 10.2. I failed to find a 10.3 or 10.4 version of this. On merge to 10.3 I had to resolve numerous conflicts, but the tests passed for me. On an attempted merge to 10.4 , I was less lucky, and 3 tests fail. Please fix.

Marko Mäkelä added a comment - 2021-05-25 14:05

This was pushed separately to 10.4. It looks like some adjustment for 10.5 is needed too.

Marko Mäkelä added a comment - 2021-05-25 14:05 This was pushed separately to 10.4 . It looks like some adjustment for 10.5 is needed too.

Marko Mäkelä added a comment - 2021-05-26 08:26

This was pushed separately to 10.5 and finally merged to 10.6.
A follow-up fix seems to be necessary. A DBUG_ASSERT(false) statement was removed from the following function, but the comment was not adjusted to say when the member function is expected to be invoked:

double Item_in_subselect::val_real()

/*

    As far as Item_in_subselect called only from Item_in_optimizer this

    method should not be used

*/

Marko Mäkelä added a comment - 2021-05-26 08:26 This was pushed separately to 10.5 and finally merged to 10.6 . A follow-up fix seems to be necessary. A DBUG_ASSERT(false) statement was removed from the following function, but the comment was not adjusted to say when the member function is expected to be invoked: double Item_in_subselect::val_real() { /* As far as Item_in_subselect called only from Item_in_optimizer this method should not be used */

Igor Babaev (Inactive) added a comment - 2021-06-08 06:48

A fix for this bug was pushed into 10.2. Then an adjusted fix was pushed into 10.4. Another variant was pushed into 10.5.

Igor Babaev (Inactive) added a comment - 2021-06-08 06:48 A fix for this bug was pushed into 10.2. Then an adjusted fix was pushed into 10.4. Another variant was pushed into 10.5.

Igor Babaev (Inactive) made changes - 2021-06-08 06:48

Fix Version/s		10.6.2 [ 25800 ]
Fix Version/s		10.2.39 [ 25731 ]
Fix Version/s		10.3.30 [ 25732 ]
Fix Version/s		10.4.20 [ 25733 ]
Fix Version/s		10.5.11 [ 25734 ]
Fix Version/s	10.2 [ 14601 ]
Fix Version/s	10.3 [ 22126 ]
Fix Version/s	10.4 [ 22408 ]
Fix Version/s	10.5 [ 23123 ]
Resolution		Fixed [ 1 ]
Status	Stalled [ 10000 ]	Closed [ 6 ]

Elena Stepanova made changes - 2021-06-09 23:26

Link

This issue relates to ~~MDEV-25601~~ [ ~~MDEV-25601~~ ]

Alice Sherepa made changes - 2021-07-21 16:12

Link

This issue relates to ~~MDEV-26202~~ [ ~~MDEV-26202~~ ]

Alice Sherepa made changes - 2021-10-14 09:31

Link

This issue relates to ~~MDEV-26825~~ [ ~~MDEV-26825~~ ]

Alice Sherepa made changes - 2021-10-14 09:52

Link

This issue relates to ~~MDEV-26825~~ [ ~~MDEV-26825~~ ]

Sergei Golubchik made changes - 2021-12-06 21:52

Workflow

MariaDB v3 [ 114183 ]

MariaDB v4 [ 158437 ]

People

Assignee:: Igor Babaev (Inactive)

Reporter:: Szatox

Votes:: 1 Vote for this issue

Watchers:: 9 Start watching this issue

Dates

Created:: 2020-10-04 15:42

Updated:: 2021-10-14 09:52

Resolved:: 2021-06-08 06:48

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server

Details

Description

Attachments

Issue Links

Activity

People

Dates

Git Integration