[MDEV-29361] Infinite recursive calls when detecting CTE dependencies Created: 2022-08-23  Updated: 2022-12-19  Resolved: 2022-09-29

Status: Closed
Project: MariaDB Server
Component/s: Optimizer - CTE
Affects Version/s: 10.10.0, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10
Fix Version/s: 10.3.37, 10.4.27, 10.5.18, 10.6.11, 10.7.7, 10.8.6, 10.9.4, 10.10.2

Type: Bug Priority: Critical
Reporter: Shihao Wen Assignee: Igor Babaev
Resolution: Fixed Votes: 0
Labels: None

Attachments: HTML File 313_stack    
Issue Links:
Duplicate
is duplicated by MDEV-29358 Server crashed with stack-overflow in... Closed
Problem/Incident
causes MDEV-30248 Infinite sequence of recursive calls ... Closed
Relates
relates to MDEV-10737 Server falls into endless loop in st_... Closed
relates to MDEV-26095 Infinite recursion when processing em... Closed
relates to MDEV-29358 Server crashed with stack-overflow in... Closed

 Description   

output:

SUMMARY: AddressSanitizer: stack-overflow (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x51575)

poc:

CREATE TABLE x ( x BIGINT ) ;
 INSERT INTO x ( x ) VALUES ( 1 ) ;
 UPDATE x SET x = 1 WHERE x = 1 ;
 INSERT INTO x ( x ) VALUES ( 1.000000 ) , ( 1 ) ;
 WITH x AS ( WITH x AS ( SELECT ( x % ( WITH x AS ( SELECT x FROM ( SELECT x FROM x WHERE x = CASE WHEN x * ( SELECT 1 FROM x AS x WHERE x BETWEEN 1.000000 AND 1 WINDOW x AS ( PARTITION BY x ORDER BY ( SELECT x FROM x x HAVING ( TRUE IN ( CASE x WHEN x THEN 'x' ELSE TRUE END != ( ( ( x OR NOT x ) BETWEEN 1 AND 1 ) ) ) ) ) DESC RANGE BETWEEN 1.000000 FOLLOWING AND 1.000000 FOLLOWING ) ) ^ x THEN 'x' ELSE x END / 1 GROUP BY x ) AS x ) SELECT EXISTS ( WITH RECURSIVE x ( x ) AS ( SELECT 1 UNION SELECT 1 - x FROM x LIMIT 1 ) SELECT DISTINCT ( ( NOT ( 1.000000 AND x = 1 ) ) = 1 AND x = 1 ) % 1 , ( x = 1 OR x > FALSE ) WHERE x = 1 AND ( x = 1 OR x = 1 OR x = 1 ) ) , 'x' FROM x WINDOW x AS ( PARTITION BY x ORDER BY x DESC ) ) <= x ) , 1 FROM x ) SELECT x FROM ( SELECT x FROM x GROUP BY x ) AS x ) SELECT x FROM x WHERE x BETWEEN FALSE AND ( ( ( x OR NOT x ) BETWEEN ( ( ( NOT ( ( 1.000000 ^ 1.000000 AND ( ( TRUE , x ) NOT IN ( SELECT ( NOT ( x = CASE 'x' = 'x' WHEN 'x' THEN 'x' WHEN 1 THEN 'x' ELSE 1 END / 1 ) ) , 1 FROM x ) OR x > 'x' ) = 1 ) * NULL ) ) ) ) AND 1.000000 ) ) ;



 Comments   
Comment by Alice Sherepa [ 2022-08-24 ]

Thank you!
I repeated on 10.3-10.10. Also crashes non-debug version, but there is nothing in the error log then. Test case idea similar to MDEV-29358
Please check the initial test case after the fix!

 
CREATE TABLE x (x int);
INSERT INTO x VALUES (1),(2),(3);
 
WITH x AS ( 
WITH x AS ( 
	SELECT ( WITH x AS  ( SELECT ( SELECT 1 FROM x ) FROM x ) SELECT EXISTS ( WITH RECURSIVE x AS ( SELECT 1 FROM x) SELECT 1 )) 
	FROM x) 
SELECT 1 FROM x) 
SELECT ( SELECT x from x ) FROM x;

10.3 c7f8cfc9e733517cff4aaa6f

...................
#3490 0x00005597b594c36c in LEX::resolve_references_to_cte (this=0x62d0001ae490, tables=0x62d0001afc10, tables_last=0x62d0001afc18) at /10.3/src/sql/sql_cte.cc:238
#3491 0x00005597b594faa4 in With_element::clone_parsed_spec (this=0x62900013a018, old_lex=0x629000141de0, with_table=0x6290001436a8) at /10.3/src/sql/sql_cte.cc:1109
#3492 0x00005597b594c36c in LEX::resolve_references_to_cte (this=0x629000141de0, tables=0x6290001436a8, tables_last=0x6290001436b0) at /10.3/src/sql/sql_cte.cc:238
#3493 0x00005597b594faa4 in With_element::clone_parsed_spec (this=0x62900013a018, old_lex=0x629000131a88, with_table=0x629000137eb0) at /10.3/src/sql/sql_cte.cc:1109
#3494 0x00005597b594c36c in LEX::resolve_references_to_cte (this=0x629000131a88, tables=0x629000137608, tables_last=0x629000141768) at /10.3/src/sql/sql_cte.cc:238
#3495 0x00005597b594faa4 in With_element::clone_parsed_spec (this=0x62900012ebf0, old_lex=0x62a0000be060, with_table=0x6290001313f0) at /10.3/src/sql/sql_cte.cc:1109
#3496 0x00005597b594c36c in LEX::resolve_references_to_cte (this=0x62a0000be060, tables=0x62b0000043f8, tables_last=0x6290001313f8) at /10.3/src/sql/sql_cte.cc:238
#3497 0x00005597b594c6b1 in LEX::check_cte_dependencies_and_resolve_references (this=0x62a0000be060) at /10.3/src/sql/sql_cte.cc:280
#3498 0x00005597b59f2bc6 in MYSQLparse (thd=0x62a0000ba270) at /10.3/src/sql/sql_yacc.yy:9257
#3499 0x00005597b543110d in parse_sql (thd=0x62a0000ba270, parser_state=0x7f8232a13860, creation_ctx=0x0, do_pfs_digest=true) at /10.3/src/sql/sql_parse.cc:10204
#3500 0x00005597b5421fc8 in mysql_parse (thd=0x62a0000ba270, rawbuf=0x62b000000290 "WITH x AS \n( \nWITH x AS \n( SELECT ( WITH x AS  ( SELECT ( SELECT 1 FROM x ) FROM x ) \nSELECT EXISTS ( WITH RECURSIVE x AS ( SELECT 1 FROM x", length=217, parser_state=0x7f8232a13860, is_com_multi=false, is_next_command=false) at /10.3/src/sql/sql_parse.cc:7823
#3501 0x00005597b53f9317 in dispatch_command (command=COM_QUERY, thd=0x62a0000ba270, packet=0x629000127271 "WITH x AS \n( \nWITH x AS \n( SELECT ( WITH x AS  ( SELECT ( SELECT 1 FROM x ) FROM x ) \nSELECT EXISTS ( WITH RECURSIVE x AS ( SELECT 1 FROM x) SELECT 1 )) \nFROM x \n) \nSELECT 1 FROM x\n) \nSELECT ( SELECT "..., packet_length=217, is_com_multi=false, is_next_command=false) at /10.3/src/sql/sql_parse.cc:1852
#3502 0x00005597b53f5e5a in do_command (thd=0x62a0000ba270) at /10.3/src/sql/sql_parse.cc:1398
#3503 0x00005597b57c9ee5 in do_handle_one_connection (connect=0x61100004def0) at /10.3/src/sql/sql_connect.cc:1403
#3504 0x00005597b57c979f in handle_one_connection (arg=0x61100004def0) at /10.3/src/sql/sql_connect.cc:1308
#3505 0x00005597b6dfaa17 in pfs_spawn_thread (arg=0x61600000e1f0) at /10.3/src/storage/perfschema/pfs.cc:1869
#3506 0x00007f824924c609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#3507 0x00007f8249171133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Comment by Igor Babaev [ 2022-08-31 ]

Let's have such tables:

create table t1 (a int);
insert into t1 values (4), (5);
create table t2 (a int);
insert into t2 values (6), (8);
create table t3 (a int);
insert into t3 values (1), (9);
create table x (a int);
insert into x values (3), (7), (1), (5), (6);

Let's consider a query with the structure quite similar to the structure of the query in the previous comment that causes the same kind of problem.

with
cte_e as 
( 
  with
  cte as
  ( 
    select ( 
             with
             x as
             (
               select ( select a from x ) from t1
             )
             select exists ( 
                             with recursive
                             x as
                             (
                               select a from t2
                               union
                               select a+2 from x as r2 where a < 10
                             )
                             select a from x
                           ) 
           ) as r
    from t3
  )
  select * from cte
)
select s1.*, s2.*  from cte_e as s1, cte_e as s2;

The problem appears when LEX::check_cte_dependencies_and_resolve_references() is called for the second usage of cte_e aliased as s2. The function first detects dependencies for CTEs used by s2. Here the reference to x in the definition of the second CTE with name x is erroneously bound to the embedding CTE with this name. As a result the former is not detected as recursive CTE. This leads to an infinite sequence of recursive calls of the functions With_element::clone_parsed_spec() and LEX::check_cte_dependencies_and_resolve_references() for the second CTE with name x.

Comment by Igor Babaev [ 2022-08-31 ]

If we remove the second reference to cte1 executing the query

with
cte_e as 
( 
  with
  cte as
  ( 
    select ( 
             with
             x as
             (
               select ( select a from x ) as a from t1
             )
             select exists ( 
                             with recursive
                             x as
                             (
                               select a from t2
                               union
                               select a+2 from x as r2 where a < 10
                             )
                             select a from x
                           ) 
           ) as r
    from t3
  )
  select * from cte
)
select * from cte_e;

we get an error message

ERROR 1242 (21000): Subquery returns more than 1 row

Note that this message concerns the subquery ( select a from x ) used in the first CTE with name x. Yet this CTE is not used anywhere.
An analysis in debugger shows that this happens to a wrong binding of the recursive reference to the second CTE with name x.
The same result we have with the query

with
cte as
( 
  select ( 
           with
           x as
           (
             select ( select a from x ) as a from t1
           )
           select exists ( 
                           with recursive
                           x as
                           (
                             select a from t2
                             union
                             select a+2 from x as r2 where a < 10
                           )
                           select a from x
                         ) 
         ) as r
  from t3
)
select * from cte;

Comment by Igor Babaev [ 2022-08-31 ]

In the above queries the first CTE with name x is not used. In this query it is used:

with
cte as
( 
  select ( 
           with
           x as
           (
             select ( select a from x as r1 ) as a from t1
           )
           select * from x as s1
           where s1.a in ( 
                           with recursive
                           x as
                           (
                             select a from t2
                             union
                             select a+2 from x as r2 where a < 10
                           )
                           select a from x s2
                         ) 
         ) as r
  from t3
)
select * from cte;

Here we have to an infinite sequence of recursive calls of the functions With_element::clone_parsed_spec() and LEX::check_cte_dependencies_and_resolve_references()
for the second CTE with name x.
The same we have for the query

with
cte as
( 
  select ( 
           with
           x as
           (
             select a from x as r1
           )
           select * from x as s1
           where s1.a in ( 
                          with
                          recursive x as
                          (
                            select a from t2
                            union
                            select a+2 from x as r2 where a < 10
                          )
                          select a from x as s2
                        ) 
         ) as r
  from t3
)
select * from cte;

Comment by Igor Babaev [ 2022-08-31 ]

Let's drop table x

drop table x;

and use a recursive CTE as the first CTE with name x

with
cte as
( 
  select ( 
           with recursive
           x(a) as
           (
             select a+3 from t1 union select a+1 from x as r1 where a < 7 
           )
           select * from x as s1
           where s1.a < 8 and 
                 s1.a in ( 
                           with recursive
                           x(a) as
                           (
                             select a-2 from t2
                             union
                             select a+1 from x as r2 where a < 10
                           )
                           select a from x as s2
                         ) 
         ) as r
  from t3
)
select * from cte;

For this query we have:

MariaDB [test]> with
    -> cte as
    -> ( 
    ->   select ( 
    ->            with recursive
    ->            x(a) as
    ->            (
    ->              select a+3 from t1 union select a+1 from x as r1 where a < 7 
    ->            )
    ->            select * from x as s1
    ->            where s1.a < 8 and 
    ->                  s1.a in ( 
    ->                            with recursive
    ->                            x(a) as
    ->                            (
    ->                              select a-2 from t2
    ->                              union
    ->                              select a+1 from x as r2 where a < 10
    ->                            )
    ->                            select a from x as s2
    ->                          ) 
    ->          ) as r
    ->   from t3
    -> )
    -> select * from cte;
+------+
| r    |
+------+
| NULL |
| NULL |
+------+

though we have:

MariaDB [test]>            with recursive
    ->            x(a) as
    ->            (
    ->              select a+3 from t1 union select a+1 from x as r1 where a < 7 
    ->            )
    ->            select * from x as s1
    -> ;
+------+
| a    |
+------+
|    7 |
|    8 |
+------+
 
MariaDB [test]>                            with recursive
    ->                            x(a) as
    ->                            (
    ->                              select a-2 from t2
    ->                              union
    ->                              select a+1 from x as r2 where a < 10
    ->                            )
    ->                            select a from x as s2
    -> ;
+------+
| a    |
+------+
|    4 |
|    6 |
|    5 |
|    7 |
|    8 |
|    9 |
|   10 |
+------+

Apparently the result of the query is not correct

Comment by Igor Babaev [ 2022-08-31 ]

For the query

with
cte as
( 
  with recursive
  x as
  (
    select a from t1 union select a+1 from x as r1 where a < 7 
  )
  select * from x as s1
  where s1.a in ( 
                 with recursive
                 x as
                 (
                   select a from t2
                   union
                   select a+2 from x as r2 where a < 10
                 )
                 select a from x as s2 
               )         
)
select * from cte;

we have

MariaDB [test]> with
    -> cte as
    -> ( 
    ->   with recursive
    ->   x as
    ->   (
    ->     select a from t1 union select a+1 from x as r1 where a < 7 
    ->   )
    ->   select * from x as s1
    ->   where s1.a in ( 
    ->                  with recursive
    ->                  x as
    ->                  (
    ->                    select a from t2
    ->                    union
    ->                    select a+2 from x as r2 where a < 10
    ->                  )
    ->                  select a from x as s2 
    ->                )         
    -> )
    -> select * from cte;
+------+
| a    |
+------+
|    6 |
|    7 |
+------+

This is also not correct.
If for the first CTE with name x we omit specifier RECURSIVE we get an infinite sequence of recursive calls of the functions With_element::clone_parsed_spec() and LEX::check_cte_dependencies_and_resolve_references() instead of a message that table x does not exist.
The result for this query is correct

  with recursive
  x as
  (
    select a from t1 union select a+1 from x as r1 where a < 7 
  )
  select * from x as s1
  where s1.a in ( 
                 with recursive
                 x as
                 (
                   select a from t2
                   union
                   select a+2 from x as r2 where a < 10
                 )
                 select a from x as s2 
               )         
;

MariaDB [test]>   with recursive
    ->   x as
    ->   (
    ->     select a from t1 union select a+1 from x as r1 where a < 7 
    ->   )
    ->   select * from x as s1
    ->   where s1.a in ( 
    ->                  with recursive
    ->                  x as
    ->                  (
    ->                    select a from t2
    ->                    union
    ->                    select a+2 from x as r2 where a < 10
    ->                  )
    ->                  select a from x as s2 
    ->                )         
    -> ;
+------+
| a    |
+------+
|    6 |
+------+

Comment by Igor Babaev [ 2022-09-02 ]

The following query

with
cte as
( 
  with recursive
  x as
  (
    select a from t1 union select a+1 from x as r1 where a < 7 
  )
  select * from x as s1
  where s1.a in ( 
                 with
                 x as
                 (
                   select a from t2
                   union
                   select a+2 from x as r2 where a < 10
                 )
                 select a from x as s2 
               )         
)
select * from cte;

causes a crash with the following stack:

sql/sql_cte.h:225(With_element::get_name())[0x555555eae990]
sql/sql_cte.h:226(With_element::get_name_str())[0x555555eae9b4]
sql/sql_cte.cc:437(With_clause::find_table_def(TABLE_LIST*, With_element*))[0x555555eac7af]
sql/sql_cte.cc:480(find_table_def_in_with_clauses(TABLE_LIST*, st_unit_ctxt_elem*))[0x555555eac88d]
sql/sql_cte.cc:542(With_element::check_dependencies_in_select(st_select_lex*, st_unit_ctxt_elem*, bool, unsigned long long*))[0x555555eaca0a]
sql/sql_cte.cc:651(With_element::check_dependencies_in_unit(st_select_lex_unit*, st_unit_ctxt_elem*, bool, unsigned long long*))[0x555555eacd44]
sql/sql_cte.cc:685(With_element::check_dependencies_in_with_clause(With_clause*, st_unit_ctxt_elem*, bool, unsigned long long*))[0x555555eacdaf]
sql/sql_cte.cc:648(With_element::check_dependencies_in_unit(st_select_lex_unit*, st_unit_ctxt_elem*, bool, unsigned long long*))[0x555555eacce1]
sql/sql_cte.cc:565(With_element::check_dependencies_in_select(st_select_lex*, st_unit_ctxt_elem*, bool, unsigned long long*))[0x555555eacb2d]
sql/sql_cte.cc:406(With_element::check_dependencies_in_spec())[0x555555eac71e]
sql/sql_cte.cc:334(With_clause::check_dependencies())[0x555555eac563]
sql/sql_cte.cc:84(LEX::check_dependencies_in_with_clauses())[0x555555eabead]
sql/sql_cte.cc:276(LEX::check_cte_dependencies_and_resolve_references())[0x555555eac40c]
sql/sql_yacc.yy:9257(MYSQLparse(THD*))[0x555555ef7e9d]
sql/sql_parse.cc:10204(parse_sql(THD*, Parser_state*, Object_creation_ctx*, bool))[0x555555c97a7e]
sql/sql_parse.cc:7823(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x555555c9209c]
sql/sql_parse.cc:1855(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x555555c7ec4a]
sql/sql_parse.cc:1399(do_command(THD*))[0x555555c7d53c]
sql/sql_connect.cc:1403(do_handle_one_connection(CONNECT*))[0x555555dfdd54]
sql/sql_connect.cc:1309(handle_one_connection)[0x555555dfdac3]

Note that in this query x as r2 is supposed to be resolved against the first CTE with name x.
A crash with the same stack is caused by the query

with
cte as
( 
  with
  x as
  (
    select a from t1 union select a+1 from x as r1 where a < 7 
  )
  select * from x as s1
  where s1.a in ( 
                 with
                 x as
                 (
                   select a from t2
                   union
                   select a+2 from x as r2 where a < 10
                 )
                 select a from x as s2 
               )   
)
select * from cte;

Here neither x as r1 nor x as r2 could be resolved because both CTEs with name x belong to non-RECURSIVE with clauses.

Comment by Igor Babaev [ 2022-09-03 ]

The following two queries where the first embedded CTE has the name different from the name of the second CTE cause crashes with the same stack as the queries in the previous comment.

with
cte as
( 
  with recursive
  y as
  (
    select a from t1 union select a+1 from y as r1 where a < 7 
  )
  select * from y as s1
  where s1.a in ( 
                 with
                 x as
                 (
                   select a from t2
                   union
                   select a+2 from x as r2 where a < 10
                 )
                 select a from x as s2 
               )         
)
select * from cte;
 
with
cte as
( 
  with
  y(a) as
  (
    select a+5 from t1 
  )
  select * from y as s1
  where s1.a in ( 
                 with
                 x as
                 (
                   select a from t2
                   union
                   select a+2 from x as r2 where a < 10
                 )
                 select a from x as s2 
               )         
)
select * from cte;

Note that in both queries CTE with name x is defined in with clauses without RECURSIVE specifier. So x as r2 cannot be resolved as a reference to CTE and the message

ERROR 42S02: Table 'test.x' doesn't exist

is expected.
If we add this specifier to the with clauses containing the CTE with name x then this CTE is taken as a recursive CTE. In this case the server returns expected result sets.

Comment by Oleksandr Byelkin [ 2022-09-28 ]

OK to push.

Comment by Igor Babaev [ 2022-09-29 ]

After the fix the reported query returns a syntax error. The query suggested by alice returns the following error:

MariaDB [test]> WITH x AS ( 
    -> 
    -> WITH x AS ( 
    -> 
    -> SELECT ( WITH x AS  ( SELECT ( SELECT 1 FROM x ) FROM x ) SELECT EXISTS ( WITH RECURSIVE x AS ( SELECT 1 FROM x) SELECT 1 )) 
    -> 
    -> FROM x) 
    -> 
    -> SELECT 1 FROM x) 
    -> 
    -> SELECT ( SELECT x from x ) FROM x;
ERROR 4005 (HY000): No anchors for recursive WITH element 'x'

that makes a perfect sense.

Comment by Igor Babaev [ 2022-09-29 ]

A fix for the bug was pushed into 10.3. it has to be merged upstream as it is.

Generated at Thu Feb 08 10:07:57 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.