[MDEV-13428] Oracle-compatible recursive queries with CONNECT BY Created: 2017-07-24  Updated: 2020-11-02

Status: Open
Project: MariaDB Server
Component/s: None
Fix Version/s: None

Type: Task Priority: Major
Reporter: Sergei Petrunia Assignee: Ralf Gebhardt
Resolution: Unresolved Votes: 2
Labels: Compatibility

Issue Links:
PartOf
includes MDEV-13473 CONNECT BY: Does Oracle use depth-fir... Closed
includes MDEV-13479 CONNECT BY: Parser support Open
includes MDEV-13482 CONNECT BY computation algorithm Open
includes MDEV-13502 [DRAFT] Support external references i... Open
includes MDEV-13587 CONNECT BY: Post-parse data structures Open
includes MDEV-13664 CONNECT BY dataset and where clause e... Open
includes MDEV-13681 CONNECT BY to CTE mapping Open

 Description   

Oracle syntax

[ START WITH start_cond ] CONNECT BY [NOCYCLE] connect_cond

Also

  • Expressions in the connect_cond above may use PRIOR expr operator
  • Expressions other than START WITH may use CONNECT_BY_ROOT operator.
  • LEVEL, CONNECT_BY_ISCYCLE, CONNECT_BY_ISLEAF pseudo-columns may be used.
  • SYS_CONNECT_BY_PATH() function may be used

Changes need by the parser

Oracle execution algorithm

See MDEV-13482

How to rewrite an Oracle-style CTE into WITH-style CTE

  SELECT select_list
  FROM table_expr
  WHERE where_cond
  START WITH start_cond
  CONNECT BY connect_expr
  [GROUP BY group_by_expr]

is rewritten into:

  WITH rewritten_cte AS 
  (
    (
      SELECT factored_select_list 
      FROM table_expr
      WHERE <start_cond>
    )
    UNION ALL // TODO: it's a special kind of union that doesn't allow loops.
    (
      SELECT rewritten_cte.* 
      FROM rewritten_cte, table_expr
      WHERE 
        connect_expr
    )
  )
  SELECT compute_select_list_from_factored
  FROM rewritten_cte
  WHERE where_cond
  [GROUP BY group_by_expr]

Resolved Questions

  • Oracle database support both CONNECT-BY recursion and WITH-style recursion, so it should be possible to support both kinds of syntax together.
  • We will need to support ORDER SIBLINGS BY clause

Unresolved Questions

Splitting the WHERE clause

Oracle manual page says:

Oracle processes hierarchical queries as follows:

  • A join, if present, is evaluated first, whether the join is specified in the FROM clause or with WHERE clause predicates.
  • The CONNECT BY condition is evaluated.
  • Any remaining WHERE clause predicates are evaluated.

so one needs to split the WHERE clause into two parts:

  • the ones that "specify the join"
  • the ones that do not and are referred to as "any remaining predicates".
    is this correct?


 Comments   
Comment by Sergei Petrunia [ 2017-08-02 ]

One question is how scalable/performant cycle detection should be.
So I've tried to see how Oracle 12c handles it.

Created a table with 1M (n, n+1) pairs

SQL> create table one_more_one_m (a int, b int);
insert into one_more_one_m select a, a+1 from one_m;

this ran instantly.

Let's start with (1,2) row and run foward towards 1M:

SQL> select count(*), SUM(a) from
(
  select * from one_more_one_m
  start with a=1
  connect by 
   prior b = a
)Q;

This query didn't finish in 4 hours.

Comment by Lixun Peng [ 2017-08-03 ]

ORDER SIBLINGS BY clause is needed, we have found some users are using it.

Comment by Sergei Petrunia [ 2017-08-12 ]

An interesting observation re what does NOCYCLE really mean: https://lists.launchpad.net/maria-developers/msg10840.html . Does that have any impact on the execution algorithm? It's not clear, yet.

Generated at Thu Feb 08 08:05:31 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.