[MDEV-13473] CONNECT BY: Does Oracle use depth-first or breadth-first traversal? Created: 2017-08-08  Updated: 2017-08-14  Resolved: 2017-08-12

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Fix Version/s: N/A

Type: Task Priority: Major
Reporter: Sergei Petrunia Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None

Issue Links:
PartOf
is part of MDEV-13428 Oracle-compatible recursive queries w... Open

 Description   

A Followup to slack/zoom discussions:

  • Does Oracle use depth-first or breadth-first traversal?
  • Does it matter which we should use?

Here's one way to check it: let's make a "logging function":

create table tlog2(a int, num int);
insert into tlog2 values (0,0);
 
CREATE OR REPLACE
FUNCTION f3
( param1 IN NUMBER
) RETURN NUMBER AS
BEGIN
  insert into tlog2 values((select 1+ max(a) from tlog2), param1);
  RETURN param1;
END f3;
/

sample dataset:

create table employees (
  employee_id NUMBER,
  last_name varchar(32),
  manager_id  NUMBER
);
 
insert into employees values (201, 'team-red-freshman-1',181);
insert into employees values (202, 'team-blue-freshman-1', 182);
 
insert into employees values (181, 'team-red-l1', 161);
insert into employees values (182, 'team-blue-11', 162);
 
insert into employees values (161, 'team-red-l2', 141);
insert into employees values (162, 'team-blue-12', 142);
 
insert into employees values (141, 'team-red-l3', 121);
insert into employees values (142, 'team-blue-13', 122);
 
insert into employees values (121, 'team-red-l4', 100);
insert into employees values (122, 'team-blue-14', 100);
 
insert into employees values (100, 'ceo', 99);

Oracle won't allow stored function to do DML if its a SELECT, so make it INSERT...SELECT:

create table tdump3(
 col1 number,
 col2 varchar(32),
 col3 number
);

The query:

insert into tdump3
SELECT employee_id, last_name, manager_id
   FROM employees
   START WITH employee_id IN (201,202)
   CONNECT BY f3(PRIOR manager_id) = employee_id;



 Comments   
Comment by Sergei Petrunia [ 2017-08-08 ]

SQL> insert into tdump3
SELECT employee_id, last_name, manager_id
   FROM employees
   START WITH employee_id IN (201,202)
   CONNECT BY f3(PRIOR manager_id) = employee_id;
  2    3    4    5  
12 rows created.

SQL> select * from tdump3;
 
      COL1 COL2                                   COL3
---------- -------------------------------- ----------
       201 team-red-freshman-1                     181
       181 team-red-l1                             161
       161 team-red-l2                             141
       141 team-red-l3                             121
       121 team-red-l4                             100
       100 ceo                                      99
       202 team-blue-freshman-1                    182
       182 team-blue-11                            162
       162 team-blue-12                            142
       142 team-blue-13                            122
       122 team-blue-14                            100
       100 ceo                                      99
 
12 rows selected.

Rows come in depth-first order, just as documentation says.

Comment by Sergei Petrunia [ 2017-08-08 ]

Let's examine "the log":

SQL> select * from tlog2 order by a;
 
         A        NUM
---------- ----------
         0          0
         1        181
         2        181
         3        181
         4        161
         5        161
         6        161
         7        161
         8        161
         9        141
        10        141
 
         A        NUM
---------- ----------
        11        141
        12        141
        13        141
        14        141
        15        141
        16        121
        17        121
        18        121
        19        121
        20        121
        21        121
 
         A        NUM
---------- ----------
        22        121
        23        121
        24        121
        25        100
        26        100
        27        100
        28        100
        29        100
        30        100
        31        100
        32        100
 
         A        NUM
---------- ----------
        33        100
        34        100
        35        100
        36         99
        37         99
        38         99
        39         99
        40         99
        41         99
        42         99
        43         99

So it first started from team-red-freshman and went up all the way to the CEO.
I am not sure why it makes so many calls to the stored function.

 
         A        NUM
---------- ----------
        44         99
        45         99
        46         99
        47        121
        48        121
        49        141
        50        141
        51        141
        52        141
        53        161
        54        161
 
         A        NUM
---------- ----------
        55        161
        56        161
        57        161
        58        161
        59        181
        60        181
        61        181
        62        181
        63        181
        64        181
        65        181
 
         A        NUM
---------- ----------
        66        181

This was a kind of back tracking where it went back down and looked if there
are any more matches?

Anyhow, note that it was all team-red so far (number end with *1).

Now it switches to doing the same to team-blue:

 
        67        182
        68        182
        69        182
        70        182
        71        162
        72        162
        73        162
        74        162
        75        162
        76        162
 
         A        NUM
---------- ----------
        77        142
        78        142
        79        142
        80        142
        81        142
        82        142
        83        142
        84        142
        85        122
        86        122
        87        122
 
         A        NUM
---------- ----------
        88        122
        89        122
        90        122
        91        122
        92        122
        93        122
        94        122
        95        100
        96        100
        97        100
        98        100
 
         A        NUM
---------- ----------
        99        100
       100        100
       101        100
       102        100
       103        100
       104        100
       105        100
       106         99
       107         99
       108         99
       109         99
 
         A        NUM
---------- ----------
       110         99
       111         99
       112         99
       113         99
       114         99
       115         99
       116         99
       117        122
       118        142
       119        142
       120        142
 
         A        NUM
---------- ----------
       121        162
       122        162
       123        162
       124        162
       125        162
       126        182
       127        182
       128        182
       129        182
       130        182
       131        182
 
         A        NUM
---------- ----------
       132        182
 
133 rows selected.

Comment by Sergei Petrunia [ 2017-08-08 ]

EXPLAIN:

SQL> explain plan for insert into tdump3
SELECT employee_id, last_name, manager_id
   FROM employees
   START WITH employee_id IN (201,202)
   CONNECT BY f3(PRIOR manager_id) = employee_id;  2    3    4    5  
 
Explained.

SQL> select * from table(dbms_xplan.display);
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1986864582
 
--------------------------------------------------------------------------------
----------------------
 
| Id  | Operation                                | Name      | Rows  | Bytes | C
ost (%CPU)| Time     |
 
--------------------------------------------------------------------------------
----------------------
 
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                         |           |     4 |   176 |
   3  (34)| 00:00:01 |
 
|   1 |  LOAD TABLE CONVENTIONAL                 | TDUMP3    |       |       |
          |          |
 
|*  2 |   CONNECT BY NO FILTERING WITH START-WITH|           |       |       |
          |          |
 
|   3 |    TABLE ACCESS FULL                     | EMPLOYEES |    11 |   484 |
   2   (0)| 00:00:01 |
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
 
--------------------------------------------------------------------------------
----------------------
 
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("EMPLOYEE_ID"="F3"(PRIOR "MANAGER_ID") AND ("EMPLOYEE_ID"=201 OR
              "EMPLOYEE_ID"=202))
 
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
 
20 rows selected.

Comment by jametong [ 2017-08-14 ]

As I tested the case with a 25k rows,the results is depth first.

SQL> l
1 select dbms_rowid.rowid_block_number(rowid) block_no,lpad(' ', 2 * level - 2, ' ') || id as ename
2 , level
3 , parent_id
4 , name
5 , sys_connect_by_path(id,'/') as arch
6 from test
7 connect by parent_id = prior id
8* start with id = 9
SQL> /
20

BLOCK_NO ENAME LEVEL PARENT_ID NAME ARCH
---------- ------------------------------ ---------- ---------- ------------------------------ ----------------------------------------
138 9 1 0 name_9 /9
693 90 2 9 name_90 /9/90
2569 900 3 90 name_900 /9/90/900
18769 9000 4 900 name_9000 /9/90/900/9000
18770 9001 4 900 name_9001 /9/90/900/9001
18771 9002 4 900 name_9002 /9/90/900/9002
18772 9003 4 900 name_9003 /9/90/900/9003
18773 9004 4 900 name_9004 /9/90/900/9004
18774 9005 4 900 name_9005 /9/90/900/9005
18775 9006 4 900 name_9006 /9/90/900/9006
18776 9007 4 900 name_9007 /9/90/900/9007
18777 9008 4 900 name_9008 /9/90/900/9008
18778 9009 4 900 name_9009 /9/90/900/9009
18779 9010 4 900 name_9010 /9/90/900/9010
18780 9011 4 900 name_9011 /9/90/900/9011
18781 9012 4 900 name_9012 /9/90/900/9012
18782 9013 4 900 name_9013 /9/90/900/9013
18783 9014 4 900 name_9014 /9/90/900/9014
18784 9015 4 900 name_9015 /9/90/900/9015
18785 9016 4 900 name_9016 /9/90/900/9016
18786 9017 4 900 name_9017 /9/90/900/9017
18787 9018 4 900 name_9018 /9/90/900/9018
18788 9019 4 900 name_9019 /9/90/900/9019
2570 901 3 90 name_901 /9/90/901
18789 9010 4 901 name_9010 /9/90/901/9010
18790 9011 4 901 name_9011 /9/90/901/9011
18791 9012 4 901 name_9012 /9/90/901/9012
18792 9013 4 901 name_9013 /9/90/901/9013
18793 9014 4 901 name_9014 /9/90/901/9014
18794 9015 4 901 name_9015 /9/90/901/9015
18795 9016 4 901 name_9016 /9/90/901/9016
18796 9017 4 901 name_9017 /9/90/901/9017
18797 9018 4 901 name_9018 /9/90/901/9018
18798 9019 4 901 name_9019 /9/90/901/9019
18799 9020 4 901 name_9020 /9/90/901/9020
18800 9021 4 901 name_9021 /9/90/901/9021
18801 9022 4 901 name_9022 /9/90/901/9022
18802 9023 4 901 name_9023 /9/90/901/9023
18803 9024 4 901 name_9024 /9/90/901/9024
18804 9025 4 901 name_9025 /9/90/901/9025
18805 9026 4 901 name_9026 /9/90/901/9026
18806 9027 4 901 name_9027 /9/90/901/9027
18807 9028 4 901 name_9028 /9/90/901/9028
18808 9029 4 901 name_9029 /9/90/901/9029
2571 902 3 90 name_902 /9/90/902

Generated at Thu Feb 08 08:05:51 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.