MariaDB Server, MDEV-20632

Recursive CTE cycle detection using CYCLE clause


    Details

      Description

      One can use UNION DISTINCT as an easy way of avoiding cycles when traversing a graph with a CTE:

      WITH RECURSIVE cte (from_, to_) AS (
         SELECT 1,1
         UNION DISTINCT
         SELECT graph.from_, graph.to_ FROM graph, cte WHERE graph.from_ = cte.to_
      ) ...
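
To see the UNION DISTINCT behaviour in action, here is a minimal sketch that runs the same query against a three-edge cyclic graph. It assumes SQLite (through Python's standard `sqlite3` module) as a stand-in for MariaDB, and the sample edges are made up for illustration. SQLite spells UNION DISTINCT as plain UNION.

```python
import sqlite3

# In-memory database with a small cyclic graph: 1 -> 2 -> 3 -> 1 (illustrative data)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE graph (from_ INTEGER, to_ INTEGER)")
conn.executemany("INSERT INTO graph VALUES (?, ?)", [(1, 2), (2, 3), (3, 1)])

edges = conn.execute("""
    WITH RECURSIVE cte (from_, to_) AS (
        SELECT 1, 1
        UNION  -- plain UNION is UNION DISTINCT
        SELECT graph.from_, graph.to_ FROM graph, cte WHERE graph.from_ = cte.to_
    )
    SELECT from_, to_ FROM cte ORDER BY from_, to_
""").fetchall()

# The recursion stops once edge (1, 2) is produced a second time and discarded.
print(edges)  # [(1, 1), (1, 2), (2, 3), (3, 1)]
```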
      

      But often one needs to know more than just the edges, for example the depth at which each edge was reached:

      WITH RECURSIVE cte (depth, from_, to_) AS (
         SELECT 0,1,1
         UNION DISTINCT
         SELECT depth+1, graph.from_, graph.to_ FROM graph, cte WHERE graph.from_ = cte.to_
      ) ...
      

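      and here DISTINCT no longer works: depth is different on every pass around a cycle, so the same edge never produces a duplicate row and the recursion does not stop.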
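
The failure can be reproduced with a small sketch, again assuming SQLite via Python's `sqlite3` as a stand-in for MariaDB and a made-up three-edge cycle. The `LIMIT 10` inside the CTE is only a safety cap added for this demonstration (SQLite treats it as a bound on the recursive table); it is not part of the query above.

```python
import sqlite3

# Same illustrative cyclic graph: 1 -> 2 -> 3 -> 1
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE graph (from_ INTEGER, to_ INTEGER)")
conn.executemany("INSERT INTO graph VALUES (?, ?)", [(1, 2), (2, 3), (3, 1)])

rows = conn.execute("""
    WITH RECURSIVE cte (depth, from_, to_) AS (
        SELECT 0, 1, 1
        UNION
        SELECT depth + 1, graph.from_, graph.to_
        FROM graph, cte WHERE graph.from_ = cte.to_
        LIMIT 10  -- safety cap for the demo; without it the query keeps recursing
    )
    SELECT depth, from_, to_ FROM cte
""").fetchall()

# Edge (1, 2) shows up at depth 1 and again at depth 4: the rows differ
# only in depth, so UNION does not treat them as duplicates.
for row in rows:
    print(row)
```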


      The SQL standard specifies that a CTE can have a CYCLE clause of the form

      WITH RECURSIVE ... (
        ...
      )
      CYCLE <cycle column list>
      SET <cycle mark column> TO <cycle mark value> DEFAULT <non-cycle mark value>
      USING <path column>
      

      where

      • <cycle column list> is a subset of columns that the CTE returns
      • <cycle mark column> is a new column, generated on the fly; its value for any particular row is <cycle mark value> if there's a cycle and <non-cycle mark value> if there's no cycle
      • <path column> is an ARRAY where the path is being accumulated
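
How these three pieces interact can be sketched in plain Python. This is only a rough simulation under simplifying assumptions, not the standard's exact wording: the cycle column list is a single node column, the path is a tuple, and a row marked as a cycle is not expanded further. The function name `walk_with_cycle_mark` and the sample edges are hypothetical.

```python
from collections import deque

def walk_with_cycle_mark(edges, start, cycle_mark="Y", non_cycle_mark="N"):
    """Return rows of (node, mark, path), loosely mimicking
    CYCLE node SET mark TO 'Y' DEFAULT 'N' USING path."""
    rows = [(start, non_cycle_mark, (start,))]
    queue = deque(rows)
    while queue:
        node, mark, path = queue.popleft()
        if mark == cycle_mark:
            continue  # a row that closes a cycle is not expanded again
        for src, dst in edges:
            if src == node:
                # The row closes a cycle if its node is already on the path
                new_mark = cycle_mark if dst in path else non_cycle_mark
                row = (dst, new_mark, path + (dst,))
                rows.append(row)
                queue.append(row)
    return rows

marked = walk_with_cycle_mark([(1, 2), (2, 3), (3, 1)], start=1)
for row in marked:
    print(row)
# (1, 'N', (1,))
# (2, 'N', (1, 2))
# (3, 'N', (1, 2, 3))
# (1, 'Y', (1, 2, 3, 1))
```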

      While in the standard all parts of the CYCLE clause are mandatory, we'll relax the grammar so that CYCLE <cycle column list> on its own is allowed.


      This task is about implementing an optional CYCLE <cycle column list> clause after the recursive CTE definition.

      A simple way to implement it is to change the CTE's UNION ALL or UNION DISTINCT operator so that it enforces distinctness only over the <cycle column list> columns, not over all columns that the CTE returns.

      The example from above would look like

      WITH RECURSIVE cte (depth, from_, to_) AS (
         SELECT 0,1,1
         UNION
         SELECT depth+1, graph.from_, graph.to_ FROM graph, cte WHERE graph.from_ = cte.to_
      ) CYCLE from_, to_
      ...
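
Since the CYCLE clause is only a proposal at this point, the intended result of the query above can be simulated in plain Python. This is a minimal sketch of "distinctness over a subset of columns", not the server's implementation; `recursive_cte`, `step`, and the sample edges are hypothetical names and data chosen for illustration.

```python
def recursive_cte(anchor_rows, step, key):
    """Evaluate a recursive CTE iteratively, keeping a row only if
    key(row), the values of the cycle columns, has not been seen before."""
    seen, result, frontier = set(), [], []
    for row in anchor_rows:
        if key(row) not in seen:
            seen.add(key(row))
            result.append(row)
            frontier.append(row)
    while frontier:
        next_frontier = []
        for row in step(frontier):
            if key(row) not in seen:
                seen.add(key(row))
                result.append(row)
                next_frontier.append(row)
        frontier = next_frontier
    return result

graph = [(1, 2), (2, 3), (3, 1)]  # illustrative cyclic graph

def step(prev_rows):
    # Recursive part: join the rows of the previous iteration with graph
    return [(depth + 1, g_from, g_to)
            for (depth, _from, to_) in prev_rows
            for (g_from, g_to) in graph
            if g_from == to_]

# CYCLE from_, to_  ->  the key is (from_, to_); depth is ignored
rows = recursive_cte([(0, 1, 1)], step, key=lambda r: (r[1], r[2]))
print(rows)  # [(0, 1, 1), (1, 1, 2), (2, 2, 3), (3, 3, 1)]
```

The row (4, 1, 2) is rejected because (1, 2) was already seen at depth 1, so the traversal stops after one pass around the cycle while depth is still reported for every edge.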
      

      Note that it no longer matters whether the CTE uses UNION ALL or UNION DISTINCT. UNION ALL means "all rows, but without cycles", which is exactly what we'll do. And UNION DISTINCT means all rows should be different, which, again, is what will happen: since we enforce uniqueness over a subset of the columns, complete rows will automatically all be different.

            People

            Assignee:
            sanja Oleksandr Byelkin
            Reporter:
            serg Sergei Golubchik