Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-10141

Add support for INTERSECT (and common parts for EXCEPT)

    XMLWordPrintable

Details

    Description

      The INTERSECT/EXCEPT clause has this general form:

      select_statement INTERSECT select_statement

      select_statement is any SELECT statement without an ORDER BY, LIMIT, or FOR UPDATE clause.

      The INTERSECT operator computes the set intersection of the rows returned by the involved SELECT statements. A row is in the intersection of two result sets if it appears in both result sets.

      The result of INTERSECT does not contain any duplicate rows unless the ALL option is specified. With ALL, a row that has m duplicates in the left table and n duplicates in the right table will appear min(m,n) times in the result set.

      Steps to implement INTERSECT/EXCEPT

      1. Functionality
      1.1. Basic (UNION LIKE) functionality with ORDER BY, GROUP BY, LIMIT etc. (no ALL, no recursive CTE)
      1.2. ALL clause
      1.3. Recursive CTE support
      2. Optimizations
      2.1 Conversion to nested JOIN
      2.2 Optimize unique index

      1.1. Basic (UNION LIKE) functionality with ORDER BY, GROUP BY, LIMIT etc. (no ALL, no recursive CTE)

      Main idea:

      First query works like with UNION (distinct) select_union (inherited from select_result) results interceptor and collect data in a temporary table with an unique index.

      Second (and so on) has its own result interceptor

      • EXCEPT
        a) just delete records from the table as soon as finds duplicate.
        b) (by Sergei Petrunia) make table of second select, then sent record of first not found in the table of second.
      • INTERSECT
        a) has its own table (Igor's idea to have 2 tables and switch them if there is more then 2 SELECTs) and pass through only "duplicate" records (in case of 2 SELECTs it can just return that rows).
        b) (Final) Add 1 hidden field and write there number of stage (0,1,2 and so on). At the beginning we write 0 when first select collects data. Next stage number N will be written only if we already have there N-1 in the found row (if no then record can be deleted). Finally records marked with last N is the result. If no record was marked during iteration we should stop and return empty set.

      UNION/EXCEPT/INTERSECT has different priority and they can be put on one level of UNIT/SELECT tree (SELECT_LEX/SELECT_LEX_UNIT) only in case if execution corresponds order. For beginner I think better to divide levels. Later (as optimization) possible to "join" levels in case sequential execution order and eliminate so called "fake_select" if results can be passed without processing.

      1.2. ALL clause

      In basic case can we build non unique index and work with all records which fit the same way as in 1.1.

      1.3. Recursive CTE support

      2.1 Conversion to nested JOIN

      In many cases (when there is no grouping operations) the operation can be put in one SELECT with joining all records (better when there is unique indices from both sides)

      2.2 Optimize unique index

      There is cases when we know that there is some unique set of field (also can work for UNION but need the same unique set from the other end)
      a) PRIMARY/UNIQUE index in source table, or tables if TABLE1.UNIQUE_FIELD = TABLE2.UNIQUE_FIELD used
      b) Result of GROUP BY is unique
      It is also can be used for JOIN of 2.1 step.

      In conversation with Sergei Petrunia we got idea of "oppening" brackets with not standart operations:

      S1 intersect (S2 union S3) can be unrolled to:
      1) add S1 records with marker 0
      2) find S2 records with marker 0 and put marker 2
      3) find S3 records with marker 0 and put marker 2
      3.1) filter tmp table with marker 2

      S1 except (S2 except S2) :
      1) add S1 records with marker 0
      2) find S2 records with marker 0 and put marker 2
      3) find S3 records with marker 2 and put marker 0
      3.1) filter tmp table with marker not 2

      ...

      The question is: is it possible all brackets "open" in such way?

      Attachments

        Issue Links

          Activity

            People

              sanja Oleksandr Byelkin
              monty Michael Widenius
              Votes:
              2 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.