Uploaded image for project: 'MariaDB ColumnStore'
  1. MariaDB ColumnStore
  2. MCOL-4776

Create initial architectural component for query rewrite and implement 2 use cases

    XMLWordPrintable

    Details

    • Type: New Feature
    • Status: Open (View Workflow)
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 22.08
    • Component/s: None
    • Labels:

      Description

      We appear to lack a usual component of database optimization - an ability to internally transform (rewrite) queries into the shape best suited for the specifics of the database engine. Without it, a number of commercially critical query shapes cannot be executed in Columnstore, including 2 from TPC-H (3 queries fail to parse as a result).

      Type 1. Certain correlated subqueries.

      MariaDB [dbt3]> drop table if exists t1,t2;
      Query OK, 0 rows affected (0.778 sec)
       
      MariaDB [dbt3]> create table t1 (i int) engine=columnstore;
      Query OK, 0 rows affected (0.667 sec)
       
      MariaDB [dbt3]> create table t2 (i int) engine=columnstore;
      Query OK, 0 rows affected (0.640 sec)
       
      -- case A
       
      MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and t2.j = (select min(j) from t2 where t1.i=i);
      ERROR 1815 (HY000): Internal error: IDB-3012: Scalar filter and semi join are not from the same pair of tables.
       
      case B
       
      MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and t2.j in (select j from t2 where t1.i=i);
      ERROR 1815 (HY000): Internal error: IDB-1003: Circular joins are not supported.
      

      Fundamentally, this would require an internal rewrite procedure usually called de-correlation:

       
      -- case A
       
      MariaDB [dbt3]> select count(*) from t1, t2,(select min(j) j, i from t2 group by i) t3  where t1.i=t2.i and t2.i = t3.i and t2.j=t3.j;
      +----------+
      | count(*) |
      +----------+
      |        0 |
      +----------+
      1 row in set (0.052 sec)
       
      -- case B
       
      MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and exists (select i,j from t2 where t2.i = i and t2.j=j) ;
      +----------+
      | count(*) |
      +----------+
      |        0 |
      +----------+
      1 row in set (0.068 sec)
      
      

      Limited Case - TPC-H Query 19

      MariaDB [dbt3]> select count(*) from t1, t2 where (t1.i=t2.i and t2.j=1) or (t1.i=t2.i and t2.j=2);
      ERROR 1815 (HY000): Internal error: IDB-1000: 't1' and 't2' are not joined.
      

      The pattern here is somewhat specific: a bunch of OR clauses with the join predicate(s) repeated in each. Rewriter should pull common predicates out...

      MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and (t2.j=1 or  t2.j=2);
      +----------+
      | count(*) |
      +----------+
      |        0 |
      +----------+
      1 row in set (0.017 sec)
      

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              drrtuy Roman
              Reporter:
              gdorman Gregory Dorman
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:

                  Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.