Details
-
Task
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
None
-
None
Description
We appear to lack a usual component of database optimization - an ability to internally transform (rewrite) queries into the shape best suited for the specifics of the database engine. Without it, a number of commercially critical query shapes cannot be executed in Columnstore, including 2 from TPC-H (3 queries fail to parse as a result).
Type 1. Certain correlated subqueries.
MariaDB [dbt3]> drop table if exists t1,t2;
|
Query OK, 0 rows affected (0.778 sec)
|
|
MariaDB [dbt3]> create table t1 (i int) engine=columnstore;
|
Query OK, 0 rows affected (0.667 sec)
|
|
MariaDB [dbt3]> create table t2 (i int) engine=columnstore;
|
Query OK, 0 rows affected (0.640 sec)
|
|
-- case A
|
|
MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and t2.j = (select min(j) from t2 where t1.i=i);
|
ERROR 1815 (HY000): Internal error: IDB-3012: Scalar filter and semi join are not from the same pair of tables.
|
|
case B
|
|
MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and t2.j in (select j from t2 where t1.i=i);
|
ERROR 1815 (HY000): Internal error: IDB-1003: Circular joins are not supported.
|
Fundamentally, this would require an internal rewrite procedure usually called de-correlation:
|
-- case A
|
|
MariaDB [dbt3]> select count(*) from t1, t2,(select min(j) j, i from t2 group by i) t3 where t1.i=t2.i and t2.i = t3.i and t2.j=t3.j;
|
+----------+
|
| count(*) |
|
+----------+
|
| 0 |
|
+----------+
|
1 row in set (0.052 sec)
|
|
-- case B
|
|
MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and exists (select i,j from t2 where t2.i = i and t2.j=j) ;
|
+----------+
|
| count(*) |
|
+----------+
|
| 0 |
|
+----------+
|
1 row in set (0.068 sec)
|
|
Limited Case - TPC-H Query 19
MariaDB [dbt3]> select count(*) from t1, t2 where (t1.i=t2.i and t2.j=1) or (t1.i=t2.i and t2.j=2);
|
ERROR 1815 (HY000): Internal error: IDB-1000: 't1' and 't2' are not joined.
|
The pattern here is somewhat specific: a bunch of OR clauses with the join predicate(s) repeated in each. Rewriter should pull common predicates out...
CREATE TABLE a (
|
id integer,
|
x integer,
|
p integer
|
);
|
|
CREATE TABLE b (
|
id integer,
|
x integer,
|
q integer
|
);
|
|
INSERT INTO a (x, p) VALUES
|
(1, 1),
|
(1, 1),
|
(2, 1);
|
|
INSERT INTO b (x, q) VALUES
|
(1, 3),
|
(2, 3);
|
|
SELECT a.x, a.p, b.q FROM a, b WHERE (a.x = b.x and a.p = 1) OR (a.x = b.x and b.q = 3);
|
|