Details
-
Task
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
None
-
None
Description
We appear to lack a usual component of database optimization - an ability to internally transform (rewrite) queries into the shape best suited for the specifics of the database engine. Without it, a number of commercially critical query shapes cannot be executed in Columnstore, including 2 from TPC-H (3 queries fail to parse as a result).
Type 1. Certain correlated subqueries.
MariaDB [dbt3]> drop table if exists t1,t2;
|
Query OK, 0 rows affected (0.778 sec)
|
|
MariaDB [dbt3]> create table t1 (i int) engine=columnstore;
|
Query OK, 0 rows affected (0.667 sec)
|
|
MariaDB [dbt3]> create table t2 (i int) engine=columnstore;
|
Query OK, 0 rows affected (0.640 sec)
|
|
-- case A
|
|
MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and t2.j = (select min(j) from t2 where t1.i=i);
|
ERROR 1815 (HY000): Internal error: IDB-3012: Scalar filter and semi join are not from the same pair of tables.
|
|
case B
|
|
MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and t2.j in (select j from t2 where t1.i=i);
|
ERROR 1815 (HY000): Internal error: IDB-1003: Circular joins are not supported.
|
Fundamentally, this would require an internal rewrite procedure usually called de-correlation:
|
-- case A
|
|
MariaDB [dbt3]> select count(*) from t1, t2,(select min(j) j, i from t2 group by i) t3 where t1.i=t2.i and t2.i = t3.i and t2.j=t3.j;
|
+----------+
|
| count(*) |
|
+----------+
|
| 0 |
|
+----------+
|
1 row in set (0.052 sec)
|
|
-- case B
|
|
MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and exists (select i,j from t2 where t2.i = i and t2.j=j) ;
|
+----------+
|
| count(*) |
|
+----------+
|
| 0 |
|
+----------+
|
1 row in set (0.068 sec)
|
|
Limited Case - TPC-H Query 19
MariaDB [dbt3]> select count(*) from t1, t2 where (t1.i=t2.i and t2.j=1) or (t1.i=t2.i and t2.j=2);
|
ERROR 1815 (HY000): Internal error: IDB-1000: 't1' and 't2' are not joined.
|
The pattern here is somewhat specific: a bunch of OR clauses with the join predicate(s) repeated in each. Rewriter should pull common predicates out...
CREATE TABLE a (
|
id integer,
|
x integer,
|
p integer
|
);
|
|
CREATE TABLE b (
|
id integer,
|
x integer,
|
q integer
|
);
|
|
INSERT INTO a (x, p) VALUES
|
(1, 1),
|
(1, 1),
|
(2, 1);
|
|
INSERT INTO b (x, q) VALUES
|
(1, 3),
|
(2, 3);
|
|
SELECT a.x, a.p, b.q FROM a, b WHERE (a.x = b.x and a.p = 1) OR (a.x = b.x and b.q = 3);
|
|
Attachments
Activity
Rank | Ranked higher |
Rank | Ranked higher |
Fix Version/s | 6.4.1 [ 26046 ] |
Fix Version/s | 6.3.1 [ 25801 ] |
Assignee | Denis Khalikov [ JIRAUSER48434 ] | Roman [ drrtuy ] |
Fix Version/s | 22.08 [ 26904 ] | |
Fix Version/s | 6.4.1 [ 26046 ] |
Fix Version/s | 23.02 [ 28209 ] | |
Fix Version/s | 22.08 [ 26904 ] |
Description |
We appear to lack a usual component of database optimization - an ability to internally transform (rewrite) queries into the shape best suited for the specifics of the database engine. Without it, a number of commercially critical query shapes cannot be executed in Columnstore, including 2 from TPC-H (3 queries fail to parse as a result).
*Type 1. Certain correlated subqueries.* {code} MariaDB [dbt3]> drop table if exists t1,t2; Query OK, 0 rows affected (0.778 sec) MariaDB [dbt3]> create table t1 (i int) engine=columnstore; Query OK, 0 rows affected (0.667 sec) MariaDB [dbt3]> create table t2 (i int) engine=columnstore; Query OK, 0 rows affected (0.640 sec) -- case A MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and t2.j = (select min(j) from t2 where t1.i=i); ERROR 1815 (HY000): Internal error: IDB-3012: Scalar filter and semi join are not from the same pair of tables. case B MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and t2.j in (select j from t2 where t1.i=i); ERROR 1815 (HY000): Internal error: IDB-1003: Circular joins are not supported. {code} Fundamentally, this would require an internal rewrite procedure usually called *de-correlation*: {code} -- case A MariaDB [dbt3]> select count(*) from t1, t2,(select min(j) j, i from t2 group by i) t3 where t1.i=t2.i and t2.i = t3.i and t2.j=t3.j; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.052 sec) -- case B MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and exists (select i,j from t2 where t2.i = i and t2.j=j) ; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.068 sec) {code} *Limited Case - TPC-H Query 19* {code} MariaDB [dbt3]> select count(*) from t1, t2 where (t1.i=t2.i and t2.j=1) or (t1.i=t2.i and t2.j=2); ERROR 1815 (HY000): Internal error: IDB-1000: 't1' and 't2' are not joined. {code} The pattern here is somewhat specific: a bunch of OR clauses with the join predicate(s) repeated in each. Rewriter should pull common predicates out... {code} MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and (t2.j=1 or t2.j=2); +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.017 sec) {code} |
We appear to lack a usual component of database optimization - an ability to internally transform (rewrite) queries into the shape best suited for the specifics of the database engine. Without it, a number of commercially critical query shapes cannot be executed in Columnstore, including 2 from TPC-H (3 queries fail to parse as a result).
*Type 1. Certain correlated subqueries.* {code} MariaDB [dbt3]> drop table if exists t1,t2; Query OK, 0 rows affected (0.778 sec) MariaDB [dbt3]> create table t1 (i int) engine=columnstore; Query OK, 0 rows affected (0.667 sec) MariaDB [dbt3]> create table t2 (i int) engine=columnstore; Query OK, 0 rows affected (0.640 sec) -- case A MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and t2.j = (select min(j) from t2 where t1.i=i); ERROR 1815 (HY000): Internal error: IDB-3012: Scalar filter and semi join are not from the same pair of tables. case B MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and t2.j in (select j from t2 where t1.i=i); ERROR 1815 (HY000): Internal error: IDB-1003: Circular joins are not supported. {code} Fundamentally, this would require an internal rewrite procedure usually called *de-correlation*: {code} -- case A MariaDB [dbt3]> select count(*) from t1, t2,(select min(j) j, i from t2 group by i) t3 where t1.i=t2.i and t2.i = t3.i and t2.j=t3.j; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.052 sec) -- case B MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and exists (select i,j from t2 where t2.i = i and t2.j=j) ; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.068 sec) {code} *Limited Case - TPC-H Query 19* {code} MariaDB [dbt3]> select count(*) from t1, t2 where (t1.i=t2.i and t2.j=1) or (t1.i=t2.i and t2.j=2); ERROR 1815 (HY000): Internal error: IDB-1000: 't1' and 't2' are not joined. {code} The pattern here is somewhat specific: a bunch of OR clauses with the join predicate(s) repeated in each. Rewriter should pull common predicates out... {code} CREATE TABLE a ( id integer, x integer, p integer ); CREATE TABLE b ( id integer, x integer, q integer ); INSERT INTO a (x, p) VALUES (1, 1), (1, 1), (2, 1); INSERT INTO b (x, q) VALUES (1, 3), (2, 3); SELECT x, a.p, b.q FROM a JOIN b USING (x) WHERE a.p = 1 OR b.q = 3; {code} |
Description |
We appear to lack a usual component of database optimization - an ability to internally transform (rewrite) queries into the shape best suited for the specifics of the database engine. Without it, a number of commercially critical query shapes cannot be executed in Columnstore, including 2 from TPC-H (3 queries fail to parse as a result).
*Type 1. Certain correlated subqueries.* {code} MariaDB [dbt3]> drop table if exists t1,t2; Query OK, 0 rows affected (0.778 sec) MariaDB [dbt3]> create table t1 (i int) engine=columnstore; Query OK, 0 rows affected (0.667 sec) MariaDB [dbt3]> create table t2 (i int) engine=columnstore; Query OK, 0 rows affected (0.640 sec) -- case A MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and t2.j = (select min(j) from t2 where t1.i=i); ERROR 1815 (HY000): Internal error: IDB-3012: Scalar filter and semi join are not from the same pair of tables. case B MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and t2.j in (select j from t2 where t1.i=i); ERROR 1815 (HY000): Internal error: IDB-1003: Circular joins are not supported. {code} Fundamentally, this would require an internal rewrite procedure usually called *de-correlation*: {code} -- case A MariaDB [dbt3]> select count(*) from t1, t2,(select min(j) j, i from t2 group by i) t3 where t1.i=t2.i and t2.i = t3.i and t2.j=t3.j; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.052 sec) -- case B MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and exists (select i,j from t2 where t2.i = i and t2.j=j) ; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.068 sec) {code} *Limited Case - TPC-H Query 19* {code} MariaDB [dbt3]> select count(*) from t1, t2 where (t1.i=t2.i and t2.j=1) or (t1.i=t2.i and t2.j=2); ERROR 1815 (HY000): Internal error: IDB-1000: 't1' and 't2' are not joined. {code} The pattern here is somewhat specific: a bunch of OR clauses with the join predicate(s) repeated in each. Rewriter should pull common predicates out... {code} CREATE TABLE a ( id integer, x integer, p integer ); CREATE TABLE b ( id integer, x integer, q integer ); INSERT INTO a (x, p) VALUES (1, 1), (1, 1), (2, 1); INSERT INTO b (x, q) VALUES (1, 3), (2, 3); SELECT x, a.p, b.q FROM a JOIN b USING (x) WHERE a.p = 1 OR b.q = 3; {code} |
We appear to lack a usual component of database optimization - an ability to internally transform (rewrite) queries into the shape best suited for the specifics of the database engine. Without it, a number of commercially critical query shapes cannot be executed in Columnstore, including 2 from TPC-H (3 queries fail to parse as a result).
*Type 1. Certain correlated subqueries.* {code} MariaDB [dbt3]> drop table if exists t1,t2; Query OK, 0 rows affected (0.778 sec) MariaDB [dbt3]> create table t1 (i int) engine=columnstore; Query OK, 0 rows affected (0.667 sec) MariaDB [dbt3]> create table t2 (i int) engine=columnstore; Query OK, 0 rows affected (0.640 sec) -- case A MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and t2.j = (select min(j) from t2 where t1.i=i); ERROR 1815 (HY000): Internal error: IDB-3012: Scalar filter and semi join are not from the same pair of tables. case B MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and t2.j in (select j from t2 where t1.i=i); ERROR 1815 (HY000): Internal error: IDB-1003: Circular joins are not supported. {code} Fundamentally, this would require an internal rewrite procedure usually called *de-correlation*: {code} -- case A MariaDB [dbt3]> select count(*) from t1, t2,(select min(j) j, i from t2 group by i) t3 where t1.i=t2.i and t2.i = t3.i and t2.j=t3.j; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.052 sec) -- case B MariaDB [dbt3]> select count(*) from t1, t2 where t1.i=t2.i and exists (select i,j from t2 where t2.i = i and t2.j=j) ; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.068 sec) {code} *Limited Case - TPC-H Query 19* {code} MariaDB [dbt3]> select count(*) from t1, t2 where (t1.i=t2.i and t2.j=1) or (t1.i=t2.i and t2.j=2); ERROR 1815 (HY000): Internal error: IDB-1000: 't1' and 't2' are not joined. {code} The pattern here is somewhat specific: a bunch of OR clauses with the join predicate(s) repeated in each. Rewriter should pull common predicates out... {code} CREATE TABLE a ( id integer, x integer, p integer ); CREATE TABLE b ( id integer, x integer, q integer ); INSERT INTO a (x, p) VALUES (1, 1), (1, 1), (2, 1); INSERT INTO b (x, q) VALUES (1, 3), (2, 3); SELECT a.x, a.p, b.q FROM a, b WHERE (a.x = b.x and a.p = 1) OR (a.x = b.x and b.q = 3); {code} |
Fix Version/s | 23.08 [ 28540 ] | |
Fix Version/s | 23.02 [ 28209 ] |
Issue Type | New Feature [ 2 ] | Task [ 3 ] |
Rank | Ranked higher |
Rank | Ranked lower |