Details
-
New Feature
-
Status: Stalled (View Workflow)
-
Major
-
Resolution: Unresolved
Description
A long standing (and informally known) issue:
Join optimizer makes its choices [almost] without regard for ORDER BY ... LIMIT clause. ORDER BY ... LIMIT optimizer is invoked when the join order is already fixed. If the picked join order doesn't allow to resolve ORDER BY ... LIMIT efficiently... then we end up with a very poor query plan.
Example:
select * from
|
t_fact
|
join dim1 on t_fact.dim1_id= dim1.dim1_id
|
join dim2 on t_fact.dim2_id= dim2.dim2_id
|
order by
|
t_fact.col1
|
limit 1000;
|
+------+-------------+--------+--------+-----------------+---------+---------+-------------------+------+---------------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+--------+--------+-----------------+---------+---------+-------------------+------+---------------------------------+
|
| 1 | SIMPLE | dim1 | ALL | PRIMARY | NULL | NULL | NULL | 500 | Using temporary; Using filesort |
|
| 1 | SIMPLE | t_fact | ref | dim1_id,dim2_id | dim1_id | 4 | j3.dim1.dim1_id | 1 | |
|
| 1 | SIMPLE | dim2 | eq_ref | PRIMARY | PRIMARY | 4 | j3.t_fact.dim2_id | 1 | |
|
+------+-------------+--------+--------+-----------------+---------+---------+-------------------+------+---------------------------------+
|
This uses filesort and takes ~8 sec.
Now, let's force the right join order:
select * from
|
t_fact
|
straight_join dim1 on t_fact.dim1_id= dim1.dim1_id
|
straight_join dim2 on t_fact.dim2_id= dim2.dim2_id
|
order by
|
t_fact.col1
|
limit 1000;
|
+------+-------------+--------+--------+-----------------+---------+---------+-------------------+------+-------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+--------+--------+-----------------+---------+---------+-------------------+------+-------+
|
| 1 | SIMPLE | t_fact | index | dim1_id,dim2_id | col1 | 4 | NULL | 1000 | |
|
| 1 | SIMPLE | dim1 | eq_ref | PRIMARY | PRIMARY | 4 | j3.t_fact.dim1_id | 1 | |
|
| 1 | SIMPLE | dim2 | eq_ref | PRIMARY | PRIMARY | 4 | j3.t_fact.dim2_id | 1 | |
|
+------+-------------+--------+--------+-----------------+---------+---------+-------------------+------+-------+
|
This uses index to resolve the ORDER BY ... LIMIT and the select takes 0.01 sec to execute.
Dataset:
create table ten(a int);
|
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
|
|
create table one_k(a int);
|
insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
|
|
create table t_fact
|
(
|
fact_id int not null,
|
dim1_id int not null,
|
dim2_id int not null,
|
col1 int not null,
|
primary key(fact_id),
|
key(dim1_id),
|
key(dim2_id),
|
key(col1)
|
);
|
|
insert into t_fact
|
select
|
A.a+1000*B.a+1000*1000*C.a,
|
A.a,
|
B.a,
|
A.a+1000*B.a+1000*1000*C.a
|
from
|
one_k A ,
|
one_k B,
|
ten C
|
where
|
A.a<500 and B.a<500
|
;
|
|
|
create table dim1
|
(
|
dim1_id int not null primary key,
|
col1 int
|
);
|
|
insert into dim1
|
select a,a from one_k where a<500;
|
|
create table dim2
|
(
|
dim2_id int not null primary key,
|
col1 int
|
);
|
insert into dim2
|
select a,a from one_k where a<500;
|
Attachments
Issue Links
- blocks
-
MDEV-33412 cost-based optimizer choice for k-NN indexes
- Open
- includes
-
MDEV-8002 Ability to use index for order-by and covering index changes with type of join to small table
- Open
-
MDEV-21713 LIMIT optimization and selectivity: pessimistic estimates cause optimistic plans
- Open
-
MDEV-22360 Sufficient conditions for accurate calculation of join cardinality
- Stalled
- is blocked by
-
MDEV-21643 test correctness of cost_based_order_by_limit
- Stalled
- is duplicated by
-
MDEV-6813 ORDER BY limit optimizer doesn't take condition selectivity into account
- Closed
-
MDEV-14569 Query runs slower using a UNIQUE KEY vs. a KEY
- Closed
- relates to
-
MDEV-18079 Opportunistic optimization for ORDER BY LIMIT N queries
- Open
-
MDEV-35280 Performance Issue on TPC-H Query 2
- Open
-
MDEV-8880 Very suboptimal join order is generated for a simple query even if MariaDB query planner knows the other is better in every sense
- Stalled
-
MDEV-13275 Query optimizer not picking optimal ORDER BY PRIMARY in case of INNER JOIN on PRIMARY like it does for similar WHERE condition
- Stalled
-
MDEV-13694 Wrong result upon GROUP BY with orderby_uses_equalities=on
- Closed
-
MDEV-14621 Query very slow compared to Mysql
- Closed
-
MDEV-16214 Incorrect plan taken by the optimizer , uses INDEX instead of ref access with ORDER BY
- Closed
-
MDEV-18094 Query with order by limit picking index scan over filesort
- Closed
-
MDEV-19808 Add Optimizer Switch for Filesort with Small LIMIT Optimization
- Open
-
MDEV-20129 Equality propagation for ORDER BY items do not work with expressions
- Stalled
-
MDEV-20209 Using different index with same range gives different number of records
- Closed
-
MDEV-20459 Selectivity of equality condition in ref access not discounted if range access on same index involved a non-equality condition
- Stalled
-
MDEV-21408 Performances testing for ORDER BY with LIMIT optimization
- Open
-
MDEV-35246 Vector search skips a row in the table
- Closed
- mentioned in
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...