Details
-
Task
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
None
Description
For a given join query Q over tables T1,...,Tn a PK filter built for Ti is a collection of primary keys of Ti F such that the projection of the result set of Q on table Ti is a subset of all records from Ti whose primary keys are contained in F
If the cardinality of filter F is small enough in comparison with the cardinality of Ti and F can be easily built then using the filter to execute Q may be quite beneficial.
Consider this query for the dbt3_s001 database
(see mysql-test/include/dbt3_s001.inc).
SELECT * FROM orders JOIN lineitem ON o_orderkey=l_orderkey
|
WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-02-01'
|
AND
|
o_totalprice > 200000;
|
Assume that we have indexes on lineitem(l_shipdate) and
orders(o_totalprice) : i_l_shipdate and i_o_totalprice.
total number of orders: 1500,
number of orders where o_totalprice > 200000: 87
selectivity of the predicate: 87/1500=0.058
total number of lineitems: 6005
number of lineites where l_shipdate BETWEEN '1997-01-01' AND '1997-02-01': 101
selectivity of the predicate: 101/6005=0.0168
There are two possible join orders to execute this query:
1. orders -> lineitem
2. lineitem ->orders
1. orders -> lineitem
---------------------
How it can be executed in InnoDB
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1.1
using range scan by the index i_l_shipdate collect PK for the rows
from this range into a container (e.g. ordered array) F1
1.2
use range scan by i_o_totalprice.
for each key from this range take PK for this key pk_orders_i
1.3
using foreign key i_l_orderkey scan keys with l_orderkey=pk_orders_i
for each such key take pk_lineitem_j and check it against F1:
if in: consider pk_orders, pk_lineitem_i as possible match
1.4
for each matched pair pk_orders_i, pk_lineitem_j
access rows of orders and lineitem
Cost:
1.1. 1 random i/o in index i_l_shipdate + creation of F1 (in memory) +
1.2. 1 random i/o in index i_o_totalprice +
1.3. 87 random i/o in index i_l_orderkey + 87 * 4 lookups in F1 +
1.4. 13 random i/o by pk in orders +
13 random i/o by pk in lineitem +
Cost of the current execution with this join order:
1 random i/o in index i_o_totalprice +
87 random i/o by pk in orders +
87 random i/o in index i_l_orderkey +
87 * 4 random i/o by pk in lineitem
Loss:
1 random i/o in index i_l_shipdate + creation of F1 (in memory) +
87 * 4 lookups in F1
vs
Gain:
(87-13)=74 random i/o by pk in orders +
(87*4-13)=325 random i/o by pk in lineitem
How it can be executed in MyISAM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1.1
using range scan by the index i_l_shipdate collect for the rows
from this range into a container (e.g. ordered array) F1
1.2
use range scan by i_o_totalprice.
for each key from this range access orders, get row orders_i
and take o_orderkey_i from it
1.3
using foreign key i_l_orderkey scan keys with l_orderkey=o_orderkr_i
for each such key take RID_lineitem_j and check it against F1:
if in: consider o_orders_i, RID_lineitem_j as possible match
1.4 for each matched pair o_orders_i, RID_lineitem_j
access rows of lineitem
Cost:
1.1. 1 random i/o in index i_l_shipdate + creation of F1 (in memory) +
1.2. 1 random i_o in index i_o_totalprice +
87 random i/o in orders
1.3. 87 random i/o in index i_l_orderkey + 87 * 4 lookups in F1 +
1.4. 13 random i/o by pk in lineitem
2. lineitem -> orders
How it can be executed in InnoDB
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2.1
using range scan by the index i_o_totalprice collect PK for the rows
from this range into a container (e.g. ordered array) F2
2.2
use range scan by i_l_shipdate.
for each key from this range take PK for this key pk_lineitem_i
take row lineitem_i and take l_orderkey_i from this row
2.3
access row lineitem_i by pk_lineitem_i and take l_orderkey_i from this row
check l_orderkey_i against F2
if in: consider lineitem_i, l_orderkey_i as possible match
2.4
for each matched pair lineitem_i, l_orderkey_i
access row of orders
Cost:
2.1. 1 random i/o in index i_o_totalprice + creation of F2 (in memory) +
2.2. 1 random i_o in index i_l_shipdate +
2.3. 101 random i/o by pk in lineitem + 101 lookups in F2
2.4. 13 random i/o by pk in orders
Cost of the current execution with this join order:
1 random i/o in index i_o_shipdate +
101 random i/o by pk in lineitem +
101 random i/o by pk in orders
Loss:
1 random i/o in index i_l_shipdate + creation of F1 (in memory) +
101 lookups in F2
vs
Gain:
(101-13)=88 random i/o by pk in orders
How it can be executed in MyISAM
--------------------------------
2.1
using range scan by the index i_o_totalprice collect RID for the rows
from this range into a container (e.g. ordered array) F2
2.2
use range scan by i_l_shipdate.
for each key from this range take RID for this key rid_lineitem_i
2.3
access row lineitem_i by rid_lineitem_i and take l_orderkey_i from this row
2.4
look up into the primary index i_o_orderkey by l_orderkey_i
get rid_orders_i and check rid_orders_i against F2
if in: consider lineitem_i, rid_orders_i as possible match
2.5
for each matched pair lineitem_i, rid_orderkey_i
access row of orders
Cost:
2.1. 1 random i/o in index i_o_totalprice + creation of F2 (in memory) +
2.2. 1 random i_o in index i_l_shipdate +
2.3. 101 random i/o by pk in lineitem +
2.4. 101 random lookups in the pk index i_o_orderkey +
2.5. 13 random i/o by pk in orders
Attachments
Issue Links
- causes
-
MDEV-18477 Implement FNV-1a hash function
- Stalled
-
MDEV-20407 mysqld got signal 11; rowid filter
- Closed
-
MDEV-21040 InnoDB: fulltext search with ` in(...)` crashed on specific combination
- Closed
-
MDEV-21191 MariaDB crashes executing SELECT request after upgrading to 10.4.10
- Closed
-
MDEV-21535 Unnecessarily large ha_innobase::records_in_range() scans
- Closed
-
MDEV-23861 Execution plan
- Closed
-
MDEV-28846 Poor performance when rowid filter contains no elements
- Closed
- includes
-
MDEV-18144 ANALYZE for statement support for PK filters
- Closed
- is blocked by
-
MDEV-18332 Tests for PK filters
- Closed
- relates to
-
MDEV-19919 Assertion `!prebuilt->index->is_primary()' failed in row_search_idx_cond_check
- Closed
-
MDEV-20056 Assertion `!prebuilt->index->is_primary()' failed in row_search_idx_cond_check
- Closed
-
MDEV-21794 Optimizer flag rowid_filter leads to long query
- Closed
-
MDEV-23035 INNODB assertion failure -> crash
- Closed
-
MDEV-22761 InnoDB: Assertion failure row0sel.cc line 3966 exception 0x80000003
- Closed
-
MDEV-33522 Mariadb process crashes when running a simple join query
- Closed