Details
-
New Feature
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
None
-
None
-
None
Description
Optimizing for distributed engines
Assuming engine L1 (local table) and D1 (distributed table) optimizer
decided that we will use L1 before D1 and one can use a key to access
D1 from L1.
The optimizer will use the following algorithm:
- Read one row from L1
- Ask engine to read all matching rows based on key access from D1
- Loop over all the returned rows from D1
- Check if the row combination L1:D1 matches the where clause
- If yes, send the row combination forwards (to join with next
table, send it the the end user etc.
This means that for every row in L1 there will be a remote access to D1,
which can be very slow.
Another problem is that pushdown conditions to remote engines cannot
include fields from L1 or any other previous table as the values
in the pushdown changes for every row combination.
The task is to create a new read-by-key interface that reduced the number
of remote access and also optimizes pushdown conditions for remote tables
Note that if D1 would be the first table, this performance problem does
not exists. In this case the engine would scan D1 and the engine would
stream the values to the server and do local key lookups in L1 (which
is fast).
---------------------
Suggested solution:
When the SQL layer notices that the next table is a distributed table
it will push all previous row combination into a buffer (like we do
with hash join). When buffer is full it will do one request for
all the buffered row combinations to D1.
New handler interfaces:
/*
|
Prepare a multi-read by key for a distributed engine
|
|
|
index Index number
|
index_parts How may index parts are used
|
param Parameters used by the pushdown condition,
|
in left-to-right order
|
params Number of params in the pushdown condition
|
*/
|
|
|
int handler::index_multi_key_read_with_pushdown_prepare( |
uint index, uint index_parts, Field *param, uint params,
|
uint64 record_length, Item *pushdown);
|
/*
|
Start a read of of multiple rows
|
|
|
row_data pointer to record with key & params
|
rows number of rows in row_data
|
|
|
Row data is record (like for other handler operations) consisting
|
of data for 'index_parts' key field (exactly like for a index_read())
|
followed by data for each param. The layout is exactly as for
|
a normal record, except that null map is after the first key.
|
The 'param' will initially point to the first record in row_data.
|
*/
|
int handler::index_multi_key_read_params(uchar **row_data, uint32 rows); |
/*
|
Read a row from the result of index_mult_key_read()
|
|
|
record Record for read data
|
offset offset (starting from 0) for the first matching row in row_data
|
*/
|
|
|
int handler::index_multi_key_read(uchar *record, uint32 *offset); |
The call sequence for the handle will be:
- index_multi_key_read_with_pushdown_prepare()
- until no more data in L1
- index_multi_key_read_params()
- call index_multi_key_read() until HA_ERR_END_OF_FILE
- index_end()
Pushdown condition and params:
- The pushdown condition will contain constants, function calls,
parameters (pointers to elements in the param list) and fields in D1. - One can move the param fields to other rows in row_data with
the field::move_field_offset().
Notes:
- The SQL layer will sort row data in key_parts + param data order.
- The SQL layer will remove all duplicates of key + param combinations
before sending to the engine. - The above will reduce calls for example if there is a lot of
request to the same row in D1.
As the pushdown can contain things like BETWEEN, any returned row from
D1 may match several rows in L1. The SQL layer has to check all returned
rows from D1 with the corresponding row L1 with the same key combination.
This is why having the 'offset' in index_multi_key_read() would be useful
as it speeds up finding from where to start the check.
The SQL Layer should also check the WHERE clause for all row combinations
as the engine may have ignored checking some functions it does not support.