Details
-
New Feature
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
None
-
None
Description
This is a request to improve the operation of so called "smart engines" (so far only ColumnStore and XPAND).
1. Change the order in which handlers are attempted: insead of derived_handlers (DH) going first and select_handler (SH) second, try SH first, and if it cannot be created, proceed to DH.
2. SH should be created using early phases in query compilation - somewhere right after syntax check and name resolution, before any optimizations. SH should be tried on "unmolested" query, and if successful, executed as such.
3. If SH constructor returns NULL, the construction of DHs should proceed by "recursive descent" from the top. As soon as the DH construction succeeds, the handler is declared to be eligible for use and the handler construction proceeds from the sibling of the expression node with the freshly created operation DH.
4. DH construction should begin in as late a phase of optimization as is practical (I suppose creating handlers in different places for each engine is a nonstarter, so the practical choices may well be limited by history). Some rewrites, however, do need to happen prior to testing DH. Below is an initial inventory, in priority order.
high: push predicates to derived tables
like:
join
|
(select *
|
from (select ...
|
from (select ...) q0
|
) q1
|
) q2
|
where
|
x in (x,y,x) and <joint conditions>
|
should become
join
|
(select *
|
from (select ...
|
from (select ...
|
where x in (x,y,x) ) q0
|
) q1
|
) q2
|
where
|
<joint conditions>;
|
medium-high: for members of the UNION of different engines, group the members for each "smart engine", push the applicable predicates, and execute the subset via one roundtrip on one "implied" DH. Do "UNION pruning" as usual.
low (perhaps not necessary) : while creating derived handlers, when encountering a proper table (a member of smart engine family) at the bottom of the tree, convert it to "implied derived table" and engage derived handler instead of letting it go through row access. This is mainly for the economy of engineering, to reduce the number of code paths in engine's codebase.
not in scope for 10.6: use "functional dependencies (transitive closure) when pushing predicates around.
5. This is almost critical requirement, although the implementation approach and architecture need to discussed. At present handlers creation logic is not sustainable.
SH executes the query on create_handler. DH, to the contrary returns a hard-coded SUCCESS, which may or may not be a lie. This makes DH unusable in practice (the incompatibility-cause failure will occur only in execution method, from which there is not return and no fallback to row api).
There are two ways to make it work:
a) MDB server could maintain a certain "engine capability matrix". Elements of that exist today (for example - "can you handle predicates" conf parameter? We can add a structure with answers to four questions: can you do CTE? (ii) Can you do window functions? (iii) ca you handle a function so-and-so? - and can you do cross engine joins yourself (like ColumnStore does). If this is the preferred method, the structure needs to be expandable (there may be more like that).
b) the same logic can be pushed down to the login (every plugin will determine whether SQL in question can or cannot be handled using "soft methods" of examining query syntax (currently, the method is actual attempt to execute). personally, I favor (a) as a more general and architecturally cleaner. The "soft test" can be applied after name resolution but prior to create_handler.
Attachments
Issue Links
- includes
-
MDEV-9062 join pushdown to storage engines
- Open
-
MDEV-16502 Join pushdown to something like an IN filter for different storage engine
- Open
- relates to
-
MDEV-23860 Move all optimization query re-writes from the prepare phase into the optimizer phase
- Closed