[MDEV-25080] Allow pushdown of queries involving UNIONs in outer select to foreign engines Created: 2021-03-08 Updated: 2023-11-23 Resolved: 2023-06-15 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Server |
| Fix Version/s: | 11.1.1 |
| Type: | Task | Priority: | Critical |
| Reporter: | Gagan Goel (Inactive) | Assignee: | Oleg Smirnov |
| Resolution: | Fixed | Votes: | 4 |
| Labels: | Preview_10.11, smart_engine | ||
| Attachments: |
|
||||||||||||||||||||||||||||||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||||||||||
| Description |
|
mysql_union() currently does not support select handler execution. This is causing performance degradation of queries in ColumnStore involving a UNION in an outer select. More details on the issue are in Current APIsSelect Handler
...
This eventually calls:
which writes the data to query output. Derived handler
Two pushdown interfacesselect_handler:
derived_handler:
groupby_handler:
Storage enginesFederatedX - the test engineha_federatedx supports select_handler and derived_handler.
or of the derived table:
(This obviously doesn't work in many cases e.g. with views or different table names on the backend. But it's good enough to do testing). ColumnStore - the target engineColumnStore actually walks the parsed query tree (SELECT_LEX[_UNIT] structures) and constructs operations that it will push to the backend. The query format that's used to pass to the backend is *NOT* SQL. SolutionThe idea is: 1. Make select_handler interface accept UNIONs.Instead of accepting a SELECT_LEX (and assuming it is the top-level SELECT), select_handler should accept a SELECT_LEX_UNIT (like derived_handler does). 2. Join the storage-engine facing parts of select_handler and derived_handler.There seems to be no difference between these two classes as far as the Storage Engine is concerned. To make the API smaller, we can join them together. 3. Allow parts of UNION to be handled with a Select Handler.Suppose the query is a UNION where some parts of it can be pushed and some not:
Passing the entire query to the select_handler will fail. It should be very easy (see attached poc patch) then to try pushing individual SELECTs to the select_handler. Code-wise, it should be the same select_handler class which is initialized with a SELECT_LEX (like it is done now) instead of a SELECT_LEX_UNIT. Implementation detailsFinding the storage engine to push intoCurrently, it is done in these functions:
They find the first table that has a select (or derived) handler and push. Note that they do not check if the select (or the derived table) has tables from other engines. Such checks are currently done inside each engine's create_select/create_derived function (if done at all). Adjusting FederatedxTODO EXPLAIN output changesWhat should be printed when the whole top-level UNION is pushed? Explain_select has a special case where it was pushed. Check out Explain_select::print_explain,
Explain_union doesn't have such logic. We should either extend Explain_union to have it, or produce an Explain_select object instead of Explain_union. |
| Comments |
| Comment by Sergei Golubchik [ 2022-03-17 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
So, the plan is
| ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oleg Smirnov [ 2022-08-05 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
During yesterday call with drrtuy we agreed to modify the ColumnStore select handler so it accepts SELECT_LEX_UNIT* instead of SELECT_LEX*:
to
This single select handler must process all possible cases: unions, single selects, derived tables. | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2022-08-11 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
A note: Spider supports group_by_handler, but not select_handler or derived_handler. So, implementation of this MDEV as we currently see it will not affect Spider. (Although one might wonder whether group_by_handler has some commonality with derived_handler or select_handler that could be factored out in the future). | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Roman [ 2022-08-18 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
psergei GBH is very different to what SH and DH are. It has a certain interface sending down the projected columns list for example. I don't see there is a way to somehow combine these three. | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2022-08-24 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
just an observation: pushdown of union members seems easy to do: psergey-pushdown-union-members.diff | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oleg Smirnov [ 2022-08-29 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
We need to decide which output of EXPLAIN FORMAT=JSON we would like to see. What about something like this?
| ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oleg Smirnov [ 2022-09-05 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
Branch bb-10.11- | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2022-09-06 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
Hi Oleg, First piece of review input: Looking at this function
I see that it checks all the tables in the query. That is, if one tries to run
no pushdown is done. Please fix this. I would suggest to create a recursive function that walks local tables and calls itself for child subqueries, as we've discussed on the previous calls. | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2022-09-06 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
More review input: https://lists.launchpad.net/maria-developers/msg13214.html | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2022-09-06 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
Looking at joining derived_handler and select_handler together... print_error implementations look odd.
Overloads:
| ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2022-09-06 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
ther difference ins prepare() method: select_handler defines a virtual prepare():
ColumnStore overloads it in a dumb way:
| ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oleg Smirnov [ 2022-09-08 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
I've force-pushed a new revision, please find it in bb-10.11- 1. What is this change for? If it's just the code cleanup and we HAVE to do it (I >>> No, it's not a code cleanup. There is
against
. The select_handler::result class member is introduced to support both SELECT_LEX and SELECT_LEX_UNIT. | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2022-09-08 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
Note: print_error functions don't have any test coverage:
This doesn't fire ever. | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Roman [ 2022-09-12 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
There will be two reviewers from our side, namely tntnatbry and me. AFAIK Gagan will start looking into the updated patch today once again. | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Gagan Goel (Inactive) [ 2022-09-15 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
Hi oleg.smirnov, looks like adding paranthesis to the queries is not resulting in the pushdown of the whole union: Q1:
Q2:
If the paranthesis are removed, the whole query is pushed down: Q3 (Q1 without paranthesis):
Q4 (Q2 without paranthesis):
Here, t1 and t2 are federatedx tables (taken from the federatedx_create_handlers.test MTR test case in preview-10.11-mdev-25080-union-pushdown server branch):
| ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oleg Smirnov [ 2022-09-15 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
Good points, tntnatbry. I have pushed a quick fix into bb-10.11-mdev-25080-fixes branch, hope this helps. psergei, please review when possible. | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Gagan Goel (Inactive) [ 2022-09-20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
Pushed down UNION on two tables containing two different INT types with different signedness has an incorrect output. In the below, output of Q2 is incorrect. Output of Q1 is correct, however. I have not tested other combinations of numeric types but those could also be in error.
Q1:
Q2:
| ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oleg Smirnov [ 2022-09-21 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
tntnatbry, I think the problem is the table `t2` is connected to remote table `t1`:
As far as I understand, the FederatedX engine may not be able to process such cases when local and remote tables have different field types. | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Gagan Goel (Inactive) [ 2022-09-21 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
Hi oleg.smirnov, Sorry for the typo earlier. I fixed the table DDL for t2 in the comment above and reran the test case. I still get the output as above. So this is indeed a bug. | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oleg Smirnov [ 2022-09-21 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
I agree, the bug really exists. I believe it's not related to the UNION pushdown improvement but to the FederatedX engine. I observed even a more strange result with disabled pushdown option:
I'll check this more thoroughly but I believe this shouldn't be an obstacle to test the integration with ColumnStore. | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Gagan Goel (Inactive) [ 2022-09-21 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
oleg.smirnov, unfortunately ColumnStore has the same problem. This is a server-side bug, not specific to a storage engine. The temporary table that the server creates for this query to store the results in table->record[0] has incorrect Field attributes. | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oleg Smirnov [ 2022-09-22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
Thank you, tntnatbry, this is indeed a bug. The good news it was relatively easy to fix but also there is a case that I haven't fixed yet:
Partial pushdown of queries including "SELECT <const-value>" produces wrong results. Will keep you updated on the progress. | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2022-09-22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
There's a memory leak, visible in asan builds in the preview branch. For example here: https://buildbot.askmonty.org/buildbot/builders/kvm-asan/builds/9060/steps/mtr_nm/logs/mysqld.1.err.2 (search for the work "leak") | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oleg Smirnov [ 2022-09-24 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
serg, after some examination I think this bug is not related to the improvements made for the UNION pushdown. I've filed a separate | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oleg Smirnov [ 2022-09-27 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
tntnatbry, I believe I've addressed the bug you discovered. The fix is pushed to bb-10.11-mdev-25080-fix1. | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2022-12-28 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
According to serg 's message on Slack, it won't happen in 11.0. | ||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Oleg Smirnov [ 2023-06-15 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||
|
Pushed to 11.1 |