mysql_union() currently does not support select handler execution. This is causing performance degradation of queries in ColumnStore involving a UNION in an outer select. More details on the issue are in MCOL-4584.
Takes the root SELECT_LEX* as an argument (ha_federated actually assumes that the SELECT is the top-level select).
This is why UNION is not handled.
non-virtual function creates the result temp. table.
then uses init_scan()/next_row()/end_scan() to pump the rows.
Temporary table is not actually used. We use only temporary table's table->record[0]
as a buffer to get the next row.
derived_handler:
Takes a SELECT_LEX_UNIT* as an argument
That's why derived tables that are UNIONs can be handled.
Also has init_scan()/next_row()/end_scan() to pump the rows.
The table is pumped into a temporary table which is then used to do joins, etc.
groupby_handler:
Has an API that's totally different from the above two. So, it is outside of scope of this MDEV.
Storage engines
FederatedX - the test engine
ha_federatedx supports select_handler and derived_handler.
The query is pushed by getting a text of the query:
int ha_federatedx_select_handler::init_scan() {
...
if ((*iop)->query(thd->query(), thd->query_length()))
goto err;
or of the derived table:
ha_federatedx_derived_handler::init_scan() {
...
if ((*iop)->query(derived->derived_spec.str, derived->derived_spec.length))
goto err;
(This obviously doesn't work in many cases e.g. with views or different table names on the backend. But it's good enough to do testing).
ColumnStore - the target engine
ColumnStore actually walks the parsed query tree (SELECT_LEX[_UNIT] structures) and constructs operations that it will push to the backend.
The query format that's used to pass to the backend is *NOT* SQL.
Solution
The idea is:
1. Make select_handler interface accept UNIONs.
Instead of accepting a SELECT_LEX (and assuming it is the top-level SELECT), select_handler should accept a SELECT_LEX_UNIT (like derived_handler does).
This way, queries that have UNION (or UNION ALL, INTERSECT, etc) at the top level can be handled with a select_handler.
2. Join the storage-engine facing parts of select_handler and derived_handler.
There seems to be no difference between these two classes as far as the Storage Engine is concerned. To make the API smaller, we can join them together.
3. Allow parts of UNION to be handled with a Select Handler.
Suppose the query is a UNION where some parts of it can be pushed and some not:
select * from columnstore_table union select * from innodb_table
Passing the entire query to the select_handler will fail. It should be very easy (see attached poc patch) then to try pushing individual SELECTs to the select_handler.
Code-wise, it should be the same select_handler class which is initialized with a SELECT_LEX (like it is done now) instead of a SELECT_LEX_UNIT.
.
They find the first table that has a select (or derived) handler and push.
Note that they do not check if the select (or the derived table) has tables from other engines. Such checks are currently done inside each engine's create_select/create_derived function (if done at all).
Adjusting Federatedx
TODO
EXPLAIN output changes
What should be printed when the whole top-level UNION is pushed?
Explain_select has a special case where it was pushed. Check out Explain_select::print_explain,
if (select_type == pushed_derived_text || select_type == pushed_select_text)
Sergei Golubchik
added a comment - There's a memory leak, visible in asan builds in the preview branch. For example here: https://buildbot.askmonty.org/buildbot/builders/kvm-asan/builds/9060/steps/mtr_nm/logs/mysqld.1.err.2 (search for the work "leak")
serg, after some examination I think this bug is not related to the improvements made for the UNION pushdown. I've filed a separate MDEV-29624 for this.
Oleg Smirnov
added a comment - serg , after some examination I think this bug is not related to the improvements made for the UNION pushdown. I've filed a separate MDEV-29624 for this.
There's a memory leak, visible in asan builds in the preview branch. For example here: https://buildbot.askmonty.org/buildbot/builders/kvm-asan/builds/9060/steps/mtr_nm/logs/mysqld.1.err.2 (search for the work "leak")