mysql_union() currently does not support select handler execution. This degrades the performance of ColumnStore queries that involve a UNION in an outer select. More details on the issue are in
This eventually calls:
which writes the data to query output.
- select_handler takes the root SELECT_LEX* as an argument (ha_federated actually assumes that the SELECT is the top-level select).
- This is why UNION is not handled.
- A non-virtual function creates the result temporary table,
- then uses init_scan()/next_row()/end_scan() to pump the rows.
- The temporary table is not actually used. Only its table->record
is used, as a buffer to receive the next row.
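The init_scan()/next_row()/end_scan() pumping described above can be sketched roughly as follows. This is a minimal, self-contained mock, not the server's code: `mock_select_handler`, `pump_rows`, the `record` string standing in for table->record, and the "0 = row available, nonzero = end of data" return convention are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a select_handler; NOT the real server class.
struct mock_select_handler {
  std::vector<std::string> backend_rows;  // rows the "engine" will produce
  std::string record;                     // plays the role of table->record
  std::size_t pos = 0;
  bool scanning = false;

  int init_scan() { pos = 0; scanning = true; return 0; }

  // Assumed convention for this mock: 0 = a row was copied into `record`,
  // nonzero = no more rows.
  int next_row() {
    assert(scanning);
    if (pos == backend_rows.size()) return 1;
    record = backend_rows[pos++];
    return 0;
  }

  int end_scan() { scanning = false; return 0; }
};

// Plays the role of the non-virtual driver: pump rows to the query output.
inline std::vector<std::string> pump_rows(mock_select_handler &h) {
  std::vector<std::string> output;
  if (h.init_scan() != 0) return output;
  while (h.next_row() == 0)
    output.push_back(h.record);  // send the buffered row to the output
  h.end_scan();
  return output;
}
```

Note that the only per-row storage the driver needs is the single `record` buffer, which matches the observation that the temporary table itself is never filled.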
- derived_handler takes a SELECT_LEX_UNIT* as an argument
- That's why derived tables that are UNIONs can be handled.
- Also has init_scan()/next_row()/end_scan() to pump the rows.
- The table is pumped into a temporary table which is then used to do joins, etc.
- Has an API that is completely different from the two above, so it is outside the scope of this MDEV.
ha_federatedx supports select_handler and derived_handler.
The query is pushed by getting a text of the query:
or of the derived table:
(This obviously doesn't work in many cases, e.g. with views or with different table names on the backend, but it is good enough for testing.)
ColumnStore actually walks the parsed query tree (SELECT_LEX[_UNIT] structures) and constructs operations that it will push to the backend.
The query format that's used to pass to the backend is *NOT* SQL.
The idea is:
Instead of accepting a SELECT_LEX (and assuming it is the top-level SELECT), select_handler should accept a SELECT_LEX_UNIT (like derived_handler does).
This way, queries that have UNION (or UNION ALL, INTERSECT, etc.) at the top level can be handled with a select_handler.
There seems to be no difference between these two classes as far as the Storage Engine is concerned. To make the API smaller, we can merge them.
Suppose the query is a UNION where some parts of it can be pushed and some not:
Passing the entire query to the select_handler will fail. It should then be very easy (see the attached PoC patch) to try pushing the individual SELECTs to the select_handler instead.
Code-wise, it should be the same select_handler class, initialized with a SELECT_LEX (as is done now) instead of a SELECT_LEX_UNIT.
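The fallback order can be illustrated with a small sketch: try the whole unit first, and only if the engine refuses it, offer each SELECT on its own. This is not the attached PoC patch. `mock_select`, `mock_unit`, `engine_accepts`, `pushdown_plan` and `plan_pushdown` are hypothetical names, and the `pushable` flag stands in for whatever checks a real engine performs when asked to create a handler.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for SELECT_LEX and SELECT_LEX_UNIT.
struct mock_select {
  std::string name;
  bool pushable;  // would the engine accept this SELECT on its own?
};
struct mock_unit {
  std::vector<mock_select> selects;  // the parts of the UNION
};

// Hypothetical engine hooks: true if a handler could be created.
inline bool engine_accepts(const mock_select &s) { return s.pushable; }
inline bool engine_accepts(const mock_unit &u) {
  for (const auto &s : u.selects)
    if (!s.pushable) return false;  // one bad part rejects the whole unit
  return true;
}

struct pushdown_plan {
  bool whole_unit_pushed = false;
  std::vector<std::string> pushed_selects;  // pushed individually
  std::vector<std::string> local_selects;   // left to the server
};

inline pushdown_plan plan_pushdown(const mock_unit &u) {
  assert(!u.selects.empty());
  pushdown_plan plan;
  if (engine_accepts(u)) {  // first choice: push the entire unit
    plan.whole_unit_pushed = true;
    return plan;
  }
  for (const auto &s : u.selects) {  // fallback: one SELECT at a time
    if (engine_accepts(s))
      plan.pushed_selects.push_back(s.name);
    else
      plan.local_selects.push_back(s.name);
  }
  return plan;
}
```

In this sketch the same acceptance hook is overloaded for a unit and for a single SELECT, which mirrors the idea of one handler class that can be initialized with either.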
Currently, it is done in these functions:
They find the first table that has a select (or derived) handler and push.
Note that they do not check if the select (or the derived table) has tables from other engines. Such checks are currently done inside each engine's create_select/create_derived function (if done at all).
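As a rough illustration of the two steps (the server picking the first table whose engine offers a handler, and the engine itself deciding whether foreign tables are acceptable), consider the sketch below. All names in it (`mock_engine`, `mock_table`, `find_pushdown_engine`, `all_tables_belong_to`) are hypothetical, and the actual server and engine code may differ in every detail.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins; not the server's handlerton or TABLE_LIST.
struct mock_engine {
  std::string name;
  bool has_handler;  // does the engine offer a select/derived handler?
};
struct mock_table {
  std::string name;
  const mock_engine *engine;
};

// Server side (as described above): pick the engine of the first table
// that offers a handler, without looking at the other tables.
inline const mock_engine *find_pushdown_engine(
    const std::vector<mock_table> &tables) {
  for (const auto &t : tables) {
    assert(t.engine != nullptr);
    if (t.engine->has_handler) return t.engine;
  }
  return nullptr;
}

// Engine side: a check an engine might perform in its create function,
// refusing the pushdown if any table belongs to a different engine.
inline bool all_tables_belong_to(const std::vector<mock_table> &tables,
                                 const mock_engine &e) {
  for (const auto &t : tables)
    if (t.engine != &e) return false;
  return true;
}
```

With a mixed-engine table list, `find_pushdown_engine` still returns an engine, and it is only the second check that would reject the pushdown.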
What should be printed when the whole top-level UNION is pushed?
Explain_select has a special case for a select that was pushed down. Check out Explain_select::print_explain,
Explain_union doesn't have such logic. We should either extend Explain_union to have it, or produce an Explain_select object instead of Explain_union.