[MCOL-2266] Make ColumnStore "flatten" ("merge") subqueries when appropriate Created: 2019-04-02 Updated: 2023-10-25 Resolved: 2023-10-25 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | PrimProc |
| Affects Version/s: | 1.2.3 |
| Fix Version/s: | Icebox |
| Type: | New Feature | Priority: | Critical |
| Reporter: | Adrian Zhelev (Inactive) | Assignee: | Gagan Goel (Inactive) |
| Resolution: | Won't Do | Votes: | 1 |
| Labels: | performance | ||
| Epic Link: | ColumnStore Performance Improvements |
| Sprint: | 2021-3 |
| Description |
|
Queries like SELECT count(a) FROM (SELECT a,b,c,d ..., z FROM table) t are presently insufficiently optimized in columnstore, the way they are optimized in every RDBMS. Normally, the execution plan for a query like that should be identical to SELECT count(a) FROM table. But it isn't. For example: select count(year) from (select * from flights) q; All rows from all nodes are passed to the PM1 and the entire aggregation is conducted by ExeMgr on it. By contrast: MariaDB [bts]> select count(year) from flights; Desc Mode Table TableOID ReferencedColumns PIO LIO PBE Elapsed Rows Nodes aggregate locally and in parallel, and send partial aggregated rows of which there are a lot less to the primary which completes the aggregation. The difference in response time is obviously very significant (17x in the two queries above). |
| Comments |
| Comment by Gregory Dorman (Inactive) [ 2021-02-18 ] |
|
The problem as reported appears corrected in 5.5.1 (maybe earlier). The ticket target is changed into the cases where it is still problematic. MariaDB [bts]> select city, state from (select city, state, country from airports) q; This is a correct plan, and right number of blocks touched. |
| Comment by David Hall (Inactive) [ 2022-03-04 ] |
|
This is an optimizer issue. We are working on an optimizer design. When a Jira is created for it, this JIRA may be linked or removed. |