Queries with UNION ALL perform disproportionally badly (MCOL-4569)

[MCOL-4589] Optimize out columns in a subquery involving a UNION which are not referenced in the outer select
Created: 2021-03-08  Updated: 2021-04-20  Resolved: 2021-04-20

Status: Closed
Project: MariaDB ColumnStore
Component/s: MDB Plugin
Affects Version/s: None
Fix Version/s: 5.6.1

Type: Sub-Task
Priority: Major
Reporter: Gagan Goel (Inactive)
Assignee: Daniel Lee (Inactive)
Resolution: Fixed
Votes: 0
Labels: None

Sprint: 2021-4

 Description   

This task is similar to MCOL-4543, but for subqueries involving a UNION. For example, consider a query Q1 of the form:

SELECT count(c2) FROM (SELECT * FROM t1 UNION ALL SELECT * FROM t1)q;

Assume t1 contains 10 columns c1, c2, ..., c10. For this query, ExeMgr builds an inefficient RowGroup of the form (1, c2_value1, 1, 1, 1, 1, 1, 1, 1, 1), where the 1s stand in for columns that the outer select does not reference. The objective is to remove non-referenced columns from the end of the RowGroup, stopping at the first referenced column encountered, i.e. to trim the RowGroup down to (1, c2_value1).
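
The trimming rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the ColumnStore implementation: the function name trim_trailing_unreferenced and the representation of a RowGroup as a plain list of column names are assumptions made for the example.

```python
from typing import List, Sequence, Set


def trim_trailing_unreferenced(columns: Sequence[str], referenced: Set[str]) -> List[str]:
    """Drop non-referenced columns from the end of the column list.

    Hypothetical helper: walks backwards from the last column and stops at
    the first referenced column it meets. Non-referenced columns that sit
    before that point (such as c1 below) are kept.
    """
    end = len(columns)
    while end > 0 and columns[end - 1] not in referenced:
        end -= 1
    # If nothing is referenced, this sketch returns an empty list; how a real
    # engine should handle that case is outside the scope of this example.
    return list(columns[:end])


# t1 with 10 columns c1..c10, where the outer select references only c2.
t1_columns = [f"c{i}" for i in range(1, 11)]
trimmed = trim_trailing_unreferenced(t1_columns, {"c2"})
print(trimmed)  # ['c1', 'c2']
```

Note that c1 survives even though it is not referenced, which matches the target RowGroup (1, c2_value1): only the trailing columns are removed.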



 Comments   
Comment by Daniel Lee (Inactive) [ 2021-04-20 ]

Build verified: 5.6.1 ( Drone #2207 )

Tested on a 1 GB DBT3 database

5.5.2-1

MariaDB [tpch1]> SELECT count(l_orderkey) FROM (SELECT * FROM lineitem UNION ALL SELECT * FROM lineitem) q;
+-------------------+
| count(l_orderkey) |
+-------------------+
|          12002430 |
+-------------------+
1 row in set (7.106 sec)

5.6.1

MariaDB [tpch1]> SELECT count(l_orderkey) FROM (SELECT * FROM lineitem UNION ALL SELECT * FROM lineitem) q;
+-------------------+
| count(l_orderkey) |
+-------------------+
|          12002430 |
+-------------------+
1 row in set (1.731 sec)

Generated at Thu Feb 08 02:51:29 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.