There appears to be a significant and logically unexplainable overhead in queries which involve aggregates on top of a subquery with wide projection. Top examples:
On a table of 40 million rows, select count(c) from (select * from t) q; we see roughly double elapsed time (and CPU consumed by ExeMgr on PM1 from which the query is issued) over select count(c) from (select c from t);
Plans and I/O statistics are identical.
++++++++++++++++++++++++++++++++++++++ Below part was split off into a separate ticket 4569
It gets much worse if a UNION ALL is involved.
If you do
select count(c) from (select * from t1) q;
select count(c) from (select * from t2) q;
and compare it to
select count(c) from (select * from t1 union all select * from t2) q; the difference in elapsed and CPU is around 10x (second query is 10x worse than the sum of the first two).
Again, plans and I/O stats are identical (second is the same as sum of the first two).
1. CPU consumed was obtained by running TOP on the PM1 node, and taking TIME+ value before and immediately after the query
2. There are CPU values from other nodes, but they are not significant (PrimProcs are doing something, but not very much).
3. This was tested on 3 node cluster only. It needs to be first examined on a single node.
4. Please view attached Excel workbook, which shows all the metrics obtained for all query variants.