To avoid potential partial sort optimisation and maximize resource consumption of ORDER BY, LIMIT keyword was removed:
Queries:
query without subselect:
SELECT l_orderkey
FROM LINEITEM
ORDER BY
RAND(),
EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)),
POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1);
query with subselect:
SELECT l_orderkey
FROM (
SELECT l_orderkey
FROM LINEITEM
ORDER BY
RAND(),
EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)),
POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1)
) AS subquery;
10gb dataset was used.
To change columnstore_orderby_threads value I used
SET GLOBAL command, (SET GLOBAL columnstore_orderby_threads = 1; SET GLOBAL columnstore_orderby_threads = 16)
no subselect, with patch and columnstore_orderby_threads == 16
59986052 rows in set (1 min 4.815 sec)
59986052 rows in set (1 min 4.975 sec)
no subselect, with patch and columnstore_orderby_threads == 1
59986052 rows in set (3 min 32.506 sec)
59986052 rows in set (3 min 29.330 sec)
subselect, with patch and columnstore_orderby_threads == 16
59986052 rows in set (3 min 29.684 sec)
59986052 rows in set (3 min 35.516 sec)
subselect, with patch and columnstore_orderby_threads == 1
59986052 rows in set (3 min 33.364 sec)
59986052 rows in set (3 min 27.356 sec)
Change in columnstore_orderby_threads from 16 to 1 only affects processing time of request without subquery, so I am assuming that sorting in requests with subquery is still single threaded always
To avoid potential partial sort optimisation and maximize resource consumption of ORDER BY, LIMIT keyword was removed:
Queries:
query without subselect:
SELECT l_orderkey
FROM LINEITEM
ORDER BY
RAND(),
EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)),
POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1);
query with subselect:
SELECT l_orderkey
FROM (
SELECT l_orderkey
FROM LINEITEM
ORDER BY
RAND(),
EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)),
POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1)
) AS subquery;
10gb dataset was used.
To change columnstore_orderby_threads value I used
SET GLOBAL command, (SET GLOBAL columnstore_orderby_threads = 1; SET GLOBAL columnstore_orderby_threads = 16)
no subselect, with patch and columnstore_orderby_threads == 16
59986052 rows in set (1 min 4.815 sec)
59986052 rows in set (1 min 4.975 sec)
no subselect, with patch and columnstore_orderby_threads == 1
59986052 rows in set (3 min 32.506 sec)
59986052 rows in set (3 min 29.330 sec)
subselect, with patch and columnstore_orderby_threads == 16
59986052 rows in set (3 min 29.684 sec)
59986052 rows in set (3 min 35.516 sec)
subselect, with patch and columnstore_orderby_threads == 1
59986052 rows in set (3 min 33.364 sec)
59986052 rows in set (3 min 27.356 sec)
Change in columnstore_orderby_threads from 16 to 1 only affects processing time of request without subquery, so I am assuming that sorting in requests with subquery is still single threaded always