Details
- Type: Task
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
When running either of the following queries:
- SELECT DISTINCT idx1 FROM t1
- SELECT idx1 FROM t1 GROUP BY idx1
if idx1 has low cardinality, it is always more efficient to map-reduce the query: most of the data is reduced first on the backends using the index, so less network traffic is sent to the Spider node.
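The map-reduce idea above can be sketched in plain SQL. This is only an illustration, not what Spider generates: t1_backend1 and t1_backend2 are hypothetical names standing in for the parts of t1 stored on two backends.

```sql
-- Map step: each backend deduplicates its own rows using the index on idx1,
-- so it returns at most one row per distinct value.
-- Reduce step: UNION (without ALL) removes duplicates across backends.
-- t1_backend1 and t1_backend2 are hypothetical stand-ins for the shards of t1.
SELECT DISTINCT idx1 FROM t1_backend1
UNION
SELECT DISTINCT idx1 FROM t1_backend2;
```

With low cardinality, each backend sends only a handful of values instead of every row of its shard.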
Two workarounds were identified:
- SELECT DISTINCT idx1 FROM t1 ORDER BY idx1 LIMIT 100000
- Using the spider_direct_sql UDF
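A minimal sketch of the spider_direct_sql workaround follows. It assumes a backend registered with CREATE SERVER under the hypothetical name backend1, an INT column type for idx1, and a hypothetical temporary table partial_idx1 that receives the result set. None of these names come from the ticket.

```sql
-- Temporary table on the Spider node that receives each backend's result.
-- The name partial_idx1 and the INT column type are assumptions.
CREATE TEMPORARY TABLE partial_idx1 (idx1 INT);

-- Run the DISTINCT directly on one backend so that only the reduced set
-- crosses the network. 'backend1' is a hypothetical server name.
SELECT spider_direct_sql(
  'SELECT DISTINCT idx1 FROM t1',
  'partial_idx1',
  'srv "backend1"'
);

-- Final reduce on the Spider node, after the call above has been repeated
-- for every backend.
SELECT DISTINCT idx1 FROM partial_idx1;
```

The call has to be issued once per backend, so this approach suits ad hoc queries better than application code that expects a single statement.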