Details
-
Bug
-
Status: Open (View Workflow)
-
Minor
-
Resolution: Unresolved
-
10.2(EOL)
-
None
-
centos 7 2 data nodes
Description
using dbt3 or tpch data generator for data / schema. The following query performs faster with spider_bgs_mode 0 than higher values that enable parallel:
select * from orders where o_orderdate='1998-01-10' limit 10;
enabling slow query log to log all queries on the backend, i observe that it is only sending the query to backend1 and not backend2 which is fine but when spider_bgs_mode > 0 it is actually sending 3 queries:
select `o_orderkey`,`o_custkey`,`o_orderstatus`,`o_totalprice`,`o_orderDATE`,`o_orderpriority`,`o_clerk`,`o_shippriority`,`o_comment` from `dbt3`.`orders` where (`o_orderDATE` = DATE'1998-02-05') limit 2;
|
select `o_orderkey`,`o_custkey`,`o_orderstatus`,`o_totalprice`,`o_orderDATE`,`o_orderpriority`,`o_clerk`,`o_shippriority`,`o_comment` from `dbt3`.`orders` where (`o_orderDATE` = DATE'1998-02-05') limit 2,100;
|
select `o_orderkey`,`o_custkey`,`o_orderstatus`,`o_totalprice`,`o_orderDATE`,`o_orderpriority`,`o_clerk`,`o_shippriority`,`o_comment` from `dbt3`.`orders` where (`o_orderDATE` = DATE'1998-02-05') limit 102,20;
|
Changing spider_bgs_first_read to 10 reduces it to 2 queries:
select `o_orderkey`,`o_custkey`,`o_orderstatus`,`o_totalprice`,`o_orderDATE`,`o_orderpriority`,`o_clerk`,`o_shippriority`,`o_comment` from `dbt3`.`orders` where (`o_orderDATE` = DATE'1998-01-10') limit 10;
|
select `o_orderkey`,`o_custkey`,`o_orderstatus`,`o_totalprice`,`o_orderDATE`,`o_orderpriority`,`o_clerk`,`o_shippriority`,`o_comment` from `dbt3`.`orders` where (`o_orderDATE` = DATE'1998-01-10') limit 10,100;
|
In both cases there are unnecessary queries generated which are not needed. If you add an order by clause then only 1 query is sent with limit 10 to both nodes which is correct.
This reproduces on 10.2-spider and the spiral arms forked code.