[MDEV-12792] limit queries with spider parallel / bgs_mode on generate unnecessary queries Created: 2017-05-12  Updated: 2023-05-23

Status: Open
Project: MariaDB Server
Component/s: Storage Engine - Spider
Affects Version/s: 10.2
Fix Version/s: 10.2

Type: Bug Priority: Minor
Reporter: David Thompson (Inactive) Assignee: Yuchen Pei
Resolution: Unresolved Votes: 0
Labels: None
Environment:

centos 7 2 data nodes



 Description   

using dbt3 or tpch data generator for data / schema. The following query performs faster with spider_bgs_mode 0 than higher values that enable parallel:
select * from orders where o_orderdate='1998-01-10' limit 10;

enabling slow query log to log all queries on the backend, i observe that it is only sending the query to backend1 and not backend2 which is fine but when spider_bgs_mode > 0 it is actually sending 3 queries:

select `o_orderkey`,`o_custkey`,`o_orderstatus`,`o_totalprice`,`o_orderDATE`,`o_orderpriority`,`o_clerk`,`o_shippriority`,`o_comment` from `dbt3`.`orders` where (`o_orderDATE` = DATE'1998-02-05') limit 2;
select `o_orderkey`,`o_custkey`,`o_orderstatus`,`o_totalprice`,`o_orderDATE`,`o_orderpriority`,`o_clerk`,`o_shippriority`,`o_comment` from `dbt3`.`orders` where (`o_orderDATE` = DATE'1998-02-05') limit 2,100;
select `o_orderkey`,`o_custkey`,`o_orderstatus`,`o_totalprice`,`o_orderDATE`,`o_orderpriority`,`o_clerk`,`o_shippriority`,`o_comment` from `dbt3`.`orders` where (`o_orderDATE` = DATE'1998-02-05') limit 102,20;

Changing spider_bgs_first_read to 10 reduces it to 2 queries:

select `o_orderkey`,`o_custkey`,`o_orderstatus`,`o_totalprice`,`o_orderDATE`,`o_orderpriority`,`o_clerk`,`o_shippriority`,`o_comment` from `dbt3`.`orders` where (`o_orderDATE` = DATE'1998-01-10') limit 10;
select `o_orderkey`,`o_custkey`,`o_orderstatus`,`o_totalprice`,`o_orderDATE`,`o_orderpriority`,`o_clerk`,`o_shippriority`,`o_comment` from `dbt3`.`orders` where (`o_orderDATE` = DATE'1998-01-10') limit 10,100;

In both cases there are unnecessary queries generated which are not needed. If you add an order by clause then only 1 query is sent with limit 10 to both nodes which is correct.

This reproduces on 10.2-spider and the spiral arms forked code.



 Comments   
Comment by David Thompson (Inactive) [ 2017-05-12 ]

backend schema:

CREATE TABLE orders (
        o_orderkey BIGINT,
        o_custkey INTEGER,
        o_orderstatus CHAR(1),
        o_totalprice DECIMAL(10,2),
        o_orderDATE DATE,
        o_orderpriority CHAR(15),
        o_clerk CHAR(15),
        o_shippriority INTEGER,
        o_comment VARCHAR(79),
        PRIMARY KEY (o_orderkey)
) engine=InnoDB;
 
create index orders_orderdate on orders(o_orderdate);

Spider schema:

create server backend1 foreign data wrapper mysql options 
(host 'backend1', database 'dbt3', user 'spider', password 'spider', port 3306);
 
create server backend2 foreign data wrapper mysql options 
(host 'backend2', database 'dbt3', user 'spider', password 'spider', port 3306);
 
CREATE TABLE orders (
        o_orderkey BIGINT,
        o_custkey INTEGER,
        o_orderstatus CHAR(1),
        o_totalprice DECIMAL(10,2),
        o_orderDATE DATE,
        o_orderpriority CHAR(15),
        o_clerk CHAR(15),
        o_shippriority INTEGER,
        o_comment VARCHAR(79),
        PRIMARY KEY (o_orderkey)
)
ENGINE=spider
COMMENT='wrapper "mysql", table "orders"'
PARTITION BY HASH(o_orderkey) (
  PARTITION pt1 COMMENT = 'srv "backend1"',
  PARTITION pt2 COMMENT = 'srv "backend2"'
);

Generated at Thu Feb 08 08:00:30 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.