MariaDB Server / MDEV-26345

SELECT MIN on Spider table returns more rows than expected

Details

    Description

      How to reproduce:

      SET @@session.spider_same_server_link = ON;
       
       
      GRANT ALL PRIVILEGES ON *.* TO 'spinne'@'127.0.0.1'  IDENTIFIED BY 'Widow2021!';
       
      DROP SERVER data1;
       
      CREATE SERVER IF NOT EXISTS data1
       
      FOREIGN DATA WRAPPER mysql
      OPTIONS(
      HOST '127.0.0.1',
      DATABASE 'test',
      USER 'spinne',
      PORT 3307,
      PASSWORD 'Widow2021!'
      );
       
      DROP DATABASE IF EXISTS auto_test_local;
      DROP DATABASE IF EXISTS auto_test_remote;
       
      CREATE DATABASE auto_test_local;
      USE auto_test_local;
      CREATE TABLE IF NOT EXISTS `sp` (
        `c1` varchar(10) NOT NULL,
        `c2` varchar(17) NOT NULL,
        `c3` datetime NOT NULL DEFAULT '0001-01-01 00:00:00',
        PRIMARY KEY (`c1`,`c2`,`c3`)
      ) ENGINE=SPIDER DEFAULT CHARSET=utf8 COMMENT='wrapper "mariadb", table "sp"'
      PARTITION BY LIST COLUMNS(`c2`)
      (PARTITION `pt1` DEFAULT COMMENT = 'srv "data1"' ENGINE = SPIDER);
       
      CREATE DATABASE auto_test_remote;
      USE auto_test_remote;
      CREATE TABLE IF NOT EXISTS `sp` (
        `c1` varchar(10) NOT NULL,
        `c2` varchar(17) NOT NULL,
        `c3` datetime NOT NULL DEFAULT '0001-01-01 00:00:00',
        PRIMARY KEY (`c1`,`c2`,`c3`)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8
       PARTITION BY RANGE  COLUMNS(`c3`)
      (PARTITION `pmax` VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB);
       
      INSERT INTO auto_test_remote.sp VALUES
      ('00166','1','2020-05-05 00:00:00'),
      ('00166','2','2020-05-03 00:00:00'),
      ('00166','3','2020-05-02 00:00:00'),
      ('00166','4','2020-05-01 00:00:00'),
      ('00166','5','2020-05-06 00:00:00'),
      ('00174','6','2020-05-06 00:00:00'),
      ('00174','7','2020-05-06 00:00:00'),
      ('00174','8','2020-05-04 00:00:00');
       
      SELECT MIN(c2),c1 FROM auto_test_local.sp WHERE c1='00166';
      

      The result for c2 is not correct: the result set contains the same value for c2 five times.

      MariaDB > SELECT MIN(c2),c1 FROM auto_test_local.sp WHERE c1='00166';
      +---------+-------+
      | MIN(c2) | c1    |
      +---------+-------+
      | 1       | 00166 |
      | 1       | 00166 |
      | 1       | 00166 |
      | 1       | 00166 |
      | 1       | 00166 |
      +---------+-------+
      5 rows in set (0.042 sec)
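For reference, the expected semantics can be checked against a plain (non-Spider) table. A sketch using Python's sqlite3 as a stand-in for the data node, with the same table and data as the repro above:

```python
import sqlite3

# Same data as the repro above, loaded into a plain table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sp (c1 TEXT, c2 TEXT, c3 TEXT)")
con.executemany("INSERT INTO sp VALUES (?,?,?)", [
    ('00166', '1', '2020-05-05 00:00:00'),
    ('00166', '2', '2020-05-03 00:00:00'),
    ('00166', '3', '2020-05-02 00:00:00'),
    ('00166', '4', '2020-05-01 00:00:00'),
    ('00166', '5', '2020-05-06 00:00:00'),
    ('00174', '6', '2020-05-06 00:00:00'),
    ('00174', '7', '2020-05-06 00:00:00'),
    ('00174', '8', '2020-05-04 00:00:00'),
])

# Implicit grouping: an aggregate without GROUP BY forms a single
# group, so exactly one row is expected.
rows = con.execute(
    "SELECT MIN(c2), c1 FROM sp WHERE c1='00166'").fetchall()
assert rows == [('1', '00166')]
```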
      

      Attachments

        Issue Links

          Activity

            ycp Yuchen Pei added a comment - - edited

            The idea of adding distinct in the case of implicit grouping (as in nayuta's patch) is incorrect, and here's a counterexample:

            create table t2 (a int, b int, c int, PRIMARY KEY (a, b));
            create table t1 (a int, b int, c int, PRIMARY KEY (a, b)) ENGINE=Spider
            COMMENT='WRAPPER "mysql", srv "srv",TABLE "t2"';
            insert into t2 VALUES (1,4,1), (1,2,2), (2,11,3);
            SELECT MIN(b), a, c FROM t1 WHERE a=1;
            # should be one row, but we get instead
            # MIN(b)	a	c
            # 2	1	2
            # 2	1	1
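
The counterexample's semantics can be checked on a plain table; a sketch using Python's sqlite3 as a stand-in for the data node:

```python
import sqlite3

# The correct answer is a single row; the distinct-based rewrite
# ("select distinct a, c ...") produces two, matching the spurious
# output above.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t2 (a INT, b INT, c INT, PRIMARY KEY (a, b))")
con.executemany("INSERT INTO t2 VALUES (?,?,?)",
                [(1, 4, 1), (1, 2, 2), (2, 11, 3)])

# Implicit grouping: one group, one row, MIN(b) = 2.
correct = con.execute(
    "SELECT MIN(b), a, c FROM t2 WHERE a=1").fetchall()
assert len(correct) == 1 and correct[0][0] == 2

# What the DISTINCT rewrite computes instead: two rows.
rewritten = con.execute(
    "SELECT DISTINCT a, c FROM t2 WHERE a=1").fetchall()
assert len(rewritten) == 2
```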

            The real issue is the spider GBH skipping const items (MIN(b) in the example) and sending something like select distinct a, c from t1 ... to the data node.

            The natural fix would be for the GBH not to exclude const items from the queries sent to the data node (in this test case, that would preserve the implicit grouping in the query sent by spider, correctly causing only one row to be returned). It would also require spider to discard the result from the const select item when storing results into the temp table fields, because the sql layer does not include these fields in the temp table construction, and not discarding would cause a shift to the right.
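A toy sketch (hypothetical Python, positions only) of the shift to the right that not discarding would cause:

```python
# The data node answers "SELECT x, y, C, z", but the temp table built
# by the SQL layer only has fields for (x, y, z): the const item C is
# not among them.
remote_row = ["x_val", "y_val", "const_val", "z_val"]  # x y C z
temp_fields = ["x", "y", "z"]                          # no field for C
const_positions = {2}                                  # index of C

# Naive positional store: everything after C shifts right, so the
# z field ends up holding the const value.
naive = dict(zip(temp_fields, remote_row))
assert naive["z"] == "const_val"  # wrong

# Discarding values that came from const select items realigns the row.
values = [v for i, v in enumerate(remote_row) if i not in const_positions]
correct = dict(zip(temp_fields, values))
assert correct["z"] == "z_val"
```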

            To summarise, spider GBH can fail when selecting auxiliary fields (see the previous comment), when not skipping constants (causing a shift to the right in the results of the tests of this issue, as well as those of MDEV-29502/MDEV-20502), or when skipping constants (the testcase in this comment). These issues are illustrated as follows:

            C: const; x, y, z: original select; a, b: auxiliary;
             
            query->select:         a C b x y C z
            send res join->fields:       x y C z
            temp table->field:     a   b x y   z
            spider execute select: a   b x y   z
            spider store results:  a   b x y   z

            To remove the fragility, it seems that a sensible solution is to make spider send a query as faithful to the original query as possible, i.e.

            spider execute select:       x y C z
            spider store results:        x y   z

            Note also MDEV-33704 where the change was that group by handlers are skipped when all tables are const.

            ycp Yuchen Pei added a comment - - edited

            I have almost implemented the idea in the previous comment at 4a4c29315892c4b9bc95e64e07d2099000949450, which is the parent commit of the current bb-10.5-ycp-mdev-26345-1 branch.

            By "almost" I mean it does 1. skip selecting auxiliary items, and 2. select constant items, and skip storing results from constant items. However, it does 1 and 2 separately, and thus does not cover the case where there is a constant item among the auxiliary items. That is something to address later, if such cases exist.

            The change causes spurious regressions in executing queries with order by and probably group by in the test spider/bugfix.mdev_24517. The real problem is MDEV-29546, where spider could not construct the order by part so that it refers to the correct items. This is because the call to JOIN::set_items_ref_array() replaces the original order-by Items with Item_temptable_fields, which contain no reference to the originals.

            // in do_select():
              if (join->pushdown_query)
              {
                /* Select fields are in the temporary table */
                join->fields= &join->tmp_fields_list1;
                /* Setup HAVING to work with fields in temporary table */
                join->set_items_ref_array(join->items1);
            

            It is tempting to remove this call, but such removal (see bb-10.5-ycp-mdev-26345-1 b4eb1075995027a13c32375d343b1737600bc17c) will cause regression in executing queries with HAVING, e.g. the test sequence.group_by (the sequence SE is the only other user of the group by handler in the mariadb codebase).

            It is easy to get the same failure using spider, like the following test case.

            --disable_query_log
            --disable_result_log
            --source ../../t/test_init.inc
            --enable_result_log
            --enable_query_log
             
            set spider_same_server_link= 1;
            evalp CREATE SERVER srv FOREIGN DATA WRAPPER mysql
            OPTIONS (SOCKET "$MASTER_1_MYSOCK", DATABASE 'test',user 'root');
             
            create table t2 (c int);
            create table t1 (c int) ENGINE=Spider
            COMMENT='WRAPPER "mysql", srv "srv",TABLE "t2"';
            insert into t1 values (1),(3),(5),(7),(9),(11),(13),(15);
             
            # will output empty, instead of expected 8
            select count(c) as d from t1 having d > 5;
             
            drop table t1, t2;
            drop server srv;
             
            --disable_query_log
            --disable_result_log
            --source ../../t/test_deinit.inc
            --enable_result_log
            --enable_query_log
            

            This fails because the call to create_tmp_table creates a tmp table with an Item_temptable_field representing count(c), instead of the original Item_sum_count in HAVING. Spider then stores the result into the temp table fields, including the Item_temptable_field. So without substituting the Item_sum_count in HAVING through the call join->set_items_ref_array(join->items1), the HAVING condition evaluates to false using the default value 0, which is smaller than 5.
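For reference, the intended result of the test query can be checked on a plain table. A sketch with Python's sqlite3 as a stand-in, written with an equivalent subquery because some sqlite versions reject HAVING without GROUP BY:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t1 (c INT)")
con.executemany("INSERT INTO t1 VALUES (?)",
                [(v,) for v in (1, 3, 5, 7, 9, 11, 13, 15)])

# Equivalent to: select count(c) as d from t1 having d > 5
# One implicit group, count(c) = 8, and 8 > 5 keeps the row.
rows = con.execute(
    "SELECT d FROM (SELECT COUNT(c) AS d FROM t1) WHERE d > 5").fetchall()
assert rows == [(8,)]
```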

            To fix this, I can think of several possibilities:

            1. Restore the call to set_items_ref_array, but modify the function to only replace HAVING items and not ORDER BY or GROUP BY items.
            2. Purely spider changes: restore the call to set_items_ref_array, and keep a saved copy of the original ORDER BY and GROUP BY items at optimization stage, and use them to construct queries at the execution stage.
            3. Make the server layer pass the original query to group by handlers. No adding auxiliary fields, no skipping const items in the tmp table, no replacing order by / group by / having items with temp table fields (actually does that mean no temp table at all?)

            The third possibility makes sense at a high level. Once a Pushdown_query with a group by handler is created, the execution is delegated to the GBH: in do_select() once we get in the if (join->pushdown_query) it will return from there.

            Quick update: it looks like possibility 2 is easy to achieve, as one could access the original order item with ORDER::item_ptr, instead of ORDER::item. The idea is implemented in the patch accordingly - currently bb-10.5-ycp-mdev-26345 5c048dd71498cf3ed51b985d81b25a7842e89e5f
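A minimal Python model (hypothetical, not the real server classes) of why ORDER::item_ptr survives the substitution while ORDER::item does not: item goes through a ref-array slot that set_items_ref_array() repoints, whereas item_ptr holds the original item directly.

```python
original = {"name": "t1.c"}          # the original order-by Item
temptable_field = {"name": "tmp.c"}  # the Item_temptable_field

# ORDER::item is an Item** -- an indirection through the ref array.
ref_array = [original]
order = {"item": (ref_array, 0),     # slot that can be repointed
         "item_ptr": original}       # direct, stable handle

# Effect of join->set_items_ref_array(join->items1):
ref_array[0] = temptable_field

arr, idx = order["item"]
assert arr[idx]["name"] == "tmp.c"          # what spider used to read
assert order["item_ptr"]["name"] == "t1.c"  # what possibility 2 reads
```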

            ycp Yuchen Pei added a comment - - edited

            Hi holyfoot, ptal thanks

            0783c158f37 upstream/bb-10.5-ycp-mdev-26345 MDEV-26345 Spider GBH should execute original queries on the data node
            219b64321bc MDEV-32524 [fixup] Fixup of spider mem alloc enums missed in a previous merge
            2efcfbc9bf2 MDEV-26912 Spider: Remove dead code related to Oracle OCI
            

            The first commit is the main patch. The second commit is a fixup. The third commit is a backport of MDEV-26912 to 10.5, see MDEV-34100. It is good to do it now because we are making changes to the general spider code as well as the spider mariadb backend code without making changes in spd_db_oracle.cc accordingly, see also the description in MDEV-34100.

            Result file changes in mdev_20502.result and timestamp.result may look a bit peculiar at first glance. They are caused by the change to no longer skip SELECTing const. In mdev_20502, this means the const subquery items such as (SELECT tbl_a.val+1 FROM tbl_a) are now considered when deciding whether to create a spider gbh, and since subquery items are not supported, the gbh is no longer created. For timestamp, this means the only (const) item (timestamp('2018-06-25' , '10:43:21')) is no longer skipped when constructing the SELECT part of the query, so the following branch, which resulted in a SELECT 1, is no longer entered:

              if (begin == str->length())
              {
                /* no columns */
                if (str->reserve(SPIDER_SQL_ONE_LEN))
                {
                  DBUG_RETURN(HA_ERR_OUT_OF_MEM);
                }
                str->q_append(SPIDER_SQL_ONE_STR, SPIDER_SQL_ONE_LEN);
            


            holyfoot Alexey Botchkov added a comment

            ok to push.
            ycp Yuchen Pei added a comment - - edited

            Thanks for the review - pushed the following to 10.5

            77ed235d50b upstream/bb-10.5-ycp-push 10.5 MDEV-26345 Spider GBH should execute original queries on the data node
            e6daff40e47 MDEV-32524 [fixup] Fixup of spider mem alloc enums missed in a previous merge
            6080e3af19e MDEV-26912 Spider: Remove dead code related to Oracle OCI
            

            From here onwards, the spider gbh should always execute the original query to the data node. That is, the choice is between: EITHER creating the gbh to execute the original query at the data node, OR not creating the gbh at all because it is not expected to perform better than the usual handler.

            Also (copying over from slack), even if spider executes only one query at the data node, for aggregate functions it may read one row at a time without direct aggregate, in which case the gbh could work better than the usual handler. For example, consider the following

            create table t2 (c int);
            create table t1 (c int) ENGINE=Spider
            COMMENT='WRAPPER "mysql", srv "srv",TABLE "t2"';
             
            insert into t1 values (1), (2), (3), (4), (5);
            set spider_direct_aggregate=0;
            select count(*) from t1;
            

            With direct aggregate turned off, rnd_next() is called 5 times to read each of the 5 rows. A storage engine is not guaranteed to know how to do direct aggregate with its usual handler, in which case the group by handler may work better.
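A sketch (an assumed, simplified model) contrasting the two execution shapes for select count(*) from t1 over 5 remote rows:

```python
remote_rows = [(1,), (2,), (3,), (4,), (5,)]

# Without direct aggregate: the usual handler fetches row by row via
# rnd_next() and the SQL layer counts locally.
calls = 0
count = 0
it = iter(remote_rows)
while True:
    calls += 1                     # one rnd_next() per fetch attempt
    row = next(it, None)
    if row is None:                # end of data
        break
    count += 1
assert count == 5 and calls == 6   # 5 rows plus the end-of-file call

# With the group by handler: COUNT(*) runs at the data node and a
# single row comes back.
gbh_result = [(len(remote_rows),)]
assert gbh_result == [(5,)]
```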


            People

              ycp Yuchen Pei
              Richard Richard Stracke
              Votes: 0
              Watchers: 9

