MariaDB Server / MDEV-26345

SELECT MIN on Spider table returns more rows than expected

Details

    Description

      How to reproduce:

      SET @@session.spider_same_server_link = ON;
       
       
      GRANT ALL PRIVILEGES ON *.* TO 'spinne'@'127.0.0.1'  IDENTIFIED BY 'Widow2021!';
       
      DROP SERVER data1;
       
      CREATE SERVER IF NOT EXISTS data1
       
      FOREIGN DATA WRAPPER mysql
      OPTIONS(
      HOST '127.0.0.1',
      DATABASE 'test',
      USER 'spinne',
      PORT 3307,
      PASSWORD 'Widow2021!'
      );
       
      DROP DATABASE IF EXISTS auto_test_local;
      DROP DATABASE IF EXISTS auto_test_remote;
       
      CREATE DATABASE auto_test_local;
      USE auto_test_local;
      CREATE TABLE IF NOT EXISTS `sp` (
        `c1` varchar(10) NOT NULL,
        `c2` varchar(17) NOT NULL,
        `c3` datetime NOT NULL DEFAULT '0001-01-01 00:00:00',
        PRIMARY KEY (`c1`,`c2`,`c3`)
      ) ENGINE=SPIDER DEFAULT CHARSET=utf8 COMMENT='wrapper "mariadb", table "sp"'
      PARTITION BY LIST COLUMNS(`c2`)
      (PARTITION `pt1` DEFAULT COMMENT = 'srv "data1"' ENGINE = SPIDER);
       
      CREATE DATABASE auto_test_remote;
      USE auto_test_remote;
      CREATE TABLE IF NOT EXISTS `sp` (
        `c1` varchar(10) NOT NULL,
        `c2` varchar(17) NOT NULL,
        `c3` datetime NOT NULL DEFAULT '0001-01-01 00:00:00',
        PRIMARY KEY (`c1`,`c2`,`c3`)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8
       PARTITION BY RANGE  COLUMNS(`c3`)
      (PARTITION `pmax` VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB);
       
      INSERT INTO auto_test_remote.sp VALUES
      ('00166','1','2020-05-05 00:00:00'),
      ('00166','2','2020-05-03 00:00:00'),
      ('00166','3','2020-05-02 00:00:00'),
      ('00166','4','2020-05-01 00:00:00'),
      ('00166','5','2020-05-06 00:00:00'),
      ('00174','6','2020-05-06 00:00:00'),
      ('00174','7','2020-05-06 00:00:00'),
      ('00174','8','2020-05-04 00:00:00');
       
      SELECT MIN(c2),c1 FROM auto_test_local.sp WHERE c1='00166';
      

      The result for c2 is not correct: the result set contains the same value for c2 five times.

      MariaDB > SELECT MIN(c2),c1 FROM auto_test_local.sp WHERE c1='00166';
      +---------+-------+
      | MIN(c2) | c1    |
      +---------+-------+
      | 1       | 00166 |
      | 1       | 00166 |
      | 1       | 00166 |
      | 1       | 00166 |
      | 1       | 00166 |
      +---------+-------+
      5 rows in set (0.042 sec)
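For reference, the expected semantics can be checked against a plain (non-Spider) table. A sketch using Python's sqlite3 as a stand-in for the data node, with the same table and data as the repro above:

```python
import sqlite3

# Same data as the repro above, loaded into a plain table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sp (c1 TEXT, c2 TEXT, c3 TEXT)")
con.executemany("INSERT INTO sp VALUES (?,?,?)", [
    ('00166', '1', '2020-05-05 00:00:00'),
    ('00166', '2', '2020-05-03 00:00:00'),
    ('00166', '3', '2020-05-02 00:00:00'),
    ('00166', '4', '2020-05-01 00:00:00'),
    ('00166', '5', '2020-05-06 00:00:00'),
    ('00174', '6', '2020-05-06 00:00:00'),
    ('00174', '7', '2020-05-06 00:00:00'),
    ('00174', '8', '2020-05-04 00:00:00'),
])

# Implicit grouping: an aggregate without GROUP BY forms a single
# group, so exactly one row is expected.
rows = con.execute(
    "SELECT MIN(c2), c1 FROM sp WHERE c1='00166'").fetchall()
assert rows == [('1', '00166')]
```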
      

      Attachments

        Issue Links

          Activity

            ycp Yuchen Pei added a comment - - edited

            The idea of adding distinct in the case of implicit grouping (as in nayuta's patch) is incorrect, and here's a counterexample:

            create table t2 (a int, b int, c int, PRIMARY KEY (a, b));
            create table t1 (a int, b int, c int, PRIMARY KEY (a, b)) ENGINE=Spider
            COMMENT='WRAPPER "mysql", srv "srv",TABLE "t2"';
            insert into t2 VALUES (1,4,1), (1,2,2), (2,11,3);
            SELECT MIN(b), a, c FROM t1 WHERE a=1;
            # should be one row, but we get instead
            # MIN(b)	a	c
            # 2	1	2
            # 2	1	1
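
The counterexample's semantics can be checked on a plain table; a sketch using Python's sqlite3 as a stand-in for the data node:

```python
import sqlite3

# The correct answer is a single row; the distinct-based rewrite
# ("select distinct a, c ...") produces two, matching the spurious
# output above.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t2 (a INT, b INT, c INT, PRIMARY KEY (a, b))")
con.executemany("INSERT INTO t2 VALUES (?,?,?)",
                [(1, 4, 1), (1, 2, 2), (2, 11, 3)])

# Implicit grouping: one group, one row, MIN(b) = 2.
correct = con.execute(
    "SELECT MIN(b), a, c FROM t2 WHERE a=1").fetchall()
assert len(correct) == 1 and correct[0][0] == 2

# What the DISTINCT rewrite computes instead: two rows.
rewritten = con.execute(
    "SELECT DISTINCT a, c FROM t2 WHERE a=1").fetchall()
assert len(rewritten) == 2
```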

            The real issue is the spider GBH skipping const items (MIN(b) in the example) and sending something like select distinct a, c from t1 ... to the data node.

            The natural fix would be for the GBH not to exclude const items from the queries sent to the data node (in this test case, that would preserve the implicit grouping in the query sent by spider, correctly causing only one row to be returned). It would also require spider to discard the result from the const select item when storing results into the temp table fields, because the sql layer does not include these fields in the temp table construction, and not discarding would cause a shift to the right.
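A toy sketch (hypothetical Python, positions only) of the shift to the right that not discarding would cause:

```python
# The data node answers "SELECT x, y, C, z", but the temp table built
# by the SQL layer only has fields for (x, y, z): the const item C is
# not among them.
remote_row = ["x_val", "y_val", "const_val", "z_val"]  # x y C z
temp_fields = ["x", "y", "z"]                          # no field for C
const_positions = {2}                                  # index of C

# Naive positional store: everything after C shifts right, so the
# z field ends up holding the const value.
naive = dict(zip(temp_fields, remote_row))
assert naive["z"] == "const_val"  # wrong

# Discarding values that came from const select items realigns the row.
values = [v for i, v in enumerate(remote_row) if i not in const_positions]
correct = dict(zip(temp_fields, values))
assert correct["z"] == "z_val"
```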

            To summarise, spider GBH can fail when selecting auxiliary fields (see the previous comment), when not skipping constants (causing a shift to the right in the results of the tests of this issue, as well as those of MDEV-29502/MDEV-20502), or when skipping constants (the testcase in this comment). These issues are illustrated as follows:

            C: const; x, y, z: original select; a, b: auxiliary;
             
            query->select:         a C b x y C z
            send res join->fields:       x y C z
            temp table->field:     a   b x y   z
            spider execute select: a   b x y   z
            spider store results:  a   b x y   z

            To remove the fragility, it seems that a sensible solution is to make spider send a query as faithful to the original query as possible, i.e.

            spider execute select:       x y C z
            spider store results:        x y   z

            Note also MDEV-33704 where the change was that group by handlers are skipped when all tables are const.

            ycp Yuchen Pei added a comment - - edited

            I have almost implemented the idea in the previous comment at 4a4c29315892c4b9bc95e64e07d2099000949450, which is the parent commit of the current bb-10.5-ycp-mdev-26345-1 branch.

            By "almost" I mean it does 1. skip selecting auxiliary items, and 2. select constant items, and skip storing results from constant items. However, it does 1 and 2 separately, and thus does not cover the case where there is a constant item among the auxiliary items. That is something to address later, if such cases exist.

            The change causes spurious regressions in executing queries with order by and probably group by in the test spider/bugfix.mdev_24517. The real problem is MDEV-29546, where spider could not construct the order by part so that it refers to the correct items. This is because the call to JOIN::set_items_ref_array() replaces the original order-by Items with Item_temptable_fields, which contain no reference to the originals.

            // in do_select():
              if (join->pushdown_query)
              {
                /* Select fields are in the temporary table */
                join->fields= &join->tmp_fields_list1;
                /* Setup HAVING to work with fields in temporary table */
                join->set_items_ref_array(join->items1);
            

            It is tempting to remove this call, but such removal (see bb-10.5-ycp-mdev-26345-1 b4eb1075995027a13c32375d343b1737600bc17c) will cause regression in executing queries with HAVING, e.g. the test sequence.group_by (the sequence SE is the only other user of the group by handler in the mariadb codebase).

            It is easy to get the same failure using spider, like the following test case.

            --disable_query_log
            --disable_result_log
            --source ../../t/test_init.inc
            --enable_result_log
            --enable_query_log
             
            set spider_same_server_link= 1;
            evalp CREATE SERVER srv FOREIGN DATA WRAPPER mysql
            OPTIONS (SOCKET "$MASTER_1_MYSOCK", DATABASE 'test',user 'root');
             
            create table t2 (c int);
            create table t1 (c int) ENGINE=Spider
            COMMENT='WRAPPER "mysql", srv "srv",TABLE "t2"';
            insert into t1 values (1),(3),(5),(7),(9),(11),(13),(15);
             
            # will output empty, instead of expected 8
            select count(c) as d from t1 having d > 5;
             
            drop table t1, t2;
            drop server srv;
             
            --disable_query_log
            --disable_result_log
            --source ../../t/test_deinit.inc
            --enable_result_log
            --enable_query_log
            

            This fails because the call to create_tmp_table creates a tmp table with an Item_temptable_field representing count(c), instead of the original Item_sum_count in HAVING. Spider then stores the result into the temp table fields, including the Item_temptable_field. So without substituting the Item_sum_count in HAVING through the call join->set_items_ref_array(join->items1), the HAVING condition evaluates to false using the default value 0, which is smaller than 5.
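For reference, the intended result of the test query can be checked on a plain table. A sketch with Python's sqlite3 as a stand-in, written with an equivalent subquery because some sqlite versions reject HAVING without GROUP BY:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t1 (c INT)")
con.executemany("INSERT INTO t1 VALUES (?)",
                [(v,) for v in (1, 3, 5, 7, 9, 11, 13, 15)])

# Equivalent to: select count(c) as d from t1 having d > 5
# One implicit group, count(c) = 8, and 8 > 5 keeps the row.
rows = con.execute(
    "SELECT d FROM (SELECT COUNT(c) AS d FROM t1) WHERE d > 5").fetchall()
assert rows == [(8,)]
```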

            To fix this, I can think of several possibilities:

            1. Restore the call to set_items_ref_array, but modify the function to only replace HAVING items and not ORDER BY or GROUP BY items.
            2. Purely spider changes: restore the call to set_items_ref_array, and keep a saved copy of the original ORDER BY and GROUP BY items at optimization stage, and use them to construct queries at the execution stage.
            3. Make the server layer pass the original query to group by handlers. No adding auxiliary fields, no skipping const items in the tmp table, no replacing order by / group by / having items with temp table fields (actually does that mean no temp table at all?)

            The third possibility makes sense at a high level. Once a Pushdown_query with a group by handler is created, the execution is delegated to the GBH: in do_select() once we get in the if (join->pushdown_query) it will return from there.

            Quick update: it looks like possibility 2 is easy to achieve, as one could access the original order item with ORDER::item_ptr, instead of ORDER::item. The idea is implemented in the patch accordingly - currently bb-10.5-ycp-mdev-26345 5c048dd71498cf3ed51b985d81b25a7842e89e5f
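A minimal Python model (hypothetical, not the real server classes) of why ORDER::item_ptr survives the substitution while ORDER::item does not: item goes through a ref-array slot that set_items_ref_array() repoints, whereas item_ptr holds the original item directly.

```python
original = {"name": "t1.c"}          # the original order-by Item
temptable_field = {"name": "tmp.c"}  # the Item_temptable_field

# ORDER::item is an Item** -- an indirection through the ref array.
ref_array = [original]
order = {"item": (ref_array, 0),     # slot that can be repointed
         "item_ptr": original}       # direct, stable handle

# Effect of join->set_items_ref_array(join->items1):
ref_array[0] = temptable_field

arr, idx = order["item"]
assert arr[idx]["name"] == "tmp.c"          # what spider used to read
assert order["item_ptr"]["name"] == "t1.c"  # what possibility 2 reads
```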

            ycp Yuchen Pei added a comment - - edited

            Hi holyfoot, ptal thanks

            0783c158f37 upstream/bb-10.5-ycp-mdev-26345 MDEV-26345 Spider GBH should execute original queries on the data node
            219b64321bc MDEV-32524 [fixup] Fixup of spider mem alloc enums missed in a previous merge
            2efcfbc9bf2 MDEV-26912 Spider: Remove dead code related to Oracle OCI
            

            The first commit is the main patch. The second commit is a fixup. The third commit is a backport of MDEV-26912 to 10.5, see MDEV-34100. It is good to do it now because we are making changes to the general spider code as well as the spider mariadb backend code without making changes in spd_db_oracle.cc accordingly, see also the description in MDEV-34100.

            Result file changes in mdev_20502.result and timestamp.result may look a bit peculiar at first glance. They are caused by the change to no longer skip SELECTing const. In mdev_20502, this means the const subquery items such as (SELECT tbl_a.val+1 FROM tbl_a) are now considered when deciding whether to create a spider gbh, and since subquery items are not supported, the gbh is no longer created. For timestamp, this means the only (const) item (timestamp('2018-06-25' , '10:43:21')) is no longer skipped when constructing the SELECT part of the query, so the following branch, which resulted in a SELECT 1, is no longer entered:

              if (begin == str->length())
              {
                /* no columns */
                if (str->reserve(SPIDER_SQL_ONE_LEN))
                {
                  DBUG_RETURN(HA_ERR_OUT_OF_MEM);
                }
                str->q_append(SPIDER_SQL_ONE_STR, SPIDER_SQL_ONE_LEN);
            


            holyfoot Alexey Botchkov added a comment

            ok to push.
            ycp Yuchen Pei added a comment - - edited

            Thanks for the review - pushed the following to 10.5

            77ed235d50b upstream/bb-10.5-ycp-push 10.5 MDEV-26345 Spider GBH should execute original queries on the data node
            e6daff40e47 MDEV-32524 [fixup] Fixup of spider mem alloc enums missed in a previous merge
            6080e3af19e MDEV-26912 Spider: Remove dead code related to Oracle OCI
            

            From here onwards, the spider gbh should always execute the original query to the data node. That is, the choice is between: EITHER creating the gbh to execute the original query at the data node, OR not creating the gbh at all because it is not expected to perform better than the usual handler.

            Also (copying over from slack), even if spider executes only one query at the data node, for aggregate functions it may read one row at a time without direct aggregate, in which case the gbh could work better than the usual handler. For example, consider the following

            create table t2 (c int);
            create table t1 (c int) ENGINE=Spider
            COMMENT='WRAPPER "mysql", srv "srv",TABLE "t2"';
             
            insert into t1 values (1), (2), (3), (4), (5);
            set spider_direct_aggregate=0;
            select count(*) from t1;
            

            With direct aggregate turned off, rnd_next() is called 5 times to read each of the 5 rows. A storage engine is not guaranteed to know how to do direct aggregate with its usual handler, in which case the group by handler may work better.
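A sketch (an assumed, simplified model) contrasting the two execution shapes for select count(*) from t1 over 5 remote rows:

```python
remote_rows = [(1,), (2,), (3,), (4,), (5,)]

# Without direct aggregate: the usual handler fetches row by row via
# rnd_next() and the SQL layer counts locally.
calls = 0
count = 0
it = iter(remote_rows)
while True:
    calls += 1                     # one rnd_next() per fetch attempt
    row = next(it, None)
    if row is None:                # end of data
        break
    count += 1
assert count == 5 and calls == 6   # 5 rows plus the end-of-file call

# With the group by handler: COUNT(*) runs at the data node and a
# single row comes back.
gbh_result = [(len(remote_rows),)]
assert gbh_result == [(5,)]
```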


            People

              ycp Yuchen Pei
              Richard Richard Stracke
              Votes: 0
              Watchers: 9

