[MDEV-27625] Spider returns two rows when it should only return one Created: 2022-01-26  Updated: 2022-01-28  Resolved: 2022-01-27

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Spider
Affects Version/s: 10.7, 10.8
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Roel Van de Paar Assignee: Sergei Golubchik
Resolution: Not a Bug Votes: 0
Labels: None

Issue Links:
Relates

 Description   

For this testcase (change port as needed):

INSTALL PLUGIN spider SONAME 'ha_spider.so';
SET SESSION spider_same_server_link=ON;
DROP DATABASE test;
CREATE DATABASE test;
USE test;
CREATE TABLE t_base (c INT) ENGINE=InnoDB;
CREATE TABLE t_link (c INT) ENGINE=Spider COMMENT "HOST '127.0.0.1', PORT '30035', USER 'root', PASSWORD '', DATABASE 'test', TABLE 't_base'"
PARTITION BY RANGE (c)
(PARTITION p1 VALUES LESS THAN (100) ENGINE = Spider,
 PARTITION p2 VALUES LESS THAN MAXVALUE ENGINE = Spider);
INSERT INTO t_link VALUES (1);
SELECT * FROM t_link;

There are two rows when queried (whilst only a single one was inserted!):

10.8.0-opt>SELECT * FROM t_link;
+------+
| c    |
+------+
|    1 |
|    1 |
+------+
2 rows in set (0.002 sec)

In comparison, this similar testcase:

INSTALL PLUGIN spider SONAME 'ha_spider.so';
SET SESSION spider_same_server_link=ON;
DROP DATABASE test;
CREATE DATABASE test;
DROP DATABASE test2;
CREATE DATABASE test2;
USE test;
CREATE TABLE t_base (c INT) ENGINE=InnoDB;
CREATE TABLE test2.t_base2 (c INT) ENGINE=InnoDB;
CREATE TABLE t_link (c INT) ENGINE=SPIDER COMMENT "HOST '127.0.0.1', PORT '30035', USER 'root', PASSWORD ''"
PARTITION BY RANGE (c)
(PARTITION p1 VALUES LESS THAN (100) ENGINE = Spider COMMENT="DATABASE 'test', TABLE 't_base'",
 PARTITION p2 VALUES LESS THAN MAXVALUE ENGINE = Spider COMMENT="DATABASE 'test2', TABLE 't_base2'");
INSERT INTO t_link VALUES (1);
SELECT * FROM t_link;

Yields one row:

10.8.0-opt>SELECT * FROM t_link;
+------+
| c    |
+------+
|    1 |
+------+
1 row in set (0.001 sec)

As discussed with nayuta-yanagisawa, this would seem to be is intended behavior. Spider implicitly assumes that each partition is associated with different tables. However, Spider doesn't check the constraint whilst it should. Likely present in all versions. Proposal to fix in 10.7.



 Comments   
Comment by Sergei Golubchik [ 2022-01-26 ]

Doesn't look like a bug to me. What constraint should Spider check but doesn't?

Comment by Roel Van de Paar [ 2022-01-27 ]

IMHO, Spider should ensure that data is not double-read from the same table irrespective of the table definition. The first resultset is incorrect imho.

10.8.0-dbg>USE test;
Database changed
10.8.0-dbg>CREATE TABLE t_link (c INT) ENGINE=InnoDB
    -> PARTITION BY RANGE (c)
    -> (PARTITION p1 VALUES LESS THAN (100),
    ->  PARTITION p2 VALUES LESS THAN MAXVALUE);
Query OK, 0 rows affected (0.018 sec)
 
10.8.0-dbg>INSERT INTO t_link VALUES (1);
Query OK, 1 row affected (0.004 sec)
 
10.8.0-dbg>SELECT * FROM t_link;
+------+
| c    |
+------+
|    1 |
+------+
1 row in set (0.000 sec)

Comment by Sergei Golubchik [ 2022-01-27 ]

Spider can never know that. One can chain spider tables, or create a view or a MyISAM MERGE table or whatever. There are tons of ways to have two partitions to refer to the same table at the end. Two avoid seeing the same row twice one should not create two partitions that refer to the same table.

Comment by Roel Van de Paar [ 2022-01-28 ]

Understood, thank you!

Generated at Thu Feb 08 09:54:21 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.