[MDEV-30649] ALTER TABLE ENGINE doesn't copy data from SPIDER tables Created: 2023-02-13 Updated: 2023-11-14 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - Spider |
| Affects Version/s: | 10.6.11 |
| Fix Version/s: | 10.6 |
| Type: | Bug | Priority: | Major |
| Reporter: | Vincent Milum Jr | Assignee: | Yuchen Pei |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||
| Description |
|
If I use the CONNECT engine to federate data from another table, and then issue something like the following on the CONNECT table:
When converting from CONNECT to InnoDB, both the schema and table contents will be copied. If I instead use the SPIDER engine to federate the access, only the schema definition is modified with the previous SQL statement, but no data is copied into the local table, instead the table remain empty. My particular intent is to archive legacy data, moving it from one physical server to another after it has aged out. I would simply use the CONNECT engine itself, but it fails to support SETs, ENUMs, and NULLable indexes. |
| Comments |
| Comment by Vincent Milum Jr [ 2023-02-13 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
Going from SPIDER to S3, neither of these work
| |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Yuchen Pei [ 2023-03-01 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
mtr testcase
| |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Yuchen Pei [ 2023-11-08 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
Right ralf.gebhardt, this does remind me of a comment in MDEV-16967 But yeah, the effect of operations on spider tables should be | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Yuchen Pei [ 2023-11-08 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
It's intentionally skipped in the spider code
Trace during the alter: ha_spider::rnd_next_internal > ha_spider::rnd_next > rr_sequential > READ_RECORD::read_record > mysql_alter_table > mysql_execute_command > mysql_parse > do_command > do_handle_one_connection > pfs_spawn_thread Blame gives this 10-year old big commit
Commenting out this check and the alter table does copy the data in | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2023-11-09 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
It's just illogical. if you do ALTER TABLE from Spider to, say, InnoDB and have all the data locally, then all data must've been deleted from remote servers. Spider doesn't create or drop remote tables, it's by design. So it cannot copy all data into the local table with ALTER. I think it would made a lot of sense to have some kind of a special mode where a spider would push down DDLs. When a local CREATE TABLE ENGINE=SPIDER would create a table on the remote server, when a local DROP would drop a remote table too. And in that mode ALTER TABLE should copy the data, indeed. | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vincent Milum Jr [ 2023-11-09 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
Its only illogical until it isn't. By that argument, you shouldn't even be allowed to ALTER TABLE ENGINE on a SPIDER table. And going back to the original example, the data copy works exactly as expected for CONNECT engine. So why one federated engine type, but not another? I would use CONNECT if not for all the other ways its broken (ENUM/SET, etc), and not supporting clustering/sharding. The idea is to archive data, so going from a federated engine like SPIDER directly into the S3 engine, bypassing InnoDB entirely. If going to InnoDB or Aria locally first, then a second ALTER is needed, plus double the temporary storage while the S3 data is generated locally before uploading. The entire point is to have a worker MariaDB node act as a data warehouse archival node, pulling data from a larger cluster, then dumping it into S3 when it hits a legacy "read-only" state. | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Yuchen Pei [ 2023-11-10 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
> I think it would made a lot of sense to have some kind of a special mode where a spider would push down DDLs. When a local CREATE TABLE ENGINE=SPIDER would create a table on the remote server, when a local DROP would drop a remote table too. And in that mode ALTER TABLE should copy the data, indeed. Makes sense. DML operations are unambiguous when operating on a spider So dropping a spider table have two meanings: drop the local table But alter table changing the engine say to innodb is even more And the request is this ticket is asking for yet a 4th meaning: alter In any case first step would be confirming this with some documentation, as | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2023-11-12 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
darkain, I understand your use case, it's a perfectly valid one. I only said that in the current Spider logic ALTER TABLE is the wrong command for that. COPY TABLE would've been a better one, if MariaDB had it. Or this new Spider mode where DDL's would apply to remote tables, which we don't have either. But anyway, of course, there must be some solution for your use case, one way or another. | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Yuchen Pei [ 2023-11-14 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
BTW spider has a udf called spider_copy_tables that copies data https://mariadb.com/kb/en/spider_copy_tables/ All the src and dest tables are on the data nodes. For example, say ta_l is a spider table with two remotes, then SELECT spider_copy_tables('ta_l', '0', '1') copies data from the 0th link to the 1st link. This udf should satisfy the requirement in the ticket, where the If we want to keep the functionality of copying from one possibly {{SELECT spider_copy_tables(t1, t2)} where t1 and t2 could be an arbitrary combination of spider and Special push down DDL mode suffers from ambiguity mentioned in my alter table t1 engine=innodb could mean alter the spider table t1 to an innodb table and move |