[MDEV-28395] LOAD DATA transfer plugins Created: 2022-04-22  Updated: 2024-01-22

Status: Open
Project: MariaDB Server
Component/s: Server
Fix Version/s: None

Type: New Feature Priority: Major
Reporter: Sergei Golubchik Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Blocks
blocks MCOL-5013 Support Load data from AWS S3 : UDF... Closed
Relates
relates to MDEV-33188 Enhance mariadb-dump and mariadb-impo... Open
relates to MXS-4618 Load data from S3 Closed

 Description   

LOAD DATA currently supports reading from files and from the client (LOAD DATA LOCAL).

There's a certain interest in the ability to load data from more sources, in particular from AWS S3, but also from http[s]. This could also enable us to handle compressed files that use any of the compression formats the server supports.

We'll solve it by abstracting the file reading code into a plugin. Initially there will be two plugins, "file" and "local".

The syntax

LOAD { DATA | XML } [ LOCAL ] INFILE ...

will be generalized to

LOAD { DATA | XML } [ plugin ] INFILE ...
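
One way to picture the generalized syntax is as a lookup from the optional keyword to a reader implementation. The sketch below is Python for illustration only, not server code. The registry, the function names, and the fallback to "file" when no keyword is given are all assumptions, and the "local" reader is a stub because client-side transfer is not modelled here.

```python
from typing import Callable, Dict, Iterator, Optional

# Hypothetical interface: a reader takes the INFILE string and yields raw byte chunks.
Reader = Callable[[str], Iterator[bytes]]

# Hypothetical registry keyed by the (case-insensitive) plugin keyword.
READERS: Dict[str, Reader] = {}


def register(name: str):
    """Register a reader under the keyword used in LOAD DATA <keyword> INFILE."""
    def wrap(fn: Reader) -> Reader:
        READERS[name.lower()] = fn
        return fn
    return wrap


@register("file")
def read_server_file(path: str) -> Iterator[bytes]:
    # Read a file on the server side in fixed-size chunks.
    with open(path, "rb") as f:
        while chunk := f.read(65536):
            yield chunk


@register("local")
def read_from_client(path: str) -> Iterator[bytes]:
    # Stub: the real LOCAL transfer streams data over the client connection.
    raise NotImplementedError("client-side transfer is not modelled in this sketch")


def resolve_reader(plugin: Optional[str]) -> Reader:
    """Map the optional keyword to a reader; assumes no keyword means 'file'."""
    name = (plugin or "file").lower()
    try:
        return READERS[name]
    except KeyError:
        raise ValueError(f"unknown LOAD DATA transfer plugin: {name}") from None
```

Under this model an S3 or HTTP source would only need to register another reader; the LOAD statement itself would not change.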

We might need some kind of plugin-specific syntax extension for LOAD, so that an AWS plugin would be able to specify the credentials. Or maybe not, if everything can be part of the "filename", as in http://user:password@host.name/path/to/file
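
To illustrate the "everything in the filename" option, here is a minimal Python sketch of how a plugin could pull credentials out of a URL-style source string. The function name `split_source` and the returned dictionary layout are hypothetical. The parsing itself uses the standard `urllib.parse.urlsplit`.

```python
from urllib.parse import unquote, urlsplit


def split_source(filename: str) -> dict:
    """Split a URL-style INFILE string into the parts a transfer plugin would need."""
    parts = urlsplit(filename)
    return {
        "scheme": parts.scheme,  # empty string for a plain path
        # Credentials are percent-decoded so that characters such as '/' or '@'
        # can be carried in the URL in encoded form.
        "user": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
        "host": parts.hostname,
        "path": parts.path,
    }


info = split_source("http://user:password@host.name/path/to/file")
print(info["user"], info["host"], info["path"])
```

A plain path such as /tmp/data.csv comes back with an empty scheme and no credentials, so the same string could still select the existing file behaviour. One drawback of this approach is that credentials embedded in the statement text may end up in logs, which is an argument for a separate syntax extension.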

Preferably it should work for SELECT ... INTO OUTFILE too.



 Comments   
Comment by Sergei Golubchik [ 2024-01-22 ]

Strictly speaking, there are three kinds of actions here.

  • fetching the raw data from somewhere, file, client, aws, http, whatever
    • input: source specification, e.g. url
    • output: data stream
  • filtering the data (e.g. uncompress)
    • input: raw data
    • output: raw data
  • parsing the data, e.g. XML, CSV, etc.
    • input: raw data
    • output: column values
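
The three kinds of actions listed above compose naturally as a chain of stages. The following Python sketch shows that shape only: the stage names are hypothetical, the fetch stage reads from an in-memory buffer instead of a real source, and gzip and CSV are picked as examples of a filter and a parser.

```python
import csv
import gzip
import io
import zlib
from typing import Iterable, Iterator, List


def fetch(source: bytes, chunk_size: int = 16) -> Iterator[bytes]:
    """Fetch stage: source specification in, data stream out.

    Stand-in for a transfer plugin; here the 'source' is just bytes in memory.
    """
    for i in range(0, len(source), chunk_size):
        yield source[i:i + chunk_size]


def gunzip(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Filter stage: raw data in, raw data out (incremental gzip decompression)."""
    # Adding 16 to the window size tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(zlib.MAX_WBITS | 16)
    for chunk in chunks:
        out = decompressor.decompress(chunk)
        if out:
            yield out
    tail = decompressor.flush()
    if tail:
        yield tail


def parse_csv(chunks: Iterable[bytes]) -> Iterator[List[str]]:
    """Parse stage: raw data in, column values out.

    For brevity this buffers the whole input before parsing; a real parser
    would work incrementally.
    """
    text = io.TextIOWrapper(io.BytesIO(b"".join(chunks)), encoding="utf-8", newline="")
    yield from csv.reader(text)


compressed = gzip.compress(b"1,alice\n2,bob\n")
rows = list(parse_csv(gunzip(fetch(compressed))))
print(rows)  # [['1', 'alice'], ['2', 'bob']]
```

Because the fetch and filter stages both produce raw data, filters can be stacked or omitted without the parser noticing, which is what would let one decompression filter serve every transfer plugin.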
Generated at Thu Feb 08 10:00:24 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.