The ultimate objetive is, for example, to trigger the execution of a remote Python/R Code for updating a predictive model in keras/h2o/tensorflow/... when new data is introduced into the database, or as part of a SQL procedure.
For using the "CONNECT REMOTE_TABLE" Engine, the user will have to create an empty table as follows:
CREATE [OR REPLACE] [IF NOT EXISTS] TABLE tbl_name_connect
ENGINE = CONNECT
TABLE_TYPE = REMOTE_TABLE
LANGUAGE = {R|Python}
[PARALLEL={0|1}]
[WITH VERSION CONTROL];
2.1.3. Call a Script: SELECT
There will be 3 main ways of calling a R / Python Script from MariaDB:
A. To Import an R / Python Script on the fly an call it from MariaDB Server:
SELECT
LOAD_FILE('/path/.../script_file_name.txt'))
FROM
tbl_name_connect;
B. To write a Script directly in the IDE and call it from MariaDB Server:
SELECT
QUOTE("... [R or Python Script] ...")
FROM
tbl_name_connect;
C. To call a previously saved R / Python Script into tbl_name:
SELECT
(SELECT Col_Script FROM tbl_name_script_repository LIMIT 1;)
FROM
tbl_name_connect;
2.2 Internals
Connect will:
Send the Script Text to the .exe file.
Retrieve a data.frame Table (Last Evaluated Object or return() Object), or retrieve any error / warnings and if the execution was complete.
An example R Script to execute in the RScript.exe, which makes use of an ODBC Connection, would be:
library(tidyverse)
library(DBI)
library(RMariaDB)
con <- dbConnect(RMariaDB::MariaDB(), group = "my-db")
dbWriteTable(con, "mtcars", mtcars)
dbReadTable(con, "mtcars")
dbDisconnect(con)
2.3. Security Issues
This new feature could imply security issues, so:
It shall be disabled by default.
Have its own USER security permission.
MariaDB to include its own trusted distributions of R and Python.
Also another solution would be to have to force the users to store the Python and R .exe directly in the MariaDB installation folders, so that greater care of such executable files can be taken in the system.
3. OPTION B: Through a Stored Procedure
This new feature could imply security issues, so:
It shall be disabled by default.
Have its own USER security permission.
MariaDB to include its own trusted distributions of R and Python.
Also another solution would be to have to force the users to store the Python and R .exe directly in the MariaDB installation folders, so that greater care of such executable files can be taken in the system.
4. Collaboration
I could ask the maintainer of the R "RMariaDB" package (Developed in C++ and R) in case he would be interested in collaborate in this project.
Attachments
Issue Links
causes
MDEV-16453TRANSACT-SQL Port: Call External R & Python Code from MariaDB: Stored Procedure
To: Juan
Can you elaborate by giving examples. For instance what it the script name for as its text is given? Could it be also a script file whose path would be given. SQL server requires the language to be specified, how about CONNECT?
Actually, I am not sure I understand what you're doing. When creating a table, in all DBMSs, the column specifications describe the returned result. This is what SQL Server does. So would all specifications of the script to execute be contained in options, not in columns?
Olivier Bertrand
added a comment - To: Juan
Can you elaborate by giving examples. For instance what it the script name for as its text is given? Could it be also a script file whose path would be given. SQL server requires the language to be specified, how about CONNECT?
Actually, I am not sure I understand what you're doing. When creating a table, in all DBMSs, the column specifications describe the returned result. This is what SQL Server does. So would all specifications of the script to execute be contained in options, not in columns?
As I previously said, in a create table, columns definition is meant to describe the returned result and cannot be used for other purpose.
Another problem is that all this is based on CONNECT executing an exe program. I don't know if it is possible and, even it is, it probably cannot be permitted for security reasons.
Perhaps doing this can be achieve using a procedure, as SQL server does, but I don't see how CONNECT could do it, for instance how to retrieve data from the script?
Olivier Bertrand
added a comment - As I previously said, in a create table, columns definition is meant to describe the returned result and cannot be used for other purpose.
Another problem is that all this is based on CONNECT executing an exe program. I don't know if it is possible and, even it is, it probably cannot be permitted for security reasons.
Perhaps doing this can be achieve using a procedure, as SQL server does, but I don't see how CONNECT could do it, for instance how to retrieve data from the script?
SQL server requires the language to be specified, how about CONNECT?
CONNECT would also need the language to be specified: 'R' or 'Python'
So would all specifications of the script to execute be contained in options, not in columns?
In my example, I've moved to my.ini options some configuration settings for greater sequrity.
As I previously said, in a create table, columns definition is meant to describe the returned result and cannot be used for other purpose.
In SQL Server, it is possible to return a Pandas DataFrame, and in R a data.frame, a tibble, and a data.table I guess. Also, it would be nice to return an empty table only with warnings / errors / etc. for remote execution without retrieving any results.
Another problem is that all this is based on CONNECT executing an exe program. I don't know if it is possible and, even it is, it probably cannot be permitted for security reasons.
It is also a concern for me, how does SQL Server solve it?
for instance how to retrieve data from the script?
Juan Telleria
added a comment - - edited Can you elaborate by giving examples.
Done.
SQL server requires the language to be specified, how about CONNECT?
CONNECT would also need the language to be specified: 'R' or 'Python'
So would all specifications of the script to execute be contained in options, not in columns?
In my example, I've moved to my.ini options some configuration settings for greater sequrity.
As I previously said, in a create table, columns definition is meant to describe the returned result and cannot be used for other purpose.
In SQL Server, it is possible to return a Pandas DataFrame, and in R a data.frame, a tibble, and a data.table I guess. Also, it would be nice to return an empty table only with warnings / errors / etc. for remote execution without retrieving any results.
Another problem is that all this is based on CONNECT executing an exe program. I don't know if it is possible and, even it is, it probably cannot be permitted for security reasons.
It is also a concern for me, how does SQL Server solve it?
for instance how to retrieve data from the script?
We could request collaboration for R maybe to Kirill Müller, creator of package RMariaDB ( https://cran.r-project.org/web/packages/RMariaDB/index.html ), maybe he will be open for collaboration, or at least, give some advice.
Thank you
From your description, this looks like a new feature of MariaDB, as it is for SQL Server. However, why this would be implemented by a storage engine? This request should be directly addressed to MariaDB development. Unless there is something proving that only CONNECT can be used to implement it, I won't study how to add it to CONNECT, because it seems to be too a complicate thing to do, and mainly because CONNECT being now a GA product, adding this would regress it to a beta availability product.
Olivier Bertrand
added a comment - From your description, this looks like a new feature of MariaDB, as it is for SQL Server. However, why this would be implemented by a storage engine? This request should be directly addressed to MariaDB development. Unless there is something proving that only CONNECT can be used to implement it, I won't study how to add it to CONNECT, because it seems to be too a complicate thing to do, and mainly because CONNECT being now a GA product, adding this would regress it to a beta availability product.
Maybe it shall have its own storage engine (Alpha), and for greater sequrity, this storage engine shall come with its own distribution of R / Python for greater safety.
Mmmm... Thank you however, we at least tried
Juan Telleria
added a comment - Maybe it shall have its own storage engine (Alpha), and for greater sequrity, this storage engine shall come with its own distribution of R / Python for greater safety.
Mmmm... Thank you however, we at least tried
Juan Telleria
added a comment - - edited However, why this would be implemented by a storage engine? This request should be directly addressed to MariaDB development.
Have opened a new issue: MDEV-16453
Do you know who in the MariaDB Development Team could reconsider this implementation with another focus?
Could you re-assign the issue (if appropriate)?
Thank you.
Juan
Juan Telleria
added a comment - Do you know who in the MariaDB Development Team could reconsider this implementation with another focus?
Could you re-assign the issue (if appropriate)?
Thank you.
Juan
To: Juan
Can you elaborate by giving examples. For instance what it the script name for as its text is given? Could it be also a script file whose path would be given. SQL server requires the language to be specified, how about CONNECT?
Actually, I am not sure I understand what you're doing. When creating a table, in all DBMSs, the column specifications describe the returned result. This is what SQL Server does. So would all specifications of the script to execute be contained in options, not in columns?