[MDEV-16453] TRANSACT-SQL Port: Call External R & Python Code from MariaDB: Stored Procedure Created: 2018-06-09  Updated: 2018-06-13

Status: Open
Project: MariaDB Server
Component/s: Server
Fix Version/s: None

Type: Task Priority: Major
Reporter: Juan Telleria Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: TRANSACT-SQL

Issue Links:
Problem/Incident
is caused by MDEV-16418 CONNECT Engine - Feature Request: Rem... Closed

 Description   

1. Objective

To give the possibility to call the execution of external:

Scripts directly from MariaDB, just as SQL Server, from 2017 Edition onward, allows to do so:

And PostgreSQL so does for Python:

The ultimate objetive is, for example, to trigger the execution of a remote Python/R Code for updating a predictive model in keras/h2o/tensorflow/... when new data is introduced into the database, or as part of a SQL procedure.

2. Implementation: Using a Stored Procedure

It would be a stored procedure that could come by default in the "mysql" database, just as some other database management procedures come with the database instance.

And it could "copy" how the SQL Server's Stored Procedure works, with MariaDB's particularities:

  1. Copy the Script (TEXT) from MariaDB.
  2. Send it to the trusted .exe of R (RScript.exe) or Python (python.exe)
  3. Return a Pandas DataFrame (Python) or a data.table, data.frame, tibble (R), if anything returned, plus warnings and errors.

3. Security Issues

This new feature could imply security issues, so:

  • It shall be disabled by default.
  • Have its own USER security permission.
  • MariaDB to include its own trusted distributions of R and Python.

Also another solution would be to have to force the users to store the Python and R .exe directly in the MariaDB installation folders, so that greater care of such executable files can be taken in the system.



 Comments   
Comment by Juan Telleria [ 2018-06-09 ]

I included in the issue description links to PostgreSQL implementation, that because of being Open Source it could probably be ported (partially) to MariaDB.

Comment by Sergei Golubchik [ 2018-06-11 ]

You might be interested to look at third-party project https://github.com/mysqludf/lib_mysqludf_sys — in particular at mysql_sys_exec() UDF function.

Comment by Juan Telleria [ 2018-06-11 ]

Could we use this library for developing a solution R and Python specific for MariaDB of which everyone can benefit?

This will allow strong foundations for seamless integration of Machine Learning and Statistical Libraries within MariaDB itself.

Comment by Sergei Golubchik [ 2018-06-13 ]

A solution of which everyone can benefit? Yes, sure.

But to allow users to execute arbitrary commands via shell is a rather big security issue, so we're unlikely to accept this solution as a contribution or distribute it from mariadb.org.

Users can still benefit from it as a third-party solution and you can, of course, list it on the appropriate Knowledge Base pages.

Long term we'd like to look into pluggable languages for stored routines. But it is not a priority for 10.4 at the moment.

Comment by Juan Telleria [ 2018-06-13 ]

Thank you Sergei.

What you stated sounds more than reasonable to me.

As regards pluggable languages for stored routines, maybe it could be left for MariaDB 10.5, but I would definitely include it in the long term roadmap.

Here is the PostgreSQL solution for example:
https://github.com/fdr/pg-python?files=1

Thank you.

Generated at Thu Feb 08 08:29:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.