I am currently working on a tool to translate machine learning models (built with scikit-learn python library) into a SQL code that can produce the model predictions when executed. The tool works by translating the model into a "sequence" of CTEs. It supports the main relational databases available on the market (sqlite, postgresql, MSSql, Oracle). Some demo is available here (https://github.com/antoinecarme/sklearn2sql-demo/) with sample SQL codes.
MySQL used to not support CTEs. I see that MariaDB is adding support for CTEs and I will be glad to use it and help with examples. This is very good news!!!
I am giving it a try. I use the debian sid (x64) 10.2.4 version and already read some the bug reports filed recently about CTEs.
I use a set of automatic tests that run on all supported databases. I will only report when MariaDB is the only failing case.
So far, a lot of the tested models generate SQL that executes OK on MariaDB which is great!! However, I still have some issues with some complex models (>10 CTEs), I can provide a list.
For example, the AdaBoostClassifier models produce some SQL code that works on postgresql/oracle/MSSQL/SQlite but not on MariaDB. I get a message
"Table \'db.AdaBoost_AVG_Scores\' doesn\'t exist"
AdaBoost_AVG_Scores is a CTE defined in the query. It seems to be a lookup issue for CTEs. the SQL code is available here :
the SQL code is based on a table INPUT_DATA that can be defined this way :
create table `INPUT_DATA` AS
select 1 as `KEY`, 0.0 as `Feature_0`, 1.0 as `Feature_1`, 2.0 as `Feature_2`, 3.0 as `Feature_3`
Your help is welcome.
Sorry for the long message. I tried to give a full bug report with the most details.
Thanks in advance