[MDEV-12440] CTE not found in a complex SQL code Created: 2017-04-04  Updated: 2017-10-31  Resolved: 2017-04-07

Status: Closed
Project: MariaDB Server
Component/s: Optimizer - CTE
Affects Version/s: 10.2.4
Fix Version/s: 10.2.6

Type: Bug Priority: Major
Reporter: CARME Antoine Assignee: Igor Babaev
Resolution: Fixed Votes: 0
Labels: None
Environment:

debian sid package on x64


Issue Links:
Duplicate
duplicates MDEV-13371 Classic CTE query fails in 10.2.2 Closed
Relates
relates to MDEV-14184 recursive CTE not found Closed

 Description   

Dear all,

I am currently working on a tool to translate machine learning models (built with scikit-learn python library) into a SQL code that can produce the model predictions when executed. The tool works by translating the model into a "sequence" of CTEs. It supports the main relational databases available on the market (sqlite, postgresql, MSSql, Oracle). Some demo is available here (https://github.com/antoinecarme/sklearn2sql-demo/) with sample SQL codes.

MySQL used to not support CTEs. I see that MariaDB is adding support for CTEs and I will be glad to use it and help with examples. This is very good news!!!

I am giving it a try. I use the debian sid (x64) 10.2.4 version and already read some the bug reports filed recently about CTEs.

I use a set of automatic tests that run on all supported databases. I will only report when MariaDB is the only failing case.

So far, a lot of the tested models generate SQL that executes OK on MariaDB which is great!! However, I still have some issues with some complex models (>10 CTEs), I can provide a list.

For example, the AdaBoostClassifier models produce some SQL code that works on postgresql/oracle/MSSQL/SQlite but not on MariaDB. I get a message

"Table \'db.AdaBoost_AVG_Scores\' doesn\'t exist"

AdaBoost_AVG_Scores is a CTE defined in the query. It seems to be a lookup issue for CTEs. the SQL code is available here :
https://github.com/antoinecarme/sklearn2sql-demo/blob/master/sample_outputs_round_4/AdaBoostClassifier/iris/mysql/demo1_AdaBoostClassifier_mysql.sql

the SQL code is based on a table INPUT_DATA that can be defined this way :

create table `INPUT_DATA` AS
(
select 1 as `KEY`, 0.0 as `Feature_0`, 1.0 as `Feature_1`, 2.0 as `Feature_2`, 3.0 as `Feature_3`
)

Your help is welcome.

Sorry for the long message. I tried to give a full bug report with the most details.

Thanks in advance

Antoine



 Comments   
Comment by Igor Babaev [ 2017-04-06 ]

The following simple test case demonstrates this problem:

create table t1 (a int, b  varchar(32));
insert into t1 values
  (4,'aaaa' ), (7,'bb'), (1,'ccc'), (4,'dd');  
 
with
cte1 as (select * from t1 where  b >= 'c'),
cte2 as (select * from cte1 where b < 'b'),
cte3 as (select * from cte2 where a > 1 union select * from cte2 where a > 1)
select * from ct3;

As in the reported query the same CTE in used twice in another CTE and the server returns:

MariaDB [test]> with
    -> cte1 as (select * from t1 where  b >= 'c'),
    -> cte2 as (select * from cte1 where b < 'b'),
    -> cte3 as (select * from cte2 where a > 1 union select * from cte2 where a > 1)
    -> select * from cte3;
ERROR 1146 (42S02): Table 'test.cte1' doesn't exist

Here's another query where the same CTE is used twice: In the main query and in a CTE.

ariaDB [test]> with
    -> cte1 as (select * from t1 where  b >= 'c'),
    -> cte2 as (select * from cte1 where a < 7),
    -> cte3 as (select * from cte2 where a > 1)
    -> select * from cte2, cte3 where cte2.a = cte3.a;
ERROR 1146 (42S02): Table 'test.cte1' doesn't exist

Comment by Igor Babaev [ 2017-04-07 ]

The fix for this bug was pushed into the 10.2 tree.

Comment by Igor Babaev [ 2017-04-07 ]

Hi Antoine,

I executed your query after having applied my fix. It returned some result set, but I could not verify the result, because I did know what to expect.

Comment by CARME Antoine [ 2017-04-07 ]

YEEEES!

That's excellent. Thanks a lot Igor Babaev.

Can't wait or testing the 10.2.6 release !!!!

Comment by CARME Antoine [ 2017-04-07 ]

Igor,

I can do some tests and give you my feedback (in particular, I compare all supported databases result sets, should be the same).

Comment by CARME Antoine [ 2017-06-04 ]

Igor,
Just a message to tell you that I made some tests with 10.2.6 release, and this is a full success. Now I get a test report like this

??
TEST_STATUS AdaBoostClassifier_('sqltm', 'sqlite://')_None success None
TEST_STATUS AdaBoostClassifier_('mssql', 'mssql+pymssql://sa:Micro123@SklearnTestDB/db?charset=utf8&timeout=180')_None success None
TEST_STATUS AdaBoostClassifier_('pgsql', 'postgresql://db:db@localhost/db?port=5432')_None success None
TEST_STATUS AdaBoostClassifier_('mysql', 'mysql://db:db@localhost/db')_None success None
TEST_STATUS AdaBoostClassifier_('oracle', 'oracle+cx_oracle://db:dba@xe')_None success None
??

instead of this

??
TEST_STATUS AdaBoostClassifier_('sqltm', 'sqlite://')_None success None
TEST_STATUS AdaBoostClassifier_('mssql', 'mssql+pymssql://sa:Micro123@SklearnTestDB/db?charset=utf8')_None success None
TEST_STATUS AdaBoostClassifier_('pgsql', 'postgresql://db:db@localhost/db?port=5432')_None success None
TEST_STATUS AdaBoostClassifier_('mysql', 'mysql://db:db@localhost/db')_None failure ProgrammingError_mysql_exceptions.ProgrammingError) (1146, "Table 'db.AdaBoost_AVG_Scores' doesn't exist") [SQL: "WITH `DT_node_lookup` AS \n(SELECT `ADS`.`index` AS `index`, CASE WHEN (`ADS`.`XD_10` <= -0.751862764
TEST_STATUS AdaBoostClassifier_('oracle', 'oracle+cx_oracle://db:db@xe')_None success None
??

Good job, guys. You made a happy MariaDB user.

Cheers
Antoine

Generated at Thu Feb 08 07:57:44 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.