[MDEV-11988] Some problems with the MariaDB parser Created: 2017-02-03  Updated: 2017-05-25  Resolved: 2017-05-25

Status: Closed
Project: MariaDB Server
Component/s: Parser
Fix Version/s: N/A

Type: Task Priority: Minor
Reporter: Olivier Bertrand Assignee: Alexander Barkov
Resolution: Won't Fix Votes: 0
Labels: None


 Description   

The IN where clause fails when used with a user variable. Example:

SET @Y = "'UK','USA'";
SELECT * FROM customer WHERE country in (@Y );

Returns 0 lines. The problem is that MariaDB, wrongly thinking the IN clause contains only one item transforms it in an EQUAL clause that fails.

Comparison operators modified by ALL, ANY, or SOME are only accepted with sub-select's. According to SQL they should by also accepted with list of values. For instance:

SELECT 5 > ANY (4,5,6);

This raises a syntax error; should not.



 Comments   
Comment by Elena Stepanova [ 2017-02-06 ]

The first result doesn't look wrong to me – naturally it assumes that IN contains one item, as @Y is not a list, it's just a string containing letters, commas, spaces and single-quote marks.
Of course, maybe SQL standard says something different about it, it doesn't always go with "natural".

Anyway, assigning the whole thing to bar as a parser expert for further decisions.

Comment by Alexander Barkov [ 2017-02-06 ]

The first result doesn't look wrong look for me either. I agree with Elena.

Comment by Alexander Barkov [ 2017-02-06 ]

ANY with a list instead of a subquery is not supported yet.

Comment by Olivier Bertrand [ 2017-02-07 ]

In my example, replacing the user variable by its value, the query should be:

SELECT * FROM customer WHERE country in ('UK','USA');

And this should work. However it is transformed in:

SELECT * FROM customer WHERE country = 'UK','USA';

And this fails. It is a parser decision. Regarding variables as one item or replacing them by their value before parsing. The first is logical. The second would be a mean to construct large queries when the number of values is big. But perhaps this could also be done in a procedure.

Comment by Alexander Barkov [ 2017-02-07 ]

Olivier, @Y is an expression, it returns the value of the user variable Y!
It is not a pre-processor substitution directive in any ways!
This is well documented in the manual. See the manual sections about user variables.

The IN list contains expressions. The IN predicant is compared to values in the list. In case of your example, the predicant is compared to the value of @Y. Everything works as expected.

What you're trying to do can be done using dynamic SQL.

In pre-10.2 you can use this:

CREATE OR REPLACE TABLE customer (country TEXT);
INSERT INTO customer VALUES ('UK'),('USA'),('DE');
SET @list="'UK','USA'";
SET @query=CONCAT('SELECT * FROM customer WHERE country in (',@list,')');
PREPARE stmt FROM @query;
EXECUTE stmt;
DEALLOCATE stmt;

Starting from 10.2, dynamic SQL was extended to understand EXECUTE IMMEDIATE, so now there is a simpler way:

CREATE OR REPLACE TABLE customer (country TEXT);
INSERT INTO customer VALUES ('UK'),('USA'),('DE');
SET @list="'UK','USA'";
EXECUTE IMMEDIATE CONCAT('SELECT * FROM customer WHERE country in (',@list,')');

Comment by Olivier Bertrand [ 2017-02-07 ]

Thank you Alexander for these explanations.

My request was related to MDEV-11832. For external tables, WHERE clause using a sub-select cannot be sent to the remote server, which can be a problem with very large tables.

As you explain, this could be done by getting the sub-select result list in a variable:

SELECT GROUP_CONCAT(DISTINCT country SEPARATOR ',') INTO @list FROM sub-select_statement;

Unfortunately, this works only for numeric columns. With string columns, the group_concat function should be added a parameter specifying a quoting character in addition to the separator one.

Is this feasible?

Perhaps this can be also done specifying:

SELECT GROUP_CONCAT(DISTINCT country SEPARATOR '\',\'') INTO @list FROM sub-select_statement;
SET @query=CONCAT('SELECT * FROM customer WHERE country in (\'',@list,'\')');

Comment by Alexander Barkov [ 2017-02-08 ]

You can try the QUOTE() function:
https://mariadb.com/kb/en/mariadb/quote/

Something like this:
SELECT GROUP_CONCAT(DISTINCT QUOTE(country) SEPARATOR ',') FROM customer;

But the problem is that GROUP_CONCAT() is limited to cut the result if it gets longer than group_concat_max_len bytes, which defaults to 1024 bytes.
See here for details:
https://mariadb.com/kb/en/mariadb/group_concat/

So GROUP_CONCAT() is fine for relatively short lists.

You fetch the results of SQL queries from inside the Connect engine plugin, right?
So perhaps you can reset the value of thd->variables.group_concat_max_len to a bigger value before sending this query, and restore it back after.

Note, there is also a limit on the maximum query size:
https://mariadb.com/kb/en/mariadb/server-system-variables/#max_allowed_packet
So before sending the query to a remote server, you'll need to make sure it's not longer than @@max_allowed_packet on this remote server.

It seems that, instead of using GROUP_CONCAT(), it should be a manual loop fetching values from this query:

SELECT DISTINCT QUOTE(country) FROM customer;

collecting the values into a String variable and sending queries to the remote server in parts:
when the length of this String grows close to @@max_allowed_packet of the remote server, it should send the query to the remote server using IN with the currently collected list, then reset the list in the String variable, and continue fetching.

Comment by Alexander Barkov [ 2017-02-08 ]

Olivier, I have a question about quantified comparison operators.

You said this is a standard syntax:

a > ANY(1,2,3);

I could not find any proves that this is a standard query. According to SQL:2011, ANY, SOME and ALL can accept only a <table subquery> as an argument. So the closest SQL:2011 compliant query would be:

a > ANY(VALUES(1,2,3));

Where VALUES(1,2,3) is a subquery covered by this grammar:

<table value constructor> ::= VALUES <row value expression list>

Which standard version do you mean? Can you please quote the relevant grammar?

Thanks!

Generated at Thu Feb 08 07:54:13 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.