[MDEV-23449] alias do not exist and a query do not report an error Created: 2020-08-11  Updated: 2021-06-30  Resolved: 2021-02-01

Status: Closed
Project: MariaDB Server
Component/s: Server
Affects Version/s: 10.1, 10.2, 10.3, 10.4, 10.5
Fix Version/s: 10.2.38, 10.3.29, 10.4.19, 10.5.10

Type: Bug Priority: Major
Reporter: Massimo Assignee: Varun Gupta (Inactive)
Resolution: Fixed Votes: 1
Labels: None

Issue Links:
Problem/Incident
causes MDEV-24779 main.subselect fails in buildbot with... Closed

 Description   

There is a complex query where there is a internal subquery with no alias reported anywhere and there is not error fire once the query run.

Tested a simple case:

MariaDB [test]> select user from mysql.user m where user in (select user from mysql.user m2 group by s.id);
+---------------+
| user          |
+---------------+
| data_engineer |
| ro1           |
 
as you see there is no alias for 's' table.
 
MariaDB [test]> explain extended select user from mysql.user m where user in (se
lect user from mysql.user m2 group by s.id);
+------+--------------+-------------+-------+---------------+---------+---------
+------+------+----------+------------------------------------------------------
--------+
| id   | select_type  | table       | type  | possible_keys | key     | key_len
| ref  | rows | filtered | Extra
        |
+------+--------------+-------------+-------+---------------+---------+---------
+------+------+----------+------------------------------------------------------
--------+
|    1 | PRIMARY      | <subquery2> | ALL   | distinct_key  | NULL    | NULL
| NULL |   20 |   100.00 |
        |
|    1 | PRIMARY      | m           | index | NULL          | PRIMARY | 420
| NULL |   20 |   100.00 | Using where; Using index; Using join buffer (flat, BN
L join) |
|    2 | MATERIALIZED | m2          | index | NULL          | PRIMARY | 420
| NULL |   20 |   100.00 | Using index
        |
+------+--------------+-------------+-------+---------------+---------+---------
+------+------+----------+------------------------------------------------------
--------+
3 rows in set, 1 warning (0.023 sec)
MariaDB [test]> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `mysql`.`m`.`User` AS `user` from `mysql`.`user` `m` semi join (
`mysql`.`user` `m2`) where `mysql`.`m`.`User` = `mysql`.`m2`.`User`
1 row in set (0.001 sec)

So there is no alias and it should fire an error.



 Comments   
Comment by Varun Gupta (Inactive) [ 2020-08-13 ]

Some context:

There are subquery rewrites done mostly for IN/ALL/ANY/EXISTS subquery at the prepare stage.

The rewrites are done in these 2 functions:
1) remove_redundant_subquery_clauses

  • This function removes the GROUP BY clause and DISTINCT from IN/ALL/ANY/EXISTS subquery.
  • This function is called before fix_fields is called by the GROUP BY clause

2) check_and_do_in_subquery_rewrites

  • This function removes the ORDER BY clause from IN/ALL/ANY subquery.
  • This function also picks the strategy for execution of the subquery (materialization or IN->EXISTS conversion)
  • This function is called after fix_fields is called for ORDER BY clause.
Comment by Varun Gupta (Inactive) [ 2020-08-13 ]

Lets take an example

SELECT * FROM t2 WHERE t2.a IN (SELECT t1.b FROM t1 GROUP by t1.a);

First such queries are not allowed with sql_mode=ONLY_FULL_GROUP_BY.
But lets cover the cases when the query is allowed.
First thing we need to do is resolve the GROUP BY clause. This would check the validity of the query and if the query is invalid (like the case in the description), an error will be thrown.

But after the columns are resolved then there is no need of the grouping operation.

So to frame it better:
For a query like:

SELECT t1.b FROM t1 GROUP by t1.a

this means, for each GROUP-BY-group of t1.a get some value of t1.b
This is the semantics of the subquery.

Now extending this for the query

SELECT * FROM t2 WHERE t2.a IN (SELECT t1.b FROM t1 GROUP by t1.a);

for each row of t2:
get a value of t2.a
if none of the GROUP-BY groups have a value of t1.b that's equal to t2.a (no matches ever)
then we must return false.
otherwise, if every row in every group matches, then we must return true.
if there are both matches and non-matches - we may return either true or false
if the table is empty, return false.

so with this the semantics of the query is the same as this query

SELECT * FROM t2 WHERE t2.a IN (SELECT t1.b FROM t1);

Comment by Michael Widenius [ 2021-01-29 ]

Patch ok. I have just asked Varun for a small update to the commit message to make more clear why it's safe to move the code over this if:
mixed_implicit_grouping= false;
if ((~thd->variables.sql_mode & MODE_ONLY_FULL_GROUP_BY) &&
select_lex->with_sum_func && !group_list)
{

Comment by Michael Widenius [ 2021-01-29 ]

ok to push

Generated at Thu Feb 08 09:22:29 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.