[MDEV-24837] HAVING clause yields wrong results in 10.2 and 10.3, fine in 10.1 Created: 2021-02-10  Updated: 2023-11-28

Status: In Review
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.2.26, 10.3.12, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10
Fix Version/s: 10.4, 10.5, 10.6

Type: Bug Priority: Critical
Reporter: Claudio Nanni Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 1
Labels: 11.0-sel

Attachments: Text File testcase-having-bug-10.2onward.txt    

 Description   

As you can see in the attached test case a query behaves differently in 10.1 vs 10.2 and 10.3

10.1

select distinct concat(table1.T1_VAL,table1.T1_VAR) as CC, concat(table1.T1_VAL,table1.T1_VAR) = 'VALB' as chk, group_concat(table2.T3_ID) as id from test.table1 LEFT JOIN (test.table2,test.table3) on concat(T1_VAL,T1_VAR) = table2.T2_VALVAR and table2.T3_ID = table3.T3_ID group by T1_ID having CC = 'VALB';
+------+-----+------+
| CC   | chk | id   |
+------+-----+------+
| VALB |   1 | 1    |
+------+-----+------+
1 row in set (0.00 sec)

10.2 & 10.3

select distinct concat(table1.T1_VAL,table1.T1_VAR) as CC, concat(table1.T1_VAL,table1.T1_VAR) = 'VALB' as chk, group_concat(table2.T3_ID) as id from test.table1 LEFT JOIN (test.table2,test.table3) on concat(T1_VAL,T1_VAR) = table2.T2_VALVAR and table2.T3_ID = table3.T3_ID group by T1_ID having CC = 'VALB';
Empty set (0.01 sec)

– No Results –



 Comments   
Comment by Alice Sherepa [ 2021-02-10 ]

Repeatable on 10.2-10.5, with MyIsam/InnoDB.

create table t1 (t1_id int, t1_val varchar(20) , t1_var varchar(100) );
insert into t1 values (1,'val','a'),(2,'val','b'),(3,'val','c');
 
create table t2 (t2_id int, t2_valvar varchar(20), t3_id int );
insert into t2 values (1,'valb',1);
 
create table t3 (t3_id int primary key, t3_lib varchar(20));
insert into t3 values (1,'test');
 
select distinct concat(t1.t1_val,t1.t1_var) as cc, group_concat(t2.t3_id)
from t1
	left join (t2, t3) on concat(t1_val,t1_var) = t2.t2_valvar and t2.t3_id = t3.t3_id
group by t1_id 
having cc = 'valb';

MariaDB [test]> explain extended select distinct concat(t1.t1_val,t1.t1_var) as cc, group_concat(t2.t3_id) from t1 left join (t2, t3) on concat(t1_val,t1_var) = t2.t2_valvar and  t2.t3_id = t3.t3_id group by t1_id  having cc = 'valb';
+------+-------------+-------+--------+---------------+---------+---------+---------------+------+----------+---------------------------------+
| id   | select_type | table | type   | possible_keys | key     | key_len | ref           | rows | filtered | Extra                           |
+------+-------------+-------+--------+---------------+---------+---------+---------------+------+----------+---------------------------------+
|    1 | SIMPLE      | t1    | ALL    | NULL          | NULL    | NULL    | NULL          | 3    |   100.00 | Using temporary; Using filesort |
|    1 | SIMPLE      | t2    | ALL    | NULL          | NULL    | NULL    | NULL          | 1    |   100.00 | Using where                     |
|    1 | SIMPLE      | t3    | eq_ref | PRIMARY       | PRIMARY | 4       | test.t2.t3_id | 1    |   100.00 | Using index                     |
+------+-------------+-------+--------+---------------+---------+---------+---------------+------+----------+---------------------------------+
3 rows in set, 1 warning (0.008 sec)
 
Note (Code 1003): select distinct concat(`test`.`t1`.`t1_val`,`test`.`t1`.`t1_var`) AS `cc`,group_concat(`test`.`t2`.`t3_id` separator ',') AS `group_concat(t2.t3_id)` from `test`.`t1` left join (`test`.`t2` join `test`.`t3`) on(`test`.`t3`.`t3_id` = `test`.`t2`.`t3_id` and concat(`test`.`t1`.`t1_val`,`test`.`t1`.`t1_var`) = `test`.`t2`.`t2_valvar` and `test`.`t2`.`t3_id` is not null) where 1 group by `test`.`t1`.`t1_id` having 1

Comment by Igor Babaev [ 2021-08-13 ]

Alexander,
I don't think we have to guarantee the same behavior for the queries that are not valid SQL queries. In the above test case grouping is by t1_id while concat(t1.t1_val,t1.t1_var) is not functionally dependent on t1_id. So the query is basically non-deterministic.
A possible work-around is to set all optimizer switches that appeared after 10.1 to 'off'.

Comment by Alexander Keremidarski [ 2021-08-16 ]

Considering that such query is non-deterministic and MariaDB rejects it with sql_mode flag ONLY_FULL_GROUP_BY shouldn't we close MDEV-24837 as "Not a bug"?

Comment by Sergei Golubchik [ 2021-08-23 ]

I don't see why it's non-deterministic yet. what optimizer switch causes it?

Comment by Sergei Petrunia [ 2022-04-25 ]

Igor wrote

In the above test case grouping is by t1_id while concat(t1.t1_val,t1.t1_var) is not functionally dependent on t1_id.

This is correct. So, the query is not accepted with sql_mode=ONLY_FULL_GROUP_BY.
However, the query output is deterministic given MariaDB (or MySQL's) extended group by usage.

Look at the resultset before the grouping:

select
  table1.*,
  table2.*,
  table3.*,
  concat(table1.T1_VAL,table1.T1_VAR) as CC, 
  concat(table1.T1_VAL,table1.T1_VAR) = 'VALB' as chk
from 
  test.table1 
  LEFT JOIN (test.table2,test.table3) on 
    concat(T1_VAL,T1_VAR) = table2.T2_VALVAR and 
    table2.T3_ID = table3.T3_ID;

+-------+--------+--------+-------+-----------+-------+-------+--------+------+------+
| T1_ID | T1_VAL | T1_VAR | T2_ID | T2_VALVAR | T3_ID | T3_ID | T3_LIB | CC   | chk  |
+-------+--------+--------+-------+-----------+-------+-------+--------+------+------+
|     1 | VAL    | A      |  NULL | NULL      |  NULL |  NULL | NULL   | VALA |    0 |
|     2 | VAL    | B      |     1 | VALB      |     1 |     1 | TEST   | VALB |    1 |
|     3 | VAL    | C      |  NULL | NULL      |  NULL |  NULL | NULL   | VALC |    0 |
+-------+--------+--------+-------+-----------+-------+-------+--------+------+------+

The query has

group by T1_ID 
  having CC = 'VALB';

That is, each GROUP BY group has one row, and the HAVING will select one row (and group) with CC='VALB'.

If one doesn't use DISTINCT, it will actually do that:

select
  concat(table1.T1_VAL,table1.T1_VAR) as CC, 
  concat(table1.T1_VAL,table1.T1_VAR) = 'VALB' as chk,
  group_concat(table2.T3_ID) as id 
from 
  test.table1 
  LEFT JOIN (test.table2,test.table3) on 
    concat(T1_VAL,T1_VAR) = table2.T2_VALVAR and 
    table2.T3_ID = table3.T3_ID 
group by T1_ID 
  having CC = 'VALB';

+------+------+------+
| CC   | chk  | id   |
+------+------+------+
| VALB |    1 | 1    |
+------+------+------+

But if I add DISTINCT, the row is gone:

select distinct
  concat(table1.T1_VAL,table1.T1_VAR) as CC, 
  concat(table1.T1_VAL,table1.T1_VAR) = 'VALB' as chk,
  group_concat(table2.T3_ID) as id 
from 
  test.table1 
  LEFT JOIN (test.table2,test.table3) on 
    concat(T1_VAL,T1_VAR) = table2.T2_VALVAR and 
    table2.T3_ID = table3.T3_ID 
group by T1_ID 
  having CC = 'VALB';

Empty set (0.001 sec)

Comment by Oleg Smirnov [ 2022-04-27 ]

Another observation: the following variation of SQL causes server crash:

select concat(t1.T1_VAL,t1.T1_VAR) as CC
from t1 LEFT JOIN (t2, t3) on
    concat(T1_VAL,T1_VAR) = t2.T2_VALVAR and t2.T3_ID = t3.T3_ID;

Comment by Sergei Petrunia [ 2022-04-28 ]

oleg.smirnov, following the discussion on the optimizer call: as the crash is a separate issue, please file a separate self-contained MDEV for it, mention its number here, assign it to yourself.

Comment by Oleg Smirnov [ 2022-04-29 ]

Some other observations so far.
The bug reveals itself only when both conditions are satisfied:

  • group_concat(t2.T3_ID) is present in the field list
  • t2.T3_ID = t3.T3_ID equality is present

select distinct concat(t1.T1_VAL,t1.T1_VAR) as CC, group_concat(t2.T3_ID) as id 
  from t1 LEFT JOIN (t2,t3) on concat(T1_VAL,T1_VAR) = t2.T2_VALVAR and t2.T3_ID = t3.T3_ID  
  group by T1_ID having CC='valb';

After removing any of them:

select distinct concat(t1.T1_VAL,t1.T1_VAR) as CC 
  from t1 LEFT JOIN (t2,t3) on concat(T1_VAL,T1_VAR) = t2.T2_VALVAR and t2.T3_ID = t3.T3_ID  
  group by T1_ID having CC='valb';

select distinct concat(t1.T1_VAL,t1.T1_VAR) as CC, group_concat(t2.T3_ID) as id 
  from t1 LEFT JOIN (t2,t3) on concat(T1_VAL,T1_VAR) = t2.T2_VALVAR   
  group by T1_ID having CC='valb';

the result of SQL is correct. Removing t2.T3_ID = t3.T3_ID equality leads to another execution plan (t3 accessed with "scan" against "eq_ref").

Also the result is correct after modifying the HAVING clause to "HAVING CC='valc'":

select distinct concat(t1.T1_VAL,t1.T1_VAR) as CC, group_concat(t2.T3_ID) as id 
  from t1 LEFT JOIN (t2,t3) on concat(T1_VAL,T1_VAR) = t2.T2_VALVAR and t2.T3_ID = t3.T3_ID  
  group by T1_ID having CC='valc';

Comment by Oleg Smirnov [ 2022-05-19 ]

Applied the Monty's suggested patch and added a test case.

Generated at Thu Feb 08 09:33:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.