Details
-
Task
-
Status: In Progress (View Workflow)
-
Major
-
Resolution: Unresolved
-
None
-
None
-
Q4/2025 Server Development
Description
Filing this question as a separate task:
fixVersion will be determined once we understand the problem/fix
TODO
Testcase in the comment doesn't show anything
Quoting that comment:
select * fromt2,( select max(value), grp_id from t1 group by grp_id union all select max(value), grp_id from t1 group by grp_id) dt where t2.a= dt.grp_id;
here, we would expect records per key to be 2, but the 2nd grp_id within the union gets it's own name instance.
Trying to reproduce :
create table t1 ( |
grp_id int, |
value int, |
index (grp_id) |
);
|
|
insert into t1 select |
A.seq, B.seq
|
from |
seq_1_to_10000 A,
|
seq_1_to_100 B
|
;
|
|
create table t2 (a int); |
insert into t2 select seq from seq_1_to_5; |
|
analyze table t1,t2; |
explain
|
select * from |
t2,
|
(
|
select max(value), grp_id from t1 group by grp_id |
union all |
select max(value), grp_id from t1 group by grp_id |
) DT
|
where t2.a= DT.grp_id; |
And I get (without the fix, tip cset 049ee29e7e28c1c7c3c41638adc54efa2cb10c2a):
+------+-------------+------------+------+---------------+------+---------+-----------+--------+---------------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+------------+------+---------------+------+---------+-----------+--------+---------------------------------+
|
| 1 | PRIMARY | t2 | ALL | NULL | NULL | NULL | NULL | 5 | Using where |
|
| 1 | PRIMARY | <derived2> | ref | key0 | key0 | 5 | test.t2.a | 2 | |
|
| 2 | DERIVED | t1 | ALL | NULL | NULL | NULL | NULL | 998412 | Using temporary; Using filesort |
|
| 3 | UNION | t1 | ALL | NULL | NULL | NULL | NULL | 998412 | Using temporary; Using filesort |
|
+------+-------------+------------+------+---------------+------+---------+-----------+--------+---------------------------------+
|
4 rows in set (0.001 sec)
|
so, rec_per_key=2 ?
Fixing the testcase
create table t1a(
|
grp_id_2 int,
|
value int,
|
index (grp_id_2)
|
);
|
insert into t1a select * from t1;
|
analyze table t1a;
|
explain
|
select * from
|
t2,
|
(
|
select max(value), grp_id from t1 group by grp_id
|
union all
|
select max(value), grp_id_2 from t1a group by grp_id_2
|
) DT
|
where t2.a= DT.grp_id;
|
gives
+------+-------------+------------+------+---------------+------+---------+-----------+--------+---------------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+------------+------+---------------+------+---------+-----------+--------+---------------------------------+
|
| 1 | PRIMARY | t2 | ALL | NULL | NULL | NULL | NULL | 5 | Using where |
|
| 1 | PRIMARY | <derived2> | ref | key0 | key0 | 5 | test.t2.a | 200 | |
|
| 2 | DERIVED | t1 | ALL | NULL | NULL | NULL | NULL | 998412 | Using temporary; Using filesort |
|
| 3 | UNION | t1a | ALL | NULL | NULL | NULL | NULL | 998412 | Using temporary; Using filesort |
|
+------+-------------+------------+------+---------------+------+---------+-----------+--------+---------------------------------+
|
rows=200 while it should be rows=2.
Attachments
Issue Links
- relates to
-
MDEV-36321 Indexes on derived tables with GROUP BY produce wrong out_rows estimates
-
- Closed
-