[MDEV-37936] Followup to MDEV-36321: out_rows for GROUP BY: use of item names? - Jira

XML

Word

Printable

Details

Type: Task
Status: In Review (View Workflow)
Priority: Major
Resolution: Unresolved
Fix Version/s: 13.0
Component/s: None
Labels:
None

Sprint:
Q4/2025 Server Development

Description

Filing this question as a separate task:

https://jira.mariadb.org/browse/MDEV-36321?focusedCommentId=315191&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-315191

fixVersion will be determined once we understand the problem/fix

TODO

Testcase in the comment doesn't show anything

Quoting that comment:

select * from
t2,
(
  select max(value), grp_id from t1 group by grp_id
  union all
  select max(value), grp_id from t1 group by grp_id
)  dt
where t2.a= dt.grp_id;
here, we would expect records per key to be 2, but the 2nd grp_id within the union gets it's own name instance.

Trying to reproduce :

create table t1 (

  grp_id int,

  value int,

  index (grp_id)

);

insert into t1 select

  A.seq, B.seq

from

  seq_1_to_10000 A,

  seq_1_to_100 B

create table t2 (a int);

insert into t2 select seq from seq_1_to_5;

analyze table t1,t2;

explain

select * from

t2,

  select max(value), grp_id from t1 group by grp_id

  union all

  select max(value), grp_id from t1 group by grp_id

)  DT

where t2.a= DT.grp_id;

And I get (without the fix, tip cset 049ee29e7e28c1c7c3c41638adc54efa2cb10c2a):

+------+-------------+------------+------+---------------+------+---------+-----------+--------+---------------------------------+

| id   | select_type | table      | type | possible_keys | key  | key_len | ref       | rows   | Extra                           |

+------+-------------+------------+------+---------------+------+---------+-----------+--------+---------------------------------+

|    1 | PRIMARY     | t2         | ALL  | NULL          | NULL | NULL    | NULL      | 5      | Using where                     |

|    1 | PRIMARY     | <derived2> | ref  | key0          | key0 | 5       | test.t2.a | 2      |                                 |

|    2 | DERIVED     | t1         | ALL  | NULL          | NULL | NULL    | NULL      | 998412 | Using temporary; Using filesort |

|    3 | UNION       | t1         | ALL  | NULL          | NULL | NULL    | NULL      | 998412 | Using temporary; Using filesort |

+------+-------------+------------+------+---------------+------+---------+-----------+--------+---------------------------------+

4 rows in set (0.001 sec)

so, rec_per_key=2 ?

Fixing the testcase

create table t1a(

  grp_id_2 int,

  value int,

  index (grp_id_2)

);

insert into t1a select * from t1;

analyze table t1a;

explain

select * from

t2,

  select max(value), grp_id from t1 group by grp_id

  union all

  select max(value), grp_id_2 from t1a group by grp_id_2

)  DT

where t2.a= DT.grp_id;

gives

+------+-------------+------------+------+---------------+------+---------+-----------+--------+---------------------------------+

| id   | select_type | table      | type | possible_keys | key  | key_len | ref       | rows   | Extra                           |

+------+-------------+------------+------+---------------+------+---------+-----------+--------+---------------------------------+

|    1 | PRIMARY     | t2         | ALL  | NULL          | NULL | NULL    | NULL      | 5      | Using where                     |

|    1 | PRIMARY     | <derived2> | ref  | key0          | key0 | 5       | test.t2.a | 200    |                                 |

|    2 | DERIVED     | t1         | ALL  | NULL          | NULL | NULL    | NULL      | 998412 | Using temporary; Using filesort |

|    3 | UNION       | t1a        | ALL  | NULL          | NULL | NULL    | NULL      | 998412 | Using temporary; Using filesort |

+------+-------------+------------+------+---------------+------+---------+-----------+--------+---------------------------------+

rows=200 while it should be rows=2.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

mdev37936-psergey-use-column-number.diff
5 kB
2025-10-29 19:42

Issue Links

relates to

MDEV-36321 Indexes on derived tables with GROUP BY produce wrong out_rows estimates

Closed

Activity

People

Assignee:: Sergei Petrunia

Reporter:: Sergei Petrunia

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2025-10-23 14:42

Updated:: 2025-12-23 18:14

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.