[MDEV-27201] Non-merged derived table: do not compute unused fields - Jira

XML

Word

Printable

(Filing this based on my observations from analyzing queries at a prospect considering migration to MariaDB)

The problem: If a non-merged derived table has unused columns, they are still created and computed.

Creating a column might not be a big deal
Computing a column can cause some unused overhead
The worst part is that Table Elimination cannot eliminate the parts of query plan that compute useless information.

Testcase:

create table t1 (

  group_id int,

  a int,

  b int,

  c int

);

insert into t1 select  seq/100, seq, seq, seq from seq_1_to_10000;

create table t2 (a int);

insert into t2 values (1),(2),(3);

explain

select

  T.group_id,

  T.MAXA

from

t2,

  (select

     group_id,

     max(a) as MAXA,

     sum(b) as SUMB

   from

t1

   group by group_id

  ) as T

where

 T.group_id = t2.a;

+------+-------------+------------+------+---------------+------+---------+-----------+-------+---------------------------------+

| id   | select_type | table      | type | possible_keys | key  | key_len | ref       | rows  | Extra                           |

+------+-------------+------------+------+---------------+------+---------+-----------+-------+---------------------------------+

|    1 | PRIMARY     | t2         | ALL  | NULL          | NULL | NULL    | NULL      | 3     | Using where                     |

|    1 | PRIMARY     | <derived2> | ref  | key0          | key0 | 5       | test.t2.a | 100   |                                 |

|    2 | DERIVED     | t1         | ALL  | NULL          | NULL | NULL    | NULL      | 10000 | Using temporary; Using filesort |

+------+-------------+------------+------+---------------+------+---------+-----------+-------+---------------------------------+

Note that sum(b) as SUMB is not used.

Put breakpoints in

one can still see that the column for SUM() is created and the value is computed.

How it affects table elimination

Add another table into the subquery:

create table t3 (

  pk int primary key,

  col int

);

insert into t3 select  seq, seq from seq_1_to_10000;

explain

select

  T.group_id,

  T.MAXA

from

t2,

  (select

     group_id,

     max(a) as MAXA,

     sum(b) as SUMB

   from

     t1 left join t3 on t3.pk=t1.c

   group by group_id

  ) as T

where

 T.group_id = t2.a;

+------+-------------+------------+------+---------------+------+---------+-----------+------+---------------------------------+

| id   | select_type | table      | type | possible_keys | key  | key_len | ref       | rows | Extra                           |

+------+-------------+------------+------+---------------+------+---------+-----------+------+---------------------------------+

|    1 | PRIMARY     | t2         | ALL  | NULL          | NULL | NULL    | NULL      | 3    | Using where                     |

|    1 | PRIMARY     | <derived2> | ref  | key0          | key0 | 5       | test.t2.a | 99   |                                 |

|    2 | DERIVED     | t1         | ALL  | NULL          | NULL | NULL    | NULL      | 9963 | Using temporary; Using filesort |

+------+-------------+------------+------+---------------+------+---------+-----------+------+---------------------------------+

Ok, t3 is eliminated.

Now, use t3.b in the SUM:

explain

select

  T.group_id,

  T.MAXA

from

t2,

  (select

     group_id,

     max(a) as MAXA,

     sum(t3.col) as SUMB

   from

     t1 left join t3 on t3.pk=t1.c

   group by group_id

  ) as T

where

 T.group_id = t2.a;

+------+-------------+------------+--------+---------------+---------+---------+-----------+------+---------------------------------+

| id   | select_type | table      | type   | possible_keys | key     | key_len | ref       | rows | Extra                           |

+------+-------------+------------+--------+---------------+---------+---------+-----------+------+---------------------------------+

|    1 | PRIMARY     | t2         | ALL    | NULL          | NULL    | NULL    | NULL      | 3    | Using where                     |

|    1 | PRIMARY     | <derived2> | ref    | key0          | key0    | 5       | test.t2.a | 99   |                                 |

|    2 | DERIVED     | t1         | ALL    | NULL          | NULL    | NULL    | NULL      | 9963 | Using temporary; Using filesort |

|    2 | DERIVED     | t3         | eq_ref | PRIMARY       | PRIMARY | 4       | test.t1.c | 1    | Using where                     |

+------+-------------+------------+--------+---------------+---------+---------+-----------+------+---------------------------------+

And now table elimination is unable to remove t3.

relates to

MDEV-26278 Table elimination does not work across derived tables

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.