[MDEV-35673] Test coverage for outer references - Jira

Details

Type: Task
Status: Open (View Workflow)
Priority: Critical
Resolution: Unresolved
Fix Version/s: 10.5
Component/s: Data Definition - Procedure, Optimizer, Parser, Prepared Statements, Tests
Labels:
None

Description

This is an umbrella task.
Attached is/will be a file with reasonable coverage for testing outer references.
Linked will be new bugs created by those tests.

Attachments

Issue Links

includes

MDEV-35674 2nd execution outer context crash

Open

MDEV-35675 Wrong result when query run as a prepared statement

Open

MDEV-35676 Wrong result on 2nd execution with outer reference

Open

Activity

Ascending order - Click to sort in descending order

Rex Johnston added a comment - 2025-02-13 23:58 - edited

select * from t1                                             -- #1

where t1c in

  select * from                                              -- #2

    select t2a from t2                                       -- #3

    where t2a in

      select t3a from (select * from t3) table3              -- #4, #5

      where t3a in

        select t4a from t4                                   -- #6

        where t4a < any(select t5c from t5 where t5a > t3a)  -- #7

  ) dt

run as a prepared statement, this causes the server to crash on 2nd execution.
During JOIN::prepare on select #7 (innermost one) we run fix_fields on our where clause (t5a>t3a).
(re)fixing t5a, we attempt name resolution, search the fields context for the table containing the
field t3a. During the 1st execution, we "saved" the TABLE_LIST pointing to "table3" in item->cached_table,
so we search this table reference, finding our field.
During the first execution, after resolving the field t3a we need to mark as dependent, all the selects
between where the object is used and where the object is resolved. Prior to optimization, this is
select #7 (used) and select #4 (resolved).
During optimization in the 1st execution, selects # 1, 3, 4 & 5 are all collapsed into a semijoin

select t1.t1a AS t1a,t1.t1b AS t1b,t1.t1c AS t1c from ((((t4) join (t3)) join t2)) join t1 where t1.t1c in (subquery#2)

Our table list pointer has a select_lex from select #1
as we are fixing an outer reference, we now mark all the selects from where t3a is used (select #7) to where it is found (select #1)
as dependent in mark_select_range_as_dependent().
We start with select #7 and mark the containing item's used_tables_cache with OUTER_REF_TABLE_BIT and iterate outward.
The next most outwards using this->context.outer_select() is select #6. This is marked as above.
The next most outwards using this->context.outer_select() is select #4. This is marked as above.
The next most outwards using this->context.outer_select() is select #3.
this is a merged derived table dt, it has been merged into select #2, master_unit()->item is null
there is no containing item for the unit, so we crash trying to follow a null pointer.

The simplest fix for this is to encapsulate context->select_lex and use accessor functions that skip merged selects.

Rex Johnston added a comment - 2025-02-13 23:58 - edited select * from t1 -- #1 where t1c in ( select * from -- #2 ( select t2a from t2 -- #3 where t2a in ( select t3a from ( select * from t3) table3 -- #4, #5 where t3a in ( select t4a from t4 -- #6 where t4a < any ( select t5c from t5 where t5a > t3a) -- #7 ) ) ) dt ) run as a prepared statement, this causes the server to crash on 2nd execution. During JOIN::prepare on select #7 (innermost one) we run fix_fields on our where clause (t5a>t3a). (re)fixing t5a, we attempt name resolution, search the fields context for the table containing the field t3a. During the 1st execution, we "saved" the TABLE_LIST pointing to "table3" in item->cached_table, so we search this table reference, finding our field. During the first execution, after resolving the field t3a we need to mark as dependent, all the selects between where the object is used and where the object is resolved. Prior to optimization, this is select #7 (used) and select #4 (resolved). During optimization in the 1st execution, selects # 1, 3, 4 & 5 are all collapsed into a semijoin select t1.t1a AS t1a,t1.t1b AS t1b,t1.t1c AS t1c from ((((t4) join (t3)) join t2)) join t1 where t1.t1c in (subquery#2) Our table list pointer has a select_lex from select #1 as we are fixing an outer reference, we now mark all the selects from where t3a is used (select #7) to where it is found (select #1) as dependent in mark_select_range_as_dependent(). We start with select #7 and mark the containing item's used_tables_cache with OUTER_REF_TABLE_BIT and iterate outward. The next most outwards using this->context.outer_select() is select #6. This is marked as above. The next most outwards using this->context.outer_select() is select #4. This is marked as above. The next most outwards using this->context.outer_select() is select #3. this is a merged derived table dt, it has been merged into select #2, master_unit()->item is null there is no containing item for the unit, so we crash trying to follow a null pointer. The simplest fix for this is to encapsulate context->select_lex and use accessor functions that skip merged selects.

Rex Johnston added a comment - 2025-02-14 02:08

by skipping over the merged select, we avoid the above problem, but now we see the exact same issue with this select

select * from t6, v1 where t6a=v1a and t6b in                     -- select 1

  select sqrt(v2b) from v2 where                                  -- select 2

    select max(t4a) from t4, t5                                   -- select 3

    where t5a = t4a and t4c < v1b and t5b < v2b

  ) <> 0

2nd execution, we are resolving v2b in the where clause from select #3.
We are marking the select #3 to select #2 as dependent.
During the first execution, the view v2 is merged into the select #1.
Select #2 is merged into select #1, which ends up looking like this

select t6.t6a AS t6a,t6.t6b AS t6b,t6.t6c AS t6c,t7.t7a AS v1a,t7.t7b AS v1b

from t6 semi join (t7 join t5) join t7

where t6.t6a = t7.t7a and 1 and

  (subquery#3) <> 0 and

  t6b = sqrt(v2b) and t7.t7c > 4 and t7.t7a = t5.t5a and t7.t7c > 4

we see that skipping over select #2 we land on select #1 for marking as dependent.
This makes little sense as it is the outermost query with no containing Item.
We crash in the same place as above.

We fix this by advancing the last select lex (in the above case select #2) along its
merged_into path (to select #1).

This causes the above query to now crash in Item_field::fix_fields, whilst fixing the same field.
As we have merged everything up to the top level, our last_checked_context is pointing
at select #1. We need a check to ensure we do not execute this loop more than once if we are
already at the top select

  for (;

       outer_context;

       outer_context= outer_context->get_outer_context())

+   if (outer_context->get_select_lex() == select)

+     break;

Rex Johnston added a comment - 2025-02-14 02:08 by skipping over the merged select, we avoid the above problem, but now we see the exact same issue with this select select * from t6, v1 where t6a=v1a and t6b in -- select 1 ( select sqrt(v2b) from v2 where -- select 2 ( select max (t4a) from t4, t5 -- select 3 where t5a = t4a and t4c < v1b and t5b < v2b ) <> 0 ) 2nd execution, we are resolving v2b in the where clause from select #3. We are marking the select #3 to select #2 as dependent. During the first execution, the view v2 is merged into the select #1. Select #2 is merged into select #1, which ends up looking like this select t6.t6a AS t6a,t6.t6b AS t6b,t6.t6c AS t6c,t7.t7a AS v1a,t7.t7b AS v1b from t6 semi join (t7 join t5) join t7 where t6.t6a = t7.t7a and 1 and (subquery#3) <> 0 and t6b = sqrt(v2b) and t7.t7c > 4 and t7.t7a = t5.t5a and t7.t7c > 4 we see that skipping over select #2 we land on select #1 for marking as dependent. This makes little sense as it is the outermost query with no containing Item. We crash in the same place as above. We fix this by advancing the last select lex (in the above case select #2) along its merged_into path (to select #1). This causes the above query to now crash in Item_field::fix_fields, whilst fixing the same field. As we have merged everything up to the top level, our last_checked_context is pointing at select #1. We need a check to ensure we do not execute this loop more than once if we are already at the top select for (; outer_context; outer_context= outer_context->get_outer_context()) { + if (outer_context->get_select_lex() == select) + break ;

Rex Johnston added a comment - 2025-02-14 02:50 - edited

select t1a from t1                           -- select 1, parent

where t1a in

  select t2c from t2 where t2a >= some               -- select 2, child

    select t3a from t3 where t3b in                  -- select 3, parent

      select t4a from t4 where t4b > t2c             -- select 4, child

    union

    select t5a from t5 where t5b in                  -- select 5, parent

      select t6a from t6 where t6b in                -- select 6, child, parent

        select t7a from t7 where t7b >= t1c          -- select 7, child

);

we now crash (false assert) in a call from Item_field::fix_outer_field
to mark_as_dependent called with

last_checked_context->get_select_lex() == context->get_select_lex()

while we are calling fix_fields on t1c.

Rex Johnston added a comment - 2025-02-14 02:50 - edited select t1a from t1 -- select 1, parent where t1a in ( select t2c from t2 where t2a >= some -- select 2, child ( select t3a from t3 where t3b in -- select 3, parent ( select t4a from t4 where t4b > t2c -- select 4, child ) union select t5a from t5 where t5b in -- select 5, parent ( select t6a from t6 where t6b in -- select 6, child, parent ( select t7a from t7 where t7b >= t1c -- select 7, child ) ) ) ); we now crash (false assert) in a call from Item_field::fix_outer_field to mark_as_dependent called with last_checked_context->get_select_lex() == context->get_select_lex() while we are calling fix_fields on t1c .

MariaDB Server

Test coverage for outer references

Details

Description

Attachments

Issue Links

Activity

People

Dates

Git Integration