Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-34506

2nd execution name resolution problem with pushdown into unions

Details

    Description

      create table t1 (c1 int, c2 int, c3 int);
      insert into t1 values (1,2,3),(4,5,6);
      create table t2 like t1;
      insert into t2 values (7,8,9),(10,11,12);
      create view v1 (f1, f2, f3) as select * from (
        select c1, c2, sum(c3) as s from t1 group by c1, c2 having s > 2
        union
        select c1, c2, sum(c3) as u from t2 group by c1, c2 having u > 3 ) d2;
      prepare stmt from 'select * from v1 where f2 > 0';
      execute stmt;
      execute stmt;
      

      first execution gives the correct answer

      +------+------+------+
      | f1   | f2   | f3   |
      +------+------+------+
      |    1 |    2 |    3 |
      |    4 |    5 |    6 |
      |    7 |    8 |    9 |
      |   10 |   11 |   12 |
      +------+------+------+
      

      2nd execution does not

      ERROR 1356 (HY000): View 'test.v1' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them
      

      I have solved this problem in the linked MDEV, but it will be worth backporting the fix to earlier versions.

      Attachments

        Issue Links

          Activity

            Johnston Rex Johnston added a comment -

            When pushing a condition into all selects of a unit with more than one select, items are renamed so we can re-use the condition being pushed (without needing to clone and rename elements in the condition).

            These names need to be saved and reset for correct name resolution on second execution of prepared statements. Taking advantage of existing mechanisms for resetting item list names (select_lex::save_item_list_names() and mysql_derived_reinit()) we can save and reset the names of only the select_lex's with overwritten names.

            Johnston Rex Johnston added a comment - When pushing a condition into all selects of a unit with more than one select, items are renamed so we can re-use the condition being pushed (without needing to clone and rename elements in the condition). These names need to be saved and reset for correct name resolution on second execution of prepared statements. Taking advantage of existing mechanisms for resetting item list names (select_lex::save_item_list_names() and mysql_derived_reinit()) we can save and reset the names of only the select_lex's with overwritten names.
            Johnston Rex Johnston added a comment -

            Hi igor
            I discovered this when formulating pushdown tests for 31466. I think it needs to be applied to maintained builds.
            Thanks, Rex

            Johnston Rex Johnston added a comment - Hi igor I discovered this when formulating pushdown tests for 31466. I think it needs to be applied to maintained builds. Thanks, Rex
            Johnston Rex Johnston added a comment - - edited

            A step by step analysis of the issue.

            The following SQL statements illustrate the problem here

            create table t1 (c1 int, c2 int, c3 int);
            insert into t1 values (1,2,3),(4,5,6);
            create table t2 (c4 int, c5 int, c6 int);
            insert into t2 values (7,8,9),(10,11,12);
            prepare stmt from
            'analyze format=json
              select * from
              (
                select c1, c2, sum(c3) as s from t1 group by c1, c2 having s > 2
                union
                select c4, c5, sum(c6) as u from t2 group by c4, c5 having u > 3
              ) dt
              where c2 > 0';
            execute stmt;
            execute stmt;
            

            during the first execution of our statement stmt , JOIN::prepare() handles derived statements, indirectly calling mysql_derived_prepare(). This calls prepare on the unit represented by

                select c1, c2, sum(c3) as s from t1 group by c1, c2 having s > 2
                union
                select c4, c5, sum(c6) as u from t2 group by c4, c5 having u > 3
            

            fix_fields() on our having clause(s) resolves u > 3 by matching u to the item for sum(c6) as u

            JOIN::optimize() is called before actual statement execution.
            is_materialized_derived() is true, so pushdown_cond_for_derived() is called for derived TABLE_LIST for the unit show above.
            The pushed outer cond is c2 > 0.

            This is pushed into select 1 in the unit (select t1.c1 AS c1,t1.c2 AS c2,sum(t1.c3) AS s from t1 group by t1.c1, t1.c2 having s > 2)
            then into select 2 (select t2.c4 AS c4,t2.c5 AS c5,sum(t2.c6) AS u from t2 group by t2.c4,t2.c5 having u > 3).
            As part of this, it does the following for the reasons in the comment.

                /*
                  Rename the columns of all non-first selects of a union to be compatible
                  by names with the columns of the first select. It will allow to use copies
                  of the same expression pushed into having clauses of different selects.
                */
                if (sl != first_sl)
                {
                  DBUG_ASSERT(sl->item_list.elements == first_sl->item_list.elements);
                  List_iterator_fast<Item> it(sl->item_list);
                  List_iterator_fast<Item> nm_it(unit->types);
                  while (Item *item= it++)
                    item->share_name_with(nm_it++);
                }
            

            This permanently renames the item list overwriting 'sum(t2.c6) AS u' to 'sum(t2.c6) AS s'
            this is not reversed in mysql_derived_reinit().

            During the second execution, the unit now looks like this

                select c1, c2, sum(c3) as s from t1 group by c1, c2 having s > 2
                union
                select c4, c5, sum(c6) as s from t2 group by c4, c5 having u > 3
            

            During fix_fields(), name resolution on u > 3 will now fail.

            Johnston Rex Johnston added a comment - - edited A step by step analysis of the issue. The following SQL statements illustrate the problem here create table t1 (c1 int , c2 int , c3 int ); insert into t1 values (1,2,3),(4,5,6); create table t2 (c4 int , c5 int , c6 int ); insert into t2 values (7,8,9),(10,11,12); prepare stmt from 'analyze format=json select * from ( select c1, c2, sum(c3) as s from t1 group by c1, c2 having s > 2 union select c4, c5, sum(c6) as u from t2 group by c4, c5 having u > 3 ) dt where c2 > 0' ; execute stmt; execute stmt; during the first execution of our statement stmt , JOIN::prepare() handles derived statements, indirectly calling mysql_derived_prepare() . This calls prepare on the unit represented by select c1, c2, sum (c3) as s from t1 group by c1, c2 having s > 2 union select c4, c5, sum (c6) as u from t2 group by c4, c5 having u > 3 fix_fields() on our having clause(s) resolves u > 3 by matching u to the item for sum(c6) as u JOIN::optimize() is called before actual statement execution. is_materialized_derived() is true, so pushdown_cond_for_derived() is called for derived TABLE_LIST for the unit show above. The pushed outer cond is c2 > 0 . This is pushed into select 1 in the unit (select t1.c1 AS c1,t1.c2 AS c2,sum(t1.c3) AS s from t1 group by t1.c1, t1.c2 having s > 2) then into select 2 (select t2.c4 AS c4,t2.c5 AS c5,sum(t2.c6) AS u from t2 group by t2.c4,t2.c5 having u > 3). As part of this, it does the following for the reasons in the comment. /* Rename the columns of all non-first selects of a union to be compatible by names with the columns of the first select. It will allow to use copies of the same expression pushed into having clauses of different selects. */ if (sl != first_sl) { DBUG_ASSERT(sl->item_list.elements == first_sl->item_list.elements); List_iterator_fast<Item> it(sl->item_list); List_iterator_fast<Item> nm_it(unit->types); while (Item *item= it++) item->share_name_with(nm_it++); } This permanently renames the item list overwriting 'sum(t2.c6) AS u' to 'sum(t2.c6) AS s' this is not reversed in mysql_derived_reinit() . During the second execution, the unit now looks like this select c1, c2, sum (c3) as s from t1 group by c1, c2 having s > 2 union select c4, c5, sum (c6) as s from t2 group by c4, c5 having u > 3 During fix_fields() , name resolution on u > 3 will now fail.
            igor Igor Babaev added a comment -

            The following query over the tables from the above commit also fails on the second execution:

            select * from
               ( 
                  select c1,  sum(c3) as s from t1 group by c1
                  union
                  select c4 as c, sum(c6) as u from t2 group by c
               ) dt 
            where c1 > 0 
            

            MariaDB [test]> prepare stmt from "
                "> select * from
                ">    ( 
                ">       select c1,  sum(c3) as s from t1 group by c1
                ">       union
                ">       select c4 as c, sum(c6) as u from t2 group by c
                ">    ) dt 
                "> where c1 > 0 
                "> ";
            Query OK, 0 rows affected (0.002 sec)
            Statement prepared
             
            MariaDB [test]> execute stmt;
            +------+------+
            | c1   | s    |
            +------+------+
            |    1 |    3 |
            |    4 |    6 |
            |    7 |    9 |
            |   10 |   12 |
            +------+------+
            4 rows in set (0.004 sec)
             
            MariaDB [test]> execute stmt;
            ERROR 1054 (42S22): Unknown column 'c' in 'group statement'
            
            

            As we can see this query does not contain any having clauses.

            igor Igor Babaev added a comment - The following query over the tables from the above commit also fails on the second execution: select * from ( select c1, sum (c3) as s from t1 group by c1 union select c4 as c, sum (c6) as u from t2 group by c ) dt where c1 > 0 MariaDB [test]> prepare stmt from " "> select * from "> ( "> select c1, sum(c3) as s from t1 group by c1 "> union "> select c4 as c, sum(c6) as u from t2 group by c "> ) dt "> where c1 > 0 "> "; Query OK, 0 rows affected (0.002 sec) Statement prepared   MariaDB [test]> execute stmt; +------+------+ | c1 | s | +------+------+ | 1 | 3 | | 4 | 6 | | 7 | 9 | | 10 | 12 | +------+------+ 4 rows in set (0.004 sec)   MariaDB [test]> execute stmt; ERROR 1054 (42S22): Unknown column 'c' in 'group statement' As we can see this query does not contain any having clauses.
            igor Igor Babaev added a comment -

            Ok to push into 10.5

            igor Igor Babaev added a comment - Ok to push into 10.5
            Johnston Rex Johnston added a comment -

            for release notes: grouping operators referring to column aliases in unions inside derived tables can cause name resolution problems with prepared statements.

            Johnston Rex Johnston added a comment - for release notes: grouping operators referring to column aliases in unions inside derived tables can cause name resolution problems with prepared statements.

            People

              Johnston Rex Johnston
              Johnston Rex Johnston
              Votes:
              1 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.