Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-32609

Derived subquery selecting from dummy table causes segv

Details

    Description

      CREATE TABLE v0 ( v1 INTEGER ) ENGINE = InnoDB ;
      INSERT INTO v0 ( v1 ) ( SELECT 2231626 <= NULL v1 FROM v0 WHERE v1 IN ( v1 ) ORDER BY v1 ) ORDER BY v1 , AVG ( v1 ) OVER ( ) DESC ;
      DROP SHOW CREATE TABLE v0 ; 
      

      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f1f87d04880 thread_stack 0x5fc00
      /usr/local/mysql/bin/mariadbd(__interceptor_backtrace+0x5b)[0x781b5b]
      mysys/stacktrace.c:215(my_print_stacktrace)[0x228cfae]
      sql/signal_handler.cc:0(handle_fatal_signal)[0x12bd0d2]
      sigaction.c:0(__restore_rt)[0x7f1faba9a420]
      sql/item.cc:3028(Item_field)[0x1315c2a]
      sql/sql_window.cc:3135(Window_funcs_sort::setup(THD*, SQL_SELECT*, List_iterator<Item_window_func>&, st_join_table*))[0x1113ef3]
      sql/sql_window.cc:3174(Window_funcs_computation::setup(THD*, List<Item_window_func>*, st_join_table*))[0x1115585]
      /usr/local/mysql/bin/mariadbd(_ZN4JOIN21make_aggr_tables_infoEv+0x3d67)[0xc34837]
      /usr/local/mysql/bin/mariadbd(_ZN4JOIN15optimize_stage2Ev+0xc891)[0xc0c5d1]
      /usr/local/mysql/bin/mariadbd(_ZN4JOIN14optimize_innerEv+0x3919)[0xc17249]
      /usr/local/mysql/bin/mariadbd(_ZN4JOIN8optimizeEv+0x176)[0xbffb46]
      sql/sql_derived.cc:1037(mysql_derived_optimize(THD*, LEX*, TABLE_LIST*))[0xa5bf21]
      /usr/local/mysql/bin/mariadbd(_Z27mysql_handle_single_derivedP3LEXP10TABLE_LISTj+0x172)[0xa5f042]
      /usr/local/mysql/bin/mariadbd(_ZN4JOIN14optimize_innerEv+0x3138)[0xc16a68]
      /usr/local/mysql/bin/mariadbd(_ZN4JOIN8optimizeEv+0x176)[0xbffb46]
      sql/sql_select.cc:5237(mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0xbe886f]
      sql/sql_select.cc:628(handle_select(THD*, LEX*, select_result*, unsigned long long))[0xbe7e59]
      /usr/local/mysql/bin/mariadbd(_Z21mysql_execute_commandP3THDb+0x9d8c)[0xb39e7c]
      sql/sql_class.h:2830(THD::enter_stage(PSI_stage_info_v1 const*, char const*, char const*, unsigned int))[0xb24c79]
      /usr/local/mysql/bin/mariadbd(_Z16dispatch_command19enum_server_commandP3THDPcjb+0x2cf8)[0xb1e648]
      sql/sql_parse.cc:1407(do_command(THD*, bool))[0xb25971]
      sql/sql_connect.cc:1416(do_handle_one_connection(CONNECT*, bool))[0xf0d066]
      sql/sql_connect.cc:1322(handle_one_connection)[0xf0caa9]
      perfschema/pfs.cc:2203(pfs_spawn_thread)[0x19d710b]
      nptl/pthread_create.c:478(start_thread)[0x7f1faba8e609]
      addr2line: DWARF error: section .debug_info is larger than its filesize! (0x93ef57 vs 0x530f28)
      /lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7f1fab7a6133]

      Attachments

        Issue Links

          Activity

            ycp Yuchen Pei added a comment - - edited

            In the original case, if I change to default mtr engine (myisam) then the crash does not happen.

            ycp Yuchen Pei added a comment - - edited In the original case, if I change to default mtr engine (myisam) then the crash does not happen.
            ycp Yuchen Pei added a comment -

            Some low level analysis.

            Consider the simplified case

            (SELECT 5 a FROM dual ORDER BY a) ORDER BY AVG(a) OVER () ;

            The subquery (SELECT 5 a FROM dual ORDER BY a) is marked as
            derived, with the dummy table having derived_type == 10. The offending
            null valued field join_tab->table->field[0] was never non-null.

            For comparison, the following works, with the same query marked as
            derived, but it does not go into JOIN::make_aggr_tables_info() for
            the subquery because const_table_map is 1 instead of 0, which was
            updated in make_join_statistics(), and in the crashed case,
            make_join_statistics() was skipped because tables_list is empty.

            create table t (c int);
            (SELECT c FROM t ORDER BY c) ORDER BY AVG(c) OVER () ;
            drop table t;

            To take a step back, why is the subquery marked as derived in the
            first place?

            The kb[1] says

            > Derived tables are subqueries in the FROM clause.

            But there's no FROM before (SELECT 5 a ORDER BY a). Is there a
            "canonical form" of this query with a FROM?

            [1] https://mariadb.com/kb/en/optimizations-for-derived-tables/

            ycp Yuchen Pei added a comment - Some low level analysis. Consider the simplified case (SELECT 5 a FROM dual ORDER BY a) ORDER BY AVG(a) OVER () ; The subquery (SELECT 5 a FROM dual ORDER BY a) is marked as derived, with the dummy table having derived_type == 10. The offending null valued field join_tab->table->field [0] was never non-null. For comparison, the following works, with the same query marked as derived, but it does not go into JOIN::make_aggr_tables_info() for the subquery because const_table_map is 1 instead of 0, which was updated in make_join_statistics(), and in the crashed case, make_join_statistics() was skipped because tables_list is empty. create table t (c int ); ( SELECT c FROM t ORDER BY c) ORDER BY AVG (c) OVER () ; drop table t; To take a step back, why is the subquery marked as derived in the first place? The kb [1] says > Derived tables are subqueries in the FROM clause. But there's no FROM before (SELECT 5 a ORDER BY a). Is there a "canonical form" of this query with a FROM? [1] https://mariadb.com/kb/en/optimizations-for-derived-tables/
            ycp Yuchen Pei added a comment - - edited

            The reason that the original case fails under innodb but not myisam is
            that innodb's initial table flags do not contain
            HA_STATS_RECORDS_IS_EXACT but myisam does, which causes myisam
            case to enter the following if branch in make_join_statistics(),
            which results in not entering JOIN::make_aggr_tables_info()
            because const_table_map is 1.

            11.0 35cc4b6c058aeb905e8d802569444bdfce02193a

                if (!table->is_filled_at_execution() &&
                    (table->s->system ||
                     (table->file->stats.records <= 1 &&
                      (table->file->ha_table_flags() & HA_STATS_RECORDS_IS_EXACT)) ||
                     all_partitions_pruned_away) &&
            	!s->dependent &&
                    !table->fulltext_searched && !join->no_const_tables)
                {
                  set_position(join,const_count++,s,(KEYUSE*) 0);
                  no_rows_const_tables |= table->map;
                }

            Perhaps we could fix it by skipping make_aggr_tables_info() for
            the subqueries in both the simplified case and the original case,
            since the subqueries are both selecting const things (5 in the
            simplified case and 2231626 <= NULL in the original case) whether
            from dummy table or a real table, which is similar to a const table.

            ycp Yuchen Pei added a comment - - edited The reason that the original case fails under innodb but not myisam is that innodb's initial table flags do not contain HA_STATS_RECORDS_IS_EXACT but myisam does, which causes myisam case to enter the following if branch in make_join_statistics() , which results in not entering JOIN::make_aggr_tables_info() because const_table_map is 1. 11.0 35cc4b6c058aeb905e8d802569444bdfce02193a if (!table->is_filled_at_execution() && (table->s-> system || (table->file->stats.records <= 1 && (table->file->ha_table_flags() & HA_STATS_RECORDS_IS_EXACT)) || all_partitions_pruned_away) && !s->dependent && !table->fulltext_searched && !join->no_const_tables) { set_position(join,const_count++,s,(KEYUSE*) 0); no_rows_const_tables |= table->map; } Perhaps we could fix it by skipping make_aggr_tables_info() for the subqueries in both the simplified case and the original case, since the subqueries are both selecting const things (5 in the simplified case and 2231626 <= NULL in the original case) whether from dummy table or a real table, which is similar to a const table.
            ycp Yuchen Pei added a comment - - edited

            I don't think that was a good idea after all. I decided to check if we
            could fix it in a more direct way. The segv happens when it tries to
            construct an order for the window function. When the order by is on
            (), it tries to build the order. But the join table is dummy, and so
            join_tab->table->field[0] is NULL, causing the segv.

              if (sort_order == NULL) // No partition or order by clause.
              {
            //  [... 10 lines elided]
                ORDER *order= (ORDER *)alloc_root(thd->mem_root, sizeof(ORDER));
                memset(order, 0, sizeof(*order));
                Item_field *item=
                    new (thd->mem_root) Item_field(thd,  join_tab->table->field[0]); // segv!
                if (item)
                  item->set_refers_to_temp_table();
                order->item= (Item **)alloc_root(thd->mem_root, 2 * sizeof(Item *));
                order->item[1]= NULL;
                order->item[0]= item;
                order->field= join_tab->table->field[0];
                sort_order= order;
              }

            It fails in several cases:

            1. When selecting a const from a real non-empty table and avg over ():
            (SELECT 5 d FROM t ORDER BY d) ORDER BY AVG(d) OVER () ;
            2. When selecting a const from the dummy table and avg over ():
            (SELECT 5 d FROM dual ORDER BY d) ORDER BY AVG(d) OVER () ;
            3. When selecting a const from the dummy table and avg over a non-empty
            (SELECT 5 d FROM dual ORDER BY d) ORDER BY AVG(d) OVER (partition by d) ;

            The direct fix is to build the order correctly. I observed the order
            in the working case of selecting a const from a real non-empty table
            and avg over a non-empty:

            (SELECT 5 d FROM t ORDER BY d) ORDER BY AVG(d) OVER (partition by d) ;

            And That fixed the first case, including the original case of this
            ticket. See commit 1d30f996800cc6ea8e0c271ef66db9458d685e90.

            It does not fix selecting from a dummy table, because it tries
            to execute the window function while optimizing the derived subquer,
            before the result field is in place (item_win->record_field == NULL
            below).

            static
            bool save_window_function_values(List<Item_window_func>& window_functions,
                                             TABLE *tbl, uchar *rowid_buf)
            {
              List_iterator_fast<Item_window_func> iter(window_functions);
              JOIN_TAB *join_tab= tbl->reginfo.join_tab;
              tbl->file->ha_rnd_pos(tbl->record[0], rowid_buf);
              store_record(tbl, record[1]);
              while (Item_window_func *item_win= iter++)
                item_win->save_in_field(item_win->result_field, true);
            }

            save_window_function_values > compute_window_func > Window_func_runner::exec > Window_funcs_sort::exec > Window_funcs_computation::exec > AGGR_OP::end_send > sub_select_postjoin_aggr > do_select > JOIN::exec_inner > JOIN::exec > mysql_select > mysql_derived_fill > mysql_derived_optimize > mysql_handle_single_derived > JOIN::optimize_inner > JOIN::optimize > mysql_select > handle_select

            By contrast, when selecting from a real table, it could go past
            optimizing the derived subquery and populating the result field.

            Item_result_field::create_tmp_field_ex_from_handler > Item_result_field::create_tmp_field_ex > create_tmp_field > Create_tmp_table::add_fields > create_tmp_table > JOIN::create_postjoin_aggr_table > JOIN::make_aggr_tables_info > JOIN::optimize_stage2 > JOIN::optimize_inner > JOIN::optimize > mysql_select > handle_select

            The difference is caused by assigning true to subq_exit_fl when
            optimizing the derived subquery, which causes it to skip to
            setup_subq_exit during JOIN::optimize_stage2(), which causes
            it to mark fill_me TRUE, which causes the premature execution.

            JOIN::optimize_inner()

              if (!tables_list)
              {
                DBUG_PRINT("info",("No tables"));
                error= 0;
                subq_exit_fl= true;
                goto setup_subq_exit;
              }

            JOIN::optimize_inner > JOIN::optimize > mysql_derived_optimize > mysql_handle_single_derived > JOIN::optimize_inner > JOIN::optimize > mysql_select > handle_select

            JOIN::optimize_stage2()

              if (subq_exit_fl)
                goto setup_subq_exit;

            void SELECT_LEX::mark_const_derived(bool empty)
            {
            //  [... 2 lines elided]
              if (!(join && join->thd->lex->describe) && derived)
              {
            //  [... 2 lines elided]
                if (!master_unit()->is_unit_op() && !derived->is_merged_derived() &&
                    !(join && join->with_two_phase_optimization))
                  derived->fill_me= TRUE;
              }
            }

            ycp Yuchen Pei added a comment - - edited I don't think that was a good idea after all. I decided to check if we could fix it in a more direct way. The segv happens when it tries to construct an order for the window function. When the order by is on (), it tries to build the order. But the join table is dummy, and so join_tab->table->field [0] is NULL, causing the segv. if (sort_order == NULL) // No partition or order by clause. { // [... 10 lines elided] ORDER *order= (ORDER *)alloc_root(thd->mem_root, sizeof (ORDER)); memset (order, 0, sizeof (*order)); Item_field *item= new (thd->mem_root) Item_field(thd, join_tab->table->field[0]); // segv! if (item) item->set_refers_to_temp_table(); order->item= (Item **)alloc_root(thd->mem_root, 2 * sizeof (Item *)); order->item[1]= NULL; order->item[0]= item; order->field= join_tab->table->field[0]; sort_order= order; } It fails in several cases: 1. When selecting a const from a real non-empty table and avg over (): (SELECT 5 d FROM t ORDER BY d) ORDER BY AVG(d) OVER () ; 2. When selecting a const from the dummy table and avg over (): (SELECT 5 d FROM dual ORDER BY d) ORDER BY AVG(d) OVER () ; 3. When selecting a const from the dummy table and avg over a non-empty (SELECT 5 d FROM dual ORDER BY d) ORDER BY AVG(d) OVER (partition by d) ; The direct fix is to build the order correctly. I observed the order in the working case of selecting a const from a real non-empty table and avg over a non-empty: (SELECT 5 d FROM t ORDER BY d) ORDER BY AVG(d) OVER (partition by d) ; And That fixed the first case, including the original case of this ticket. See commit 1d30f996800cc6ea8e0c271ef66db9458d685e90. It does not fix selecting from a dummy table, because it tries to execute the window function while optimizing the derived subquer, before the result field is in place ( item_win->record_field == NULL below). static bool save_window_function_values(List<Item_window_func>& window_functions, TABLE *tbl, uchar *rowid_buf) { List_iterator_fast<Item_window_func> iter(window_functions); JOIN_TAB *join_tab= tbl->reginfo.join_tab; tbl->file->ha_rnd_pos(tbl->record[0], rowid_buf); store_record(tbl, record[1]); while (Item_window_func *item_win= iter++) item_win->save_in_field(item_win->result_field, true); } save_window_function_values > compute_window_func > Window_func_runner::exec > Window_funcs_sort::exec > Window_funcs_computation::exec > AGGR_OP::end_send > sub_select_postjoin_aggr > do_select > JOIN::exec_inner > JOIN::exec > mysql_select > mysql_derived_fill > mysql_derived_optimize > mysql_handle_single_derived > JOIN::optimize_inner > JOIN::optimize > mysql_select > handle_select By contrast, when selecting from a real table, it could go past optimizing the derived subquery and populating the result field. Item_result_field::create_tmp_field_ex_from_handler > Item_result_field::create_tmp_field_ex > create_tmp_field > Create_tmp_table::add_fields > create_tmp_table > JOIN::create_postjoin_aggr_table > JOIN::make_aggr_tables_info > JOIN::optimize_stage2 > JOIN::optimize_inner > JOIN::optimize > mysql_select > handle_select The difference is caused by assigning true to subq_exit_fl when optimizing the derived subquery, which causes it to skip to setup_subq_exit during JOIN::optimize_stage2() , which causes it to mark fill_me TRUE, which causes the premature execution. JOIN::optimize_inner() if (!tables_list) { DBUG_PRINT( "info" ,( "No tables" )); error= 0; subq_exit_fl= true ; goto setup_subq_exit; } JOIN::optimize_inner > JOIN::optimize > mysql_derived_optimize > mysql_handle_single_derived > JOIN::optimize_inner > JOIN::optimize > mysql_select > handle_select JOIN::optimize_stage2() if (subq_exit_fl) goto setup_subq_exit; void SELECT_LEX::mark_const_derived( bool empty) { // [... 2 lines elided] if (!(join && join->thd->lex->describe) && derived) { // [... 2 lines elided] if (!master_unit()->is_unit_op() && !derived->is_merged_derived() && !(join && join->with_two_phase_optimization)) derived->fill_me= TRUE; } }

            I observe one other odd thing.

            Put a breakpoint in JOIN::optimize().

              Thread 14 "mysqld" hit Breakpoint 1, JOIN::optimize (this=0x7fff24019288) at /home/psergey/dev-git2/11.0-review-mdev-32609/sql/sql_select.cc:1935
            (gdb) p dbug_print_select(select_lex)
              $100 = 0x555557c88280 <dbug_item_print_buf> "select __2.a AS a from ((select 5 AS a order by '')) __2 order by avg(__2.a) over ()"
            (gdb) p select_lex->window_funcs.elements
              $102 = 0
            (gdb) p select_lex->select_number
              $103 = 1
            

            select #1 has window functions, but window_funcs is empty?

            And for select #2 it is vice versa:

              Thread 14 "mysqld" hit Breakpoint 1, JOIN::optimize (this=0x7fff24019b40) at /home/psergey/dev-git2/11.0-review-mdev-32609/sql/sql_select.cc:1935
            (gdb) p dbug_print_select(select_lex)
              $104 = 0x555557c88280 <dbug_item_print_buf> "select 5 AS a order by ''"
            (gdb) p select_lex->window_funcs.elements
              $105 = 1
            (gdb) p select_lex->select_number
              $106 = 2
            (gdb) p select_lex->window_funcs.elem(0)
              $107 = (Item_window_func *) 0x7fff24017ac0
            (gdb) p dbug_print_item(select_lex->window_funcs.elem(0))
              $108 = 0x555557c88280 <dbug_item_print_buf> "avg(__2.a) over ()"
            

            I'm not sure if that contributes to the crash but this is worth looking at...

            psergei Sergei Petrunia added a comment - I observe one other odd thing. Put a breakpoint in JOIN::optimize(). Thread 14 "mysqld" hit Breakpoint 1, JOIN::optimize (this=0x7fff24019288) at /home/psergey/dev-git2/11.0-review-mdev-32609/sql/sql_select.cc:1935 (gdb) p dbug_print_select(select_lex) $100 = 0x555557c88280 <dbug_item_print_buf> "select __2.a AS a from ((select 5 AS a order by '')) __2 order by avg(__2.a) over ()" (gdb) p select_lex->window_funcs.elements $102 = 0 (gdb) p select_lex->select_number $103 = 1 select #1 has window functions, but window_funcs is empty? And for select #2 it is vice versa: Thread 14 "mysqld" hit Breakpoint 1, JOIN::optimize (this=0x7fff24019b40) at /home/psergey/dev-git2/11.0-review-mdev-32609/sql/sql_select.cc:1935 (gdb) p dbug_print_select(select_lex) $104 = 0x555557c88280 <dbug_item_print_buf> "select 5 AS a order by ''" (gdb) p select_lex->window_funcs.elements $105 = 1 (gdb) p select_lex->select_number $106 = 2 (gdb) p select_lex->window_funcs.elem(0) $107 = (Item_window_func *) 0x7fff24017ac0 (gdb) p dbug_print_item(select_lex->window_funcs.elem(0)) $108 = 0x555557c88280 <dbug_item_print_buf> "avg(__2.a) over ()" I'm not sure if that contributes to the crash but this is worth looking at...

            For comparison: if I rewrite the query to use a "proper" derived table:

            select * from (SELECT 5 a FROM dual ORDER BY a) T  ORDER BY AVG(a) OVER () ;
            

            then select_lex->window_funcs has the window function in select #1, not in select#2.
            And the query doesn't crash.

            psergei Sergei Petrunia added a comment - For comparison: if I rewrite the query to use a "proper" derived table: select * from ( SELECT 5 a FROM dual ORDER BY a) T ORDER BY AVG (a) OVER () ; then select_lex->window_funcs has the window function in select #1, not in select#2. And the query doesn't crash.

            Indeed, when parsing this query, we enter here:

            (gdb) wher
              #0  Lex_order_limit_lock::set_to (this=0x7fff240193c8, sel=0x7fff24016988) at /home/psergey/dev-git2/11.0-review-mdev-32609/sql/sql_lex.cc:10192
              #1  0x0000555555f3b6cb in LEX::add_tail_to_query_expression_body (this=0x7fff24005140, unit=0x7fff24018670, l=0x7fff240193c8) at /home/psergey/dev-git2/11.0-review-mdev-32609/sql/sql_lex.cc:10489
            

            and here we seem to be moving the ORDER BY clause from one select to another:

            =>    for (ORDER *order= order_list->first; order; order= order->next)
                    (*order->item)->walk(&Item::change_context_processor, FALSE,
                                         &sel->context);
            

            But Item::change_context_processor does nothing for Item_window_func. It doesn't de-register from select_lex->window_funcs it was before, and it doesn't register at its new select_lex.

            psergei Sergei Petrunia added a comment - Indeed, when parsing this query, we enter here: (gdb) wher #0 Lex_order_limit_lock::set_to (this=0x7fff240193c8, sel=0x7fff24016988) at /home/psergey/dev-git2/11.0-review-mdev-32609/sql/sql_lex.cc:10192 #1 0x0000555555f3b6cb in LEX::add_tail_to_query_expression_body (this=0x7fff24005140, unit=0x7fff24018670, l=0x7fff240193c8) at /home/psergey/dev-git2/11.0-review-mdev-32609/sql/sql_lex.cc:10489 and here we seem to be moving the ORDER BY clause from one select to another: => for (ORDER *order= order_list->first; order; order= order->next) (*order->item)->walk(&Item::change_context_processor, FALSE, &sel->context); But Item::change_context_processor does nothing for Item_window_func. It doesn't de-register from select_lex->window_funcs it was before, and it doesn't register at its new select_lex.
            ycp Yuchen Pei added a comment -

            Thanks psergei for your analysis. I was wondering whether MDEV-28619 which is marked as related to this ticket, has the same problem of misplaced Item_window_func (using alice's testcase[1] (also see below), and I don't think so. I think that the select_lex with the Item_window_func is the union, as expected. This is because it is a fake select lex prepared in subselect_union_engine::exec().

             SELECT 1 IN (SELECT 1 a UNION SELECT 5 ORDER BY LAST_VALUE (a) OVER XXX);
            

            [1] https://jira.mariadb.org/browse/MDEV-28619?focusedCommentId=224221&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-224221

            ycp Yuchen Pei added a comment - Thanks psergei for your analysis. I was wondering whether MDEV-28619 which is marked as related to this ticket, has the same problem of misplaced Item_window_func (using alice 's testcase [1] (also see below), and I don't think so. I think that the select_lex with the Item_window_func is the union, as expected. This is because it is a fake select lex prepared in subselect_union_engine::exec(). SELECT 1 IN ( SELECT 1 a UNION SELECT 5 ORDER BY LAST_VALUE (a) OVER XXX); [1] https://jira.mariadb.org/browse/MDEV-28619?focusedCommentId=224221&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-224221
            ycp Yuchen Pei added a comment -

            I also note that this Item_window_func misplacement also happens when the dummy table is replaced with a real one, but there's no crash and the test passes with expected results:

            11.0 295f3e51cc7db0f9931eee438d14756ab8a8d9e7

            create table t (c int);
            insert into t values (4), (5), (3);
            (SELECT c FROM t ORDER BY c) ORDER BY AVG(c) OVER () ; # outputs 3, 4, 5
            drop table t;

            Placing a breakpoint at JOIN::optimize(). When it first hits:

            1: select_lex = (SELECT_LEX *) 0x5290000e8150
            2: select_lex->select_number = 1
            4: dbug_print(select_lex) = 0x56299b0d1e00 <dbug_item_print_buf> "select __2.c AS c from ((select t.c AS c from t order by t.c)) __2 order by avg(__2.c) over ()"
            5: select_lex->window_funcs.elements = 0

            When it hits the second time:

            1: select_lex = (SELECT_LEX *) 0x5290000e6380
            2: select_lex->select_number = 2
            4: dbug_print(select_lex) = 0x56299b0d1e00 <dbug_item_print_buf> "select t.c AS c from t order by t.c"
            5: select_lex->window_funcs.elements = 1

            ycp Yuchen Pei added a comment - I also note that this Item_window_func misplacement also happens when the dummy table is replaced with a real one, but there's no crash and the test passes with expected results: 11.0 295f3e51cc7db0f9931eee438d14756ab8a8d9e7 create table t (c int ); insert into t values (4), (5), (3); ( SELECT c FROM t ORDER BY c) ORDER BY AVG (c) OVER () ; # outputs 3, 4, 5 drop table t; Placing a breakpoint at JOIN::optimize(). When it first hits: 1: select_lex = (SELECT_LEX *) 0x5290000e8150 2: select_lex->select_number = 1 4: dbug_print(select_lex) = 0x56299b0d1e00 <dbug_item_print_buf> "select __2.c AS c from ((select t.c AS c from t order by t.c)) __2 order by avg(__2.c) over ()" 5: select_lex->window_funcs.elements = 0 When it hits the second time: 1: select_lex = (SELECT_LEX *) 0x5290000e6380 2: select_lex->select_number = 2 4: dbug_print(select_lex) = 0x56299b0d1e00 <dbug_item_print_buf> "select t.c AS c from t order by t.c" 5: select_lex->window_funcs.elements = 1
            ycp Yuchen Pei added a comment -

            I doubt it is possible to make the parser set the initial context correctly without major surgery, so I tried out the idea of implementing change_context_processor for Item_window_func. It seems to work for the testcases in this ticket:

            c57d8a5e405 upstream/bb-11.0-mdev-32609 MDEV-32609 [poc] Implement Item_window_func::change_context_processor()
            

            ycp Yuchen Pei added a comment - I doubt it is possible to make the parser set the initial context correctly without major surgery, so I tried out the idea of implementing change_context_processor for Item_window_func. It seems to work for the testcases in this ticket: c57d8a5e405 upstream/bb-11.0-mdev-32609 MDEV-32609 [poc] Implement Item_window_func::change_context_processor()

            Ok, this patch updates select_lex->window_funcs.
            But there is also st_select_lex::add_window_spec() called which populates st_select_lex::window_specs and changes other members...
            Will we have to move those, too? (Please wait with coding this).

            psergei Sergei Petrunia added a comment - Ok, this patch updates select_lex->window_funcs . But there is also st_select_lex::add_window_spec() called which populates st_select_lex::window_specs and changes other members... Will we have to move those, too? (Please wait with coding this).

            A suggestion considered in discussion with Sanja:

            Should we just delay the calls to
            st_select_lex::add_window_spec()
            st_select_lex::add_window_func()
            until e.g. window function's fix_fields() call?

            Possible issues here are:

            • fix_fields() is called on each statement execution
            • The above functions update st_select_lex::fields_in_window_functions. That member is used in st_select_lex::get_cardinality_of_ref_ptrs_slice(), which is called before fix_fields is called for Item_windowfunc objects.
            psergei Sergei Petrunia added a comment - A suggestion considered in discussion with Sanja: Should we just delay the calls to st_select_lex::add_window_spec() st_select_lex::add_window_func() until e.g. window function's fix_fields() call? Possible issues here are: fix_fields() is called on each statement execution The above functions update st_select_lex::fields_in_window_functions . That member is used in st_select_lex::get_cardinality_of_ref_ptrs_slice(), which is called before fix_fields is called for Item_windowfunc objects.
            psergei Sergei Petrunia added a comment - - edited

            The problem part of the grammar, with my suggestion in comments:

                    | query_expression_body_ext_parens
                      {
                        Lex->push_select(!$1->first_select()->next_select() ? 
                                           $1->first_select() : $1->fake_select_lex); 
                        // psergey: here, we could modify the context so that window functions
                        // are listed somewhere else than the current select.
                        // for example Select->parse_win_function_data_here .
                        // (Note that this will still work if query_expression_tail has subqueries)
                      }
                      query_expression_tail
                      {
                        // Inside this function, in Lex_order_limit_lock::set_to,  
                        // the data from Select->parse_win_function_data_here will be copied
                        // to the same select where ORDER BY clause is put.
                        if (!($$= Lex->add_tail_to_query_expression_body_ext_parens($1, $3)))
                           MYSQL_YYABORT;
             
                      }
            

            This still looks messy. Maybe, it's easier to walk the parse tree and collect info about window functions?
            Note that we will only have to walk the select list and order by. We won't need to walk into subqueries. We won't need to walk the WHERE clause, which can be large.

            psergei Sergei Petrunia added a comment - - edited The problem part of the grammar, with my suggestion in comments: | query_expression_body_ext_parens { Lex->push_select(!$1->first_select()->next_select() ? $1->first_select() : $1->fake_select_lex); // psergey: here, we could modify the context so that window functions // are listed somewhere else than the current select. // for example Select->parse_win_function_data_here . // (Note that this will still work if query_expression_tail has subqueries) } query_expression_tail { // Inside this function, in Lex_order_limit_lock::set_to, // the data from Select->parse_win_function_data_here will be copied // to the same select where ORDER BY clause is put. if (!($$= Lex->add_tail_to_query_expression_body_ext_parens($1, $3))) MYSQL_YYABORT;   } This still looks messy. Maybe, it's easier to walk the parse tree and collect info about window functions? Note that we will only have to walk the select list and order by. We won't need to walk into subqueries. We won't need to walk the WHERE clause, which can be large.

            Final version of the fix suggestion:

            At the parser stage, Window Functions should not "register" themselves in st_select_lex::window_funcs and st_select_lex::window_specs.

            This should occur at a later stage, in JOIN::prepare, before the st_select_lex::get_cardinality_of_ref_ptrs_slice() call.

            Walk the SELECT's select list and order by clause and collect the contents of st_select_lex::window_funcs and st_select_lex::window_specs.

            psergei Sergei Petrunia added a comment - Final version of the fix suggestion: At the parser stage, Window Functions should not "register" themselves in st_select_lex::window_funcs and st_select_lex::window_specs. This should occur at a later stage, in JOIN::prepare, before the st_select_lex::get_cardinality_of_ref_ptrs_slice() call. Walk the SELECT's select list and order by clause and collect the contents of st_select_lex::window_funcs and st_select_lex::window_specs.
            ycp Yuchen Pei added a comment - - edited

            As per psergei, the runtime team has taken over this ticket.

            I think it makes sense, if the cause of the bug is the inconsistent attachments of Item_window_func to SELECT_LEX, resulted from actions during parsing / preparation.

            ycp Yuchen Pei added a comment - - edited As per psergei , the runtime team has taken over this ticket. I think it makes sense, if the cause of the bug is the inconsistent attachments of Item_window_func to SELECT_LEX, resulted from actions during parsing / preparation.

            People

              Unassigned Unassigned
              csfuzz csfuzz
              Votes:
              2 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.