Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33081

Crash on second execution of PS for query over many views

Details

    • Bug
    • Status: Stalled (View Workflow)
    • Critical
    • Resolution: Unresolved
    • 10.4(EOL), 10.5, 10.6, 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL), 11.3(EOL)
    • 10.5, 10.6, 10.11
    • Optimizer
    • None

    Description

      The following test case causes a crash of the server on the second execution of used PS:

      CREATE TABLE t1 (a int PRIMARY KEY) ENGINE=MYISAM;
      CREATE TABLE t2 (a int PRIMARY KEY) ENGINE=MYISAM;
      CREATE TABLE t3 (a int PRIMARY KEY) ENGINE=MYISAM;
      CREATE TABLE t4 (a int PRIMARY KEY) ENGINE=MYISAM;
      CREATE TABLE t5 (a int PRIMARY KEY) ENGINE=MYISAM;
      CREATE TABLE t6 (a int PRIMARY KEY) ENGINE=MYISAM;
      CREATE TABLE t7 (a int PRIMARY KEY) ENGINE=MYISAM;
      CREATE TABLE t8 (a int PRIMARY KEY) ENGINE=MYISAM;
       
      INSERT INTO t1 VALUES (1), (2);
      INSERT INTO t2 VALUES (1), (2);
      INSERT INTO t3 VALUES (1), (2);
      INSERT INTO t4 VALUES (1), (2);
      INSERT INTO t5 VALUES (1), (2);
      INSERT INTO t6 VALUES (1), (2);
      INSERT INTO t7 VALUES (1), (2);
      INSERT INTO t8 VALUES (1), (2);
       
      CREATE VIEW v1 AS
      SELECT t8.a
      FROM t1
           LEFT JOIN t2 ON t2.a = t1.a
           LEFT JOIN t3 ON t3.a = t2.a
           LEFT JOIN t4 ON t4.a = t3.a
           LEFT JOIN t5 ON t5.a = t4.a
           LEFT JOIN t6 ON t6.a = t5.a
           LEFT JOIN t7 ON t7.a = t6.a
           LEFT JOIN t8 ON t8.a = t7.a
      ;
       
      CREATE TABLE t9 (a int PRIMARY KEY, b int) ENGINE=MYISAM;
      INSERT INTO t9 VALUES (1,10), (2,20);
       
      CREATE VIEW v2 AS
      SELECT t8.a, t9.b + 100 as b
      FROM t1
           LEFT JOIN t2 ON t2.a = t1.a
           LEFT JOIN t3 ON t3.a = t2.a
           LEFT JOIN t8 ON t8.a = t3.a
           LEFT JOIN t9 ON t9.a = t8.a
      ;
       
      let $q=
      SELECT dt.b FROM
      (
      SELECT v2.b + 200 as b
      FROM v1 AS s1
           LEFT JOIN v1 AS s2 ON s2.a = s1.a
           LEFT JOIN v1 AS s3 ON s3.a = s2.a
           LEFT JOIN v1 AS s4 ON s4.a = s3.a
           LEFT JOIN v1 AS s5 ON s5.a = s4.a
           LEFT JOIN v1 AS s6 ON s6.a = s5.a
           LEFT JOIN v1 AS s7 ON s7.a = s6.a
           LEFT JOIN v2 ON v2.a = s7.a
      WHERE s1.a > 1
      ) dt;
       
      eval $q;
       
      eval PREPARE stmt FROM "$q";
      EXECUTE stmt;
      EXECUTE stmt;
      DEALLOCATE PREPARE stmt;
       
      DROP VIEW v1,v2;
      DROP TABLE t1,t2,t3,t4,t5,t6,t7,t8,t9;
      

      The crash happens in Dep_analysis_context::get_field_value() with
      the following stack:

      sql/opt_table_elimination.cc:1619(Dep_analysis_context::get_field_value(Field*))[0x56016b2f8999]
      sql/opt_table_elimination.cc:1268(build_eq_mods_for_cond(THD*, Dep_analysis_context*, Dep_module_expr**, unsigned int*, Item*))[0x56016b2f7e85]
      sql/opt_table_elimination.cc:861(check_func_dependency(JOIN*, unsigned long long, List_iterator<TABLE_LIST>*, TABLE_LIST*, Item*))[0x56016b2f6f0e]
      sql/opt_table_elimination.cc:756(eliminate_tables_for_list(JOIN*, List<TABLE_LIST>*, unsigned long long, Item*, unsigned long long, Json_writer_array*))[0x56016b2f6ad8]
      sql/opt_table_elimination.cc:681(eliminate_tables(JOIN*))[0x56016b2f690e]
      sql/sql_select.cc:5249(make_join_statistics(JOIN*, List<TABLE_LIST>&, st_dynamic_array*))[0x56016b13f05c]
      sql/sql_select.cc:2388(JOIN::optimize_inner())[0x56016b134bbb]
      sql/sql_select.cc:1731(JOIN::optimize())[0x56016b13242a]
      sql/sql_select.cc:4832(mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0x56016b13dc53]
      sql/sql_select.cc:454(handle_select(THD*, LEX*, select_result*, unsigned long))[0x56016b12cffa]
      sql/sql_parse.cc:6523(execute_sqlcom_select(THD*, TABLE_LIST*))[0x56016b0ef6a6]
      sql/sql_parse.cc:3980(mysql_execute_command(THD*))[0x56016b0e5fad]
      sql/sql_prepare.cc:5025(Prepared_statement::execute(String*, bool))[0x56016b115286]
      sql/sql_prepare.cc:4472(Prepared_statement::execute_loop(String*, bool, unsigned char*, unsigned char*))[0x56016b113587]
      sql/sql_prepare.cc:3532(mysql_sql_stmt_execute(THD*))[0x56016b110dad]
      sql/sql_parse.cc:3997(mysql_execute_command(THD*))[0x56016b0e5ff2]
      sql/sql_parse.cc:8062(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x56016b0f3658]
      sql/sql_parse.cc:1860(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x56016b0dfad0]
      sql/sql_parse.cc:1379(do_command(THD*))[0x56016b0de2f4]
      sql/sql_connect.cc:1419(do_handle_one_connection(CONNECT*))[0x56016b27b578]
      sql/sql_connect.cc:1324(handle_one_connection)[0x56016b27b2d4]
      

      With the setting

      set optimizer_switch="table_elimination=off";
      

      both execution of the used PS return the expected result set.

      Attachments

        Issue Links

          Activity

            oleg.smirnov Oleg Smirnov added a comment -

            The problem seems to be related to the unsuccessful merging of view v2.

            After the parsing stage dbug_print(select_lex) prints the following statement:

            "select dt.b AS b from (select v2.b + 200 AS b from (((((((v1 s1 left join v1 s2 on(s2.a = s1.a)) left join v1 s3 on(s3.a = s2.a)) left join v1 s4 on(s4.a = s3.a)) left join v1 s5 on(s5.a = s4.a)) left join v1 s6 on(s6.a = s5.a)) left join v1 s7 on(s7.a = s6.a)) left join v2 on(v2.a = s7.a)) where s1.a > 1) dt"
            

            At 1st execution mysql_derived_prepare() sets flag "mergeable" for v2 and creates the field translation table. After names resolution select_lex looks like

            "select t9.b + 100 + 200 AS b 
            from (select t9.b + 100 + 200 AS b 
                  from (((((((v1 s1 left join v1 s2 on(t8.a = t8.a)) left join v1 s3 on(t8.a = t8.a)) left join v1 s4 on(t8.a = t8.a)) left join v1 s5 on(t8.a = t8.a)) left join v1 s6 on(t8.a = t8.a)) left join v1 s7 on(t8.a = t8.a)) left join v2 on(t8.a" = t8.a)) where t8.a > 1) dt"
            

            Note the t9.b in the select lists. It's the reference that was translated from v2.b as the view v2 is expected to be merged.

            At the optimization stage we hit the following limitation:

            mysql_derived_merge()

             
                /*
                  Check whether there is enough free bits in table map to merge subquery.
                  If not - materialize it. This check isn't cached so when there is a big
                  and small subqueries, and the bigger one can't be merged it wouldn't
                  block the smaller one.
                */
                if (parent_lex->get_free_table_map(&map, &tablenr) ||
                   dt_select->leaf_tables.elements + tablenr > MAX_TABLES)
                {
                  /* There is no enough table bits, fall back to materialization. */
                  cause= "Not enough table bits to merge subquery";
                  goto unconditional_materialization;
                }
            

            Then the next code block works which rolls back the preparation for merging and sets the view v2 to "materialized":

            mysql_derived_merge()

            unconditional_materialization:
              derived->change_refs_to_fields();
              derived->set_materialized_derived();
              if (!derived->table || !derived->table->is_created())
                res= mysql_derived_create(thd, lex, derived);
              goto exit_merge;
            

            After derived->change_refs_to_fields() select_lex looks as follows (only first 300 characters shown):

            "select v2.b + 200 AS b from (select v2.b + 200 AS b from (((((((((((((((t1 left join t2 on(t2.a = t1.a)) left join t3 on(t3.a = t2.a)) left join t4 on(t4.a = t3.a)) left join t5 on(t5.a = t4.a)) left join t6 on(t6.a = t5.a)) left join t7 on(t7.a = t6.a)) left join t8 on(t8.a = t7.a))) left join (((("
            

            There is again v2.b in the select lists, which is correct.

            But the merging of derived tables and views is performed only during the first execution:

            JOIN::optimize_inner()

              if (select_lex->first_cond_optimization)
              {
                //Do it only for the first execution
                /* Merge all mergeable derived tables/views in this SELECT. */
                if (select_lex->handle_derived(thd->lex, DT_MERGE))
                  DBUG_RETURN(TRUE);  
              }
            

            So at the 2nd execution we lack the call of derived->change_refs_to_fields() and end up have incorrect references to t9.b in the select lists:

            "select t9.b + 100 + 200 AS b 
            from (select t9.b + 100 + 200 AS b 
                  from (((((((v1 s1 left join v1 s2 on(t8.a = t8.a)) left join v1 s3 on(t8.a = t8.a)) left join v1 s4 on(t8.a = t8.a)) left join v1 s5 on(t8.a = t8.a)) left join v1 s6 on(t8.a = t8.a)) left join v1 s7 on(t8.a = t8.a)) left join v2 on(t8.a" = t8.a)) where t8.a > 1) dt"
            

            As a result, there are incorrect table_maps and a crash at table elimination (which is not related to the elimination itself, however)

            oleg.smirnov Oleg Smirnov added a comment - The problem seems to be related to the unsuccessful merging of view v2 . After the parsing stage dbug_print(select_lex) prints the following statement: "select dt.b AS b from (select v2.b + 200 AS b from (((((((v1 s1 left join v1 s2 on(s2.a = s1.a)) left join v1 s3 on(s3.a = s2.a)) left join v1 s4 on(s4.a = s3.a)) left join v1 s5 on(s5.a = s4.a)) left join v1 s6 on(s6.a = s5.a)) left join v1 s7 on(s7.a = s6.a)) left join v2 on(v2.a = s7.a)) where s1.a > 1) dt" At 1st execution mysql_derived_prepare() sets flag "mergeable" for v2 and creates the field translation table. After names resolution select_lex looks like "select t9.b + 100 + 200 AS b from (select t9.b + 100 + 200 AS b from (((((((v1 s1 left join v1 s2 on(t8.a = t8.a)) left join v1 s3 on(t8.a = t8.a)) left join v1 s4 on(t8.a = t8.a)) left join v1 s5 on(t8.a = t8.a)) left join v1 s6 on(t8.a = t8.a)) left join v1 s7 on(t8.a = t8.a)) left join v2 on(t8.a" = t8.a)) where t8.a > 1) dt" Note the t9.b in the select lists. It's the reference that was translated from v2.b as the view v2 is expected to be merged. At the optimization stage we hit the following limitation: mysql_derived_merge() /* Check whether there is enough free bits in table map to merge subquery. If not - materialize it. This check isn't cached so when there is a big and small subqueries, and the bigger one can't be merged it wouldn't block the smaller one. */ if (parent_lex->get_free_table_map(&map, &tablenr) || dt_select->leaf_tables.elements + tablenr > MAX_TABLES) { /* There is no enough table bits, fall back to materialization. */ cause= "Not enough table bits to merge subquery"; goto unconditional_materialization; } Then the next code block works which rolls back the preparation for merging and sets the view v2 to "materialized": mysql_derived_merge() unconditional_materialization: derived->change_refs_to_fields(); derived->set_materialized_derived(); if (!derived->table || !derived->table->is_created()) res= mysql_derived_create(thd, lex, derived); goto exit_merge; After derived->change_refs_to_fields() select_lex looks as follows (only first 300 characters shown): "select v2.b + 200 AS b from (select v2.b + 200 AS b from (((((((((((((((t1 left join t2 on(t2.a = t1.a)) left join t3 on(t3.a = t2.a)) left join t4 on(t4.a = t3.a)) left join t5 on(t5.a = t4.a)) left join t6 on(t6.a = t5.a)) left join t7 on(t7.a = t6.a)) left join t8 on(t8.a = t7.a))) left join ((((" There is again v2.b in the select lists, which is correct. But the merging of derived tables and views is performed only during the first execution: JOIN::optimize_inner() if (select_lex->first_cond_optimization) { //Do it only for the first execution /* Merge all mergeable derived tables/views in this SELECT. */ if (select_lex->handle_derived(thd->lex, DT_MERGE)) DBUG_RETURN(TRUE); } So at the 2nd execution we lack the call of derived->change_refs_to_fields() and end up have incorrect references to t9.b in the select lists: "select t9.b + 100 + 200 AS b from (select t9.b + 100 + 200 AS b from (((((((v1 s1 left join v1 s2 on(t8.a = t8.a)) left join v1 s3 on(t8.a = t8.a)) left join v1 s4 on(t8.a = t8.a)) left join v1 s5 on(t8.a = t8.a)) left join v1 s6 on(t8.a = t8.a)) left join v1 s7 on(t8.a = t8.a)) left join v2 on(t8.a" = t8.a)) where t8.a > 1) dt" As a result, there are incorrect table_maps and a crash at table elimination (which is not related to the elimination itself, however)
            oleg.smirnov Oleg Smirnov added a comment - - edited

            igor, can you please review the proposed solution in branch bb-10.4-mdev-33081?

            commit 69bd5437c45490a53f075f8ac17744a3d9b31ccd (HEAD -> bb-10.4-MDEV-33081)
            Author: Oleg Smirnov <olernov@gmail.com>
            Date:   Sat Jan 13 17:32:16 2024 +0700
             
                MDEV-33081 Crash on second execution of PS for query over many views
                
                On the preparation phase a view or derived table can be marked as mergeable,
                and the appropriate field_translation table will be set up for it.
                But the merging itself may fail due to lack of free bits in table map
                (see mysql_derived_merge()). In such a case the merging is cancelled,
                and the view/DT falls back to materialization. But the field_translation
                table persists, and at 2nd execution of a prepared statement the name
                resolution procedure resolves names incorrectly using the outdated
                field_translation table.
                
                The solution is to reset field_translation to NULL if the merging
                of a DT/view fails. However, there are two exceptions
                when field_translation is used later at the execution phase:
                 - pushdown of a DT to an external storage engine
                 - statements looking like
                   INSERT INTO t1 (SELECT tmp.a FROM (select * FROM t1) as tmp)
                In such cases field_translation must not be reset
            

            oleg.smirnov Oleg Smirnov added a comment - - edited igor , can you please review the proposed solution in branch bb-10.4-mdev-33081 ? commit 69bd5437c45490a53f075f8ac17744a3d9b31ccd (HEAD -> bb-10.4-MDEV-33081) Author: Oleg Smirnov <olernov@gmail.com> Date: Sat Jan 13 17:32:16 2024 +0700   MDEV-33081 Crash on second execution of PS for query over many views On the preparation phase a view or derived table can be marked as mergeable, and the appropriate field_translation table will be set up for it. But the merging itself may fail due to lack of free bits in table map (see mysql_derived_merge()). In such a case the merging is cancelled, and the view/DT falls back to materialization. But the field_translation table persists, and at 2nd execution of a prepared statement the name resolution procedure resolves names incorrectly using the outdated field_translation table. The solution is to reset field_translation to NULL if the merging of a DT/view fails. However, there are two exceptions when field_translation is used later at the execution phase: - pushdown of a DT to an external storage engine - statements looking like INSERT INTO t1 (SELECT tmp.a FROM (select * FROM t1) as tmp) In such cases field_translation must not be reset

            An explanation of the proper second execution of the reported PS with
            set optimizer_switch='table_elimination=off' is requested.

            igor Igor Babaev (Inactive) added a comment - An explanation of the proper second execution of the reported PS with set optimizer_switch='table_elimination=off' is requested.
            oleg.smirnov Oleg Smirnov added a comment -

            The code outside the table elimination turns out to be tolerant to such inconsistency, at least for the given test case. But I've managed to reproduce the error with the table elimination turned off. Look at the modification of your original test case :

            # <...  this part is unchanged ...>
             
            CREATE VIEW v2 AS
            SELECT t8.a as b, t9.b as a
            FROM t1
                 LEFT JOIN t2 ON t2.a = t1.a
                 LEFT JOIN t3 ON t3.a = t2.a
                 LEFT JOIN t8 ON t8.a = t3.a
                 LEFT JOIN t9 ON t9.a = t8.a;
             
            SET optimizer_switch = 'table_elimination=off';
             
            let $q=
            SELECT dt.b FROM
            (
            SELECT v2.b + 200 as b
            FROM v1 AS s1
                 LEFT JOIN v1 AS s2 ON s2.a = s1.a
                 LEFT JOIN v1 AS s3 ON s3.a = s2.a
                 LEFT JOIN v1 AS s4 ON s4.a = s3.a
                 LEFT JOIN v1 AS s5 ON s5.a = s4.a
                 LEFT JOIN v1 AS s6 ON s6.a = s5.a
                 LEFT JOIN v1 AS s7 ON s7.a = s6.a
                 LEFT JOIN v2 ON v2.a = s7.a
            WHERE s1.a > 1
            ) dt;
             
            --echo # Sample result (regular statement)
            eval $q;
             
            eval PREPARE stmt FROM "$q";
            EXECUTE stmt;
            EXECUTE stmt;
            DEALLOCATE PREPARE stmt;
            

            The correct result is NULL, but the non-patched 10.4 outputs 202 at the second execution of the PS. At the same time, the patched version produces NULL at both executions.
            This test case can also be added to the commit if the patch is accepted.

            oleg.smirnov Oleg Smirnov added a comment - The code outside the table elimination turns out to be tolerant to such inconsistency, at least for the given test case. But I've managed to reproduce the error with the table elimination turned off. Look at the modification of your original test case : # <... this part is unchanged ...>   CREATE VIEW v2 AS SELECT t8.a as b, t9.b as a FROM t1 LEFT JOIN t2 ON t2.a = t1.a LEFT JOIN t3 ON t3.a = t2.a LEFT JOIN t8 ON t8.a = t3.a LEFT JOIN t9 ON t9.a = t8.a;   SET optimizer_switch = 'table_elimination=off';   let $q= SELECT dt.b FROM ( SELECT v2.b + 200 as b FROM v1 AS s1 LEFT JOIN v1 AS s2 ON s2.a = s1.a LEFT JOIN v1 AS s3 ON s3.a = s2.a LEFT JOIN v1 AS s4 ON s4.a = s3.a LEFT JOIN v1 AS s5 ON s5.a = s4.a LEFT JOIN v1 AS s6 ON s6.a = s5.a LEFT JOIN v1 AS s7 ON s7.a = s6.a LEFT JOIN v2 ON v2.a = s7.a WHERE s1.a > 1 ) dt;   --echo # Sample result (regular statement) eval $q;   eval PREPARE stmt FROM "$q"; EXECUTE stmt; EXECUTE stmt; DEALLOCATE PREPARE stmt; The correct result is NULL, but the non-patched 10.4 outputs 202 at the second execution of the PS. At the same time, the patched version produces NULL at both executions. This test case can also be added to the commit if the patch is accepted.
            oleg.smirnov Oleg Smirnov added a comment -

            One more note: no tables are eliminated in the test cases above. This can be checked be setting breakpoints at

            static void mark_as_eliminated(JOIN *join, TABLE_LIST *tbl,  Json_writer_array* trace_eliminate_tables)
            

            and

            bool Item_subselect::mark_as_eliminated_processor(void *arg)
            

            Those breakpoint are never hit.

            oleg.smirnov Oleg Smirnov added a comment - One more note: no tables are eliminated in the test cases above. This can be checked be setting breakpoints at static void mark_as_eliminated(JOIN *join, TABLE_LIST *tbl, Json_writer_array* trace_eliminate_tables) and bool Item_subselect::mark_as_eliminated_processor(void *arg) Those breakpoint are never hit.

            This bug will be fixed when a fix for MDEV-30073 is pushed into 10.4

            igor Igor Babaev (Inactive) added a comment - This bug will be fixed when a fix for MDEV-30073 is pushed into 10.4

            People

              Johnston Rex Johnston
              igor Igor Babaev (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.