[MDEV-26025] Server crashes while executing query with CTE in PS/SP Created: 2021-06-27  Updated: 2021-11-16  Resolved: 2021-07-21

Status: Closed
Project: MariaDB Server
Component/s: Data Definition - Procedure, Optimizer - CTE
Affects Version/s: 10.5.11, 10.2, 10.3, 10.4, 10.5, 10.6
Fix Version/s: 10.2.40, 10.3.31, 10.4.21, 10.5.12, 10.6.4

Type: Bug Priority: Critical
Reporter: Peter VARGA Assignee: Igor Babaev
Resolution: Fixed Votes: 0
Labels: crash, feedback, regression
Environment:

CentOS 7


Issue Links:
Problem/Incident
causes MDEV-26202 Unexpected failure with query using i... Closed
Relates
relates to MDEV-26108 Crash with query referencing twice CT... Closed
relates to MDEV-26825 Bogus error for query with two usage ... Closed

 Description   

Running a quite long CTE [over 170 lines] in a procedure suddenly crashes the server. It looks to me that it started with 10.5.11 as before i worked perfectly. Below the dump. For the moment I don't want to publish the CTE but I can do it privately.

Running the CTE command NOT in a stored procedure works so I have for the moment a work-around.

{{Server version: 10.5.11-MariaDB
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=2
max_threads=153
thread_count=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467873 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7fdf0c0009b8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fdf68117c90 thread_stack 0x49000
??:0(my_print_stacktrace)[0x559fcba7579e]
??:0(handle_fatal_signal)[0x559fcb47a457]
sigaction.c:0(__restore_rt)[0x7fdf81fa4630]
??:0(Item_cond_and::mark_as_condition_AND_part(TABLE_LIST*))[0x559fcb4b2d96]
??:0(setup_conds(THD*, TABLE_LIST*, List<TABLE_LIST>&, Item**))[0x559fcb20de29]
??:0(JOIN::prepare(TABLE_LIST*, Item*, unsigned int, st_order*, bool, st_order*, Item*, st_order*, st_select_lex*, st_select_lex_unit*))[0x559fcb2bedcf]
??:0(st_select_lex_unit::prepare_join(THD*, st_select_lex*, select_result*, unsigned long long, bool))[0x559fcb31b7ed]
??:0(st_select_lex_unit::prepare(TABLE_LIST*, select_result*, unsigned long long))[0x559fcb31fda6]
??:0(TABLE_LIST::find_derived_handler(THD*))[0x559fcb2335b2]
??:0(mysql_handle_single_derived(LEX*, TABLE_LIST*, unsigned int))[0x559fcb2325ad]
??:0(st_select_lex::handle_derived(LEX*, unsigned int))[0x559fcb24d3d7]
??:0(TABLE_LIST::handle_derived(LEX*, unsigned int))[0x559fcb335f06]
??:0(st_select_lex::handle_derived(LEX*, unsigned int))[0x559fcb24d3d7]
??:0(TABLE_LIST::handle_derived(LEX*, unsigned int))[0x559fcb335f06]
??:0(mysql_prepare_insert(THD*, TABLE_LIST*, List<Item>&, List<Item>, List<Item>&, List<Item>&, enum_duplicates, Item*, bool))[0x559fcb239bd2]
??:0(mysql_insert_select_prepare(THD*, select_result*))[0x559fcb23e216]
??:0(mysql_execute_command(THD*))[0x559fcb278900]
??:0(sp_instr_stmt::exec_core(THD*, unsigned int*))[0x559fcb1ce506]
??:0(sp_lex_keeper::reset_lex_and_exec_core(THD*, unsigned int*, bool, sp_instr*))[0x559fcb1d6fe2]
??:0(sp_instr_stmt::execute(THD*, unsigned int*))[0x559fcb1d79dc]
??:0(sp_head::execute(THD*, bool))[0x559fcb1d2524]
??:0(sp_head::execute_procedure(THD*, List<Item>*))[0x559fcb1d3c1f]
??:0(comp_ne_creator(bool))[0x559fcb26aaca]
??:0(Sql_cmd_call::execute(THD*))[0x559fcb26f48a]
??:0(mysql_execute_command(THD*))[0x559fcb2772be]
??:0(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x559fcb27b1c5]
??:0(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x559fcb27d891]
??:0(do_command(THD*))[0x559fcb27f03f]
??:0(do_handle_one_connection(CONNECT*, bool))[0x559fcb369b52]
??:0(handle_one_connection)[0x559fcb369e14]
??:0(MyCTX_nopad::finish(unsigned char*, unsigned int*))[0x559fcb6c756d]
pthread_create.c:0(start_thread)[0x7fdf81f9cea5]
??:0(__clone)[0x7fdf814b79fd]
}}

{{Connection ID (thread ID): 10
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off}}



 Comments   
Comment by Elena Stepanova [ 2021-06-28 ]

Peter V,

Could you please upload the query, structures of all involved tables, definition of the procedure, and your cnf file(s) to ftp://ftp.askmonty.org/private?

Comment by Peter VARGA [ 2021-06-28 ]

Hm, I may now sound stupid but I am asked for a user. What should I enter?

Comment by Peter VARGA [ 2021-06-28 ]

This was really a very stupid question I asked. I uploaded now the archive MDEV-26025.rar

@Issue:
The very strange thing is that it doesn't need data in the tables at all. Just run the bellow statements - yes, they are two identical statements - and the 2nd causes the server to crash. I checked the dump in a new database and the server always crashes so it must work with the dump I provided you.

{{CALL CheckVehicleConflicts('562639cf-7ef5-4404-ad6a-f8a98b0c1fb6');
CALL CheckVehicleConflicts('562639cf-7ef5-4404-ad6a-f8a98b0c1fb6');
}}

Comment by Elena Stepanova [ 2021-06-29 ]

Thanks for the provided data. Reproducible as described.

Resolved ASAN stack trace from the original test case:

10.5 fc2ff464

==1201693==ERROR: AddressSanitizer: heap-use-after-free on address 0x62b0000f27c8 at pc 0x56155d1b2486 bp 0x7f9d42dc50e0 sp 0x7f9d42dc50d0
READ of size 8 at 0x62b0000f27c8 thread T14
    #0 0x56155d1b2485 in base_list_iterator::next() /data/src/10.5/sql/sql_list.h:431
    #1 0x56155d1b2485 in List_iterator<Item>::operator++(int) /data/src/10.5/sql/sql_list.h:596
    #2 0x56155d1b2485 in Item_cond_and::mark_as_condition_AND_part(TABLE_LIST*) /data/src/10.5/sql/item_cmpfunc.cc:5398
    #3 0x56155c87f000 in setup_conds(THD*, TABLE_LIST*, List<TABLE_LIST>&, Item**) /data/src/10.5/sql/sql_base.cc:8377
    #4 0x56155cb183d0 in setup_without_group /data/src/10.5/sql/sql_select.cc:731
    #5 0x56155cb183d0 in JOIN::prepare(TABLE_LIST*, Item*, unsigned int, st_order*, bool, st_order*, Item*, st_order*, st_select_lex*, st_select_lex_
unit*) /data/src/10.5/sql/sql_select.cc:1285
    #6 0x56155cc80fa3 in st_select_lex_unit::prepare_join(THD*, st_select_lex*, select_result*, unsigned long long, bool) /data/src/10.5/sql/sql_unio
n.cc:1088
    #7 0x56155cc92777 in st_select_lex_unit::prepare(TABLE_LIST*, select_result*, unsigned long long) /data/src/10.5/sql/sql_union.cc:1481
    #8 0x56155c8e8686 in mysql_derived_prepare /data/src/10.5/sql/sql_derived.cc:839
    #9 0x56155c8e4f2a in mysql_handle_single_derived(LEX*, TABLE_LIST*, unsigned int) /data/src/10.5/sql/sql_derived.cc:200
    #10 0x56155c94e455 in LEX::handle_list_of_derived(TABLE_LIST*, unsigned int) /data/src/10.5/sql/sql_lex.h:4453
    #11 0x56155c94e455 in st_select_lex::handle_derived(LEX*, unsigned int) /data/src/10.5/sql/sql_lex.cc:4930
    #12 0x56155ccfece6 in TABLE_LIST::handle_derived(LEX*, unsigned int) /data/src/10.5/sql/table.cc:9118
    #13 0x56155c90296f in LEX::handle_list_of_derived(TABLE_LIST*, unsigned int) /data/src/10.5/sql/sql_lex.h:4453
    #14 0x56155c90296f in mysql_prepare_insert(THD*, TABLE_LIST*, List<Item>&, List<Item>*, List<Item>&, List<Item>&, enum_duplicates, Item**, bool) 
/data/src/10.5/sql/sql_insert.cc:1591
    #15 0x56155c91357e in mysql_insert_select_prepare(THD*, select_result*) /data/src/10.5/sql/sql_insert.cc:3722
    #16 0x56155c9f0fef in mysql_execute_command(THD*) /data/src/10.5/sql/sql_parse.cc:4753
    #17 0x56155c7b1458 in sp_instr_stmt::exec_core(THD*, unsigned int*) /data/src/10.5/sql/sp_head.cc:3767
    #18 0x56155c7cd719 in sp_lex_keeper::reset_lex_and_exec_core(THD*, unsigned int*, bool, sp_instr*) /data/src/10.5/sql/sp_head.cc:3494
    #19 0x56155c7cfc33 in sp_instr_stmt::execute(THD*, unsigned int*) /data/src/10.5/sql/sp_head.cc:3673
    #20 0x56155c7bd3e3 in sp_head::execute(THD*, bool) /data/src/10.5/sql/sp_head.cc:1437
    #21 0x56155c7c1f4c in sp_head::execute_procedure(THD*, List<Item>*) /data/src/10.5/sql/sp_head.cc:2449
    #22 0x56155c9c77e8 in do_execute_sp /data/src/10.5/sql/sql_parse.cc:3090
    #23 0x56155c9d9e1d in Sql_cmd_call::execute(THD*) /data/src/10.5/sql/sql_parse.cc:3336
    #24 0x56155c9f0050 in mysql_execute_command(THD*) /data/src/10.5/sql/sql_parse.cc:6052
    #25 0x56155c9b564c in mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) /data/src/10.5/sql/sql_parse.cc:8096
    #26 0x56155c9e2e1a in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) /data/src/10.5/sql/sql_parse.cc:1891
    #27 0x56155c9ea04a in do_command(THD*) /data/src/10.5/sql/sql_parse.cc:1370
    #28 0x56155cd8d5ec in do_handle_one_connection(CONNECT*, bool) /data/src/10.5/sql/sql_connect.cc:1410
    #29 0x56155cd8e184 in handle_one_connection /data/src/10.5/sql/sql_connect.cc:1312
    #30 0x56155d98c2b8 in pfs_spawn_thread /data/src/10.5/storage/perfschema/pfs.cc:2201
    #31 0x7f9d624a3608 in start_thread /build/glibc-eX1tMB/glibc-2.31/nptl/pthread_create.c:477
    #32 0x7f9d62076292 in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x122292)

Smaller test case derived from the original one:

CREATE TABLE t (a int, b varchar(15));
INSERT INTO t VALUES (1,'foo'),(2,'bar');
 
PREPARE stmt FROM "
  WITH cte1 AS (
           SELECT COUNT(*) FROM t WHERE  b = 'foobar' AND a = 10
  ),   cte2 AS (
           SELECT COUNT(*) FROM cte1
  ),   cte3 AS (
           SELECT COUNT(*) FROM cte2
  ),   cte4 AS (
           SELECT COUNT(*) FROM cte3
  )
  SELECT * FROM cte3
";
 
EXECUTE stmt;
EXECUTE stmt;

(Stored procedure was replaced with a prepared statement for the purpose of bisecting, as the stored procedure wouldn't work on 10.2 till recently due to MDEV-20411).

Crashes with somewhat varying stack traces are reproducible on 10.2-10.6, debug and non-debug builds.

The failure started happening after this commit in 10.2:

commit 43c9fcefc07d2f42b65950e76adfbc3c1e8acb28
Author: Igor Babaev
Date:   Wed May 12 19:32:29 2021 -0700
 
    MDEV-23886 Reusing CTE inside a function fails with table doesn't exist

Additionally, here is a slightly different test case (select-star instead of COUNT), it causes a debug assertion failure instead of a crash:

CREATE TABLE t (a int, b varchar(15));
INSERT INTO t VALUES (1,'foo'),(2,'bar');
 
CREATE PROCEDURE sp()
  WITH cte1 AS (
           SELECT * FROM t WHERE  b = 'foobar' AND a = 10
  ),   cte2 AS (
           SELECT * FROM cte1
  ),   cte3 AS (
           SELECT * FROM cte2
  ),   cte4 AS (
           SELECT * FROM cte3
  )
  SELECT * FROM cte3
;
 
CALL sp();
CALL sp();
 
# Cleanup
DROP PROCEDURE sp;
DROP TABLE t;

10.2 4e4f742e

mysqld: /data/src/10.2/sql/sql_prepare.cc:3094: void reinit_stmt_before_use(THD*, LEX*): Assertion `sl->join == 0' failed.
210629  3:06:24 [ERROR] mysqld got signal 6 ;
 
#7  0x00007fd05d8f2f36 in __GI___assert_fail (assertion=0x5648db49b8e6 "sl->join == 0", file=0x5648db49b118 "/data/src/10.2/sql/sql_prepare.cc", line=3094, function=0x5648db49b8b0 "void reinit_stmt_before_use(THD*, LEX*)") at assert.c:101
#8  0x00005648daa480c1 in reinit_stmt_before_use (thd=0x7fd03c000d90, lex=0x7fd03c091188) at /data/src/10.2/sql/sql_prepare.cc:3094
#9  0x00005648da971e4b in sp_lex_keeper::reset_lex_and_exec_core (this=0x7fd03c11fad0, thd=0x7fd03c000d90, nextp=0x7fd0580ff5ac, open_tables=false, instr=0x7fd03c11fa90) at /data/src/10.2/sql/sp_head.cc:3075
#10 0x00005648da972517 in sp_instr_stmt::execute (this=0x7fd03c11fa90, thd=0x7fd03c000d90, nextp=0x7fd0580ff5ac) at /data/src/10.2/sql/sp_head.cc:3247
#11 0x00005648da96d519 in sp_head::execute (this=0x7fd03c037f98, thd=0x7fd03c000d90, merge_da_on_success=true) at /data/src/10.2/sql/sp_head.cc:1326
#12 0x00005648da96f737 in sp_head::execute_procedure (this=0x7fd03c037f98, thd=0x7fd03c000d90, args=0x7fd03c0056e0) at /data/src/10.2/sql/sp_head.cc:2202
#13 0x00005648daa1fa70 in do_execute_sp (thd=0x7fd03c000d90, sp=0x7fd03c037f98) at /data/src/10.2/sql/sql_parse.cc:2981
#14 0x00005648daa2841b in mysql_execute_command (thd=0x7fd03c000d90) at /data/src/10.2/sql/sql_parse.cc:5622
#15 0x00005648daa2e8de in mysql_parse (thd=0x7fd03c000d90, rawbuf=0x7fd03c012708 "CALL sp()", length=9, parser_state=0x7fd058100560, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:7793
#16 0x00005648daa1cb39 in dispatch_command (command=COM_QUERY, thd=0x7fd03c000d90, packet=0x7fd03c008b61 "CALL sp()", packet_length=9, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:1827
#17 0x00005648daa1b634 in do_command (thd=0x7fd03c000d90) at /data/src/10.2/sql/sql_parse.cc:1381
#18 0x00005648dab76ab7 in do_handle_one_connection (connect=0x5648dce6ca20) at /data/src/10.2/sql/sql_connect.cc:1336
#19 0x00005648dab7681c in handle_one_connection (arg=0x5648dce6ca20) at /data/src/10.2/sql/sql_connect.cc:1241
#20 0x00005648db3a2d40 in pfs_spawn_thread (arg=0x5648dce83090) at /data/src/10.2/storage/perfschema/pfs.cc:1869
#21 0x00007fd05de03609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#22 0x00007fd05d9de293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Comment by Igor Babaev [ 2021-07-20 ]

The following simple test demonstrate the problem:

create table t1 (a int, b int);
insert into t1 value (1,3), (3,2), (1,3), (4,1);
 
prepare stmt from "
with
  cte1 as ( select a,b from t1 where a = 1 AND b = 3 ),
  cte2 as ( select a,b from cte1 ),
  cte3 as ( select a,b from cte2 )
select * from cte3, cte2; ";
 
execute stmt;
execute stmt;
deallocate prepare stmt;

Execution of this test case causes a crash of the server.
The crash happens when running the first 'execute stmt;'.

Comment by Igor Babaev [ 2021-07-20 ]

An analysis shows that the SELECT created for the second usage of cte1 via the usage of cte2 is not included in all_select_list for the query. It happens for the same reason as the similar problem of the bug MDEV-26108 a fix for which was applied to 10.4. However if we apply this patch to 10.2 the above test case still crashes the server though now on the second execution of PS. So the fix for MDEV-26108 was not quite correct. A correct fix would be like this:

 diff --git a/sql/sql_cte.cc b/sql/sql_cte.cc
index 702db8f..b720eac 100644
--- a/sql/sql_cte.cc
+++ b/sql/sql_cte.cc
@@ -1012,6 +1012,7 @@ st_select_lex_unit *With_element::clone_parsed_spec(LEX *old_lex,
 
   bool parse_status= false;
   st_select_lex *with_select;
+  st_select_lex *last_clone_select;
 
   char save_end= unparsed_spec.str[unparsed_spec.length];
   unparsed_spec.str[unparsed_spec.length]= '\0';
@@ -1099,11 +1100,6 @@ st_select_lex_unit *With_element::clone_parsed_spec(LEX *old_lex,
   lex->unit.include_down(with_table->select_lex);
   lex->unit.set_slave(with_select);
   lex->unit.cloned_from= spec;
-  old_lex->all_selects_list=
-    (st_select_lex*) (lex->all_selects_list->
-                     insert_chain_before(
-                       (st_select_lex_node **) &(old_lex->all_selects_list),
-                        with_select));
 
   /*
     Now all references to the CTE defined outside of the cloned specification
@@ -1119,6 +1115,15 @@ st_select_lex_unit *With_element::clone_parsed_spec(LEX *old_lex,
     goto err;
   }
 
+  last_clone_select= lex->all_selects_list;
+  while (last_clone_select->next_select_in_list())
+    last_clone_select= last_clone_select->next_select_in_list();
+  old_lex->all_selects_list=
+    (st_select_lex*) (lex->all_selects_list->
+                     insert_chain_before(
+                       (st_select_lex_node **) &(old_lex->all_selects_list),
+                       last_clone_select));
+
  lex->sphead= NULL;    // in order not to delete lex->sphead

The difference between this patch and the patch for MDEV-26108 is that the added piece of code is added after the call of resolve_references_to_cte(), not before this call. This is important because the call of resolve_references_to_cte() can add new selects to the list lex->all_selects_list and all selects from this list should be inserted into the list
old_lex->all_selects_list.

Comment by Igor Babaev [ 2021-07-20 ]

The cause of both bugs is the same and they should be fixed by the same patch.

Comment by Oleksandr Byelkin [ 2021-07-20 ]

OK to push

Comment by Igor Babaev [ 2021-07-21 ]

A fix for this bug was pushed into 10.2. A special care should be taken when merging this patch into 10.4: the fix for MDEV-26108 should be reverted, but not the test cases added with this fix.

Generated at Thu Feb 08 09:42:11 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.