[MDEV-7040] Crash in field_conv, memcpy_field_possible, part#2 Created: 2014-11-07  Updated: 2015-12-08  Resolved: 2015-12-08

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.0.13
Fix Version/s: 10.0.20

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-6799 Crash in field_conv, memcpy_field_pos... Closed

 Description   

This issue is to track the fill fix for MDEV-6799, CSC#8034.

Fix for MDEV-6799 made the crash go away, but it didn't fix the root of the problem. I've made an incomplete fix, because I was unable to repeat the problem. Now, we've got a testcase that can crash 10.0.13 on my machine, so it's possible to work on fixing the root cause of the problem.



 Comments   
Comment by Sergei Petrunia [ 2014-11-07 ]

We've got a script that allows to repeat the problem: see CSC#8034, stress_minimal.sql.
I was using 10.0.13, compiled with "cmake . ; make " and I got:

./mysqld(my_print_stacktrace+0x24)[0xac2584]
./mysqld(handle_fatal_signal+0x3d8)[0x6ecf68]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7f17fa45acb0]
./mysqld(_ZNK12Field_string4typeEv+0x1a)[0x6e605a]
./mysqld(_Z21memcpy_field_possibleP5FieldS0_+0x4c)[0x6e8f2c]
./mysqld(_Z10field_convP5FieldS0_+0x19)[0x6e9449]
./mysqld[0x6fe6c9]
./mysqld(_Z12sp_eval_exprP3THDP5FieldPP4Item+0x7e)[0x7d9b1e]
./mysqld(_ZN11sp_rcontext12set_variableEP3THDP5FieldPP4Item+0x17)[0x7e0cd7]
./mysqld(_ZN9sp_cursor24Select_fetch_into_spvars9send_dataER4ListI4ItemE+0x56)[0x7e0d56]
./mysqld(_ZN19Materialized_cursor5fetchEm+0x4c)[0x7ec1dc]
./mysqld(_ZN9sp_cursor5fetchEP3THDP4ListI11sp_variableE+0x4f)[0x7e110f]
./mysqld(_ZN15sp_instr_cfetch7executeEP3THDPj+0x2f)[0x7d7a3f]
./mysqld(_ZN7sp_head7executeEP3THDb+0x754)[0x7dac04]
./mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x581)[0x7dbf61]
./mysqld(_Z21mysql_execute_commandP3THD+0x3178)[0x5b5038]
./mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x169)[0x5b9bc9]
./mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x16f9)[0x5bb3a9]
./mysqld(_Z24do_handle_one_connectionP3THD+0x1e4)[0x668ae4]
./mysqld(handle_one_connection+0x40)[0x668b70]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f17fa452e9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f17f93463fd]

Comment by Sergei Petrunia [ 2015-07-28 ]

So we are at the same stack trace (although not looking at the same field):

(gdb) wher
  #0  field_conv (to=0x7fff88026558, from=0x7fff88066858) at /home/psergey/dev-git/10.0/sql/field_conv.cc:865
  #1  0x000000000086f6e8 in save_field_in_field (from=0x7fff88066858, null_value=0x7fff88068787, to=0x7fff88026558, no_conversions=false) at /home/psergey/dev-git/10.0/sql/item.cc:5870
  #2  0x000000000086f958 in Item_field::save_in_field (this=0x7fff88068728, to=0x7fff88026558, no_conversions=false) at /home/psergey/dev-git/10.0/sql/item.cc:5926
  #3  0x00000000009883c6 in sp_eval_expr (thd=0x40c4530, result_field=0x7fff88026558, expr_item_ptr=0x7fffc4052460) at /home/psergey/dev-git/10.0/sql/sp_head.cc:428
  #4  0x0000000000994d7a in sp_rcontext::set_variable (this=0x7fff880253a8, thd=0x40c4530, field=0x7fff88026558, value=0x7fffc4052460) at /home/psergey/dev-git/10.0/sql/sp_rcontext.cc:377
  #5  0x00000000006359ba in sp_rcontext::set_variable (this=0x7fff880253a8, thd=0x40c4530, var_idx=0, value=0x7fffc4052460) at /home/psergey/dev-git/10.0/sql/sp_rcontext.h:201
  #6  0x00000000009953c5 in sp_cursor::Select_fetch_into_spvars::send_data (this=0x7fff88027c00, items=...) at /home/psergey/dev-git/10.0/sql/sp_rcontext.cc:553
  #7  0x00000000009a4486 in Materialized_cursor::fetch (this=0x7fff88067ef8, num_rows=1) at /home/psergey/dev-git/10.0/sql/sql_cursor.cc:339
  #8  0x0000000000995287 in sp_cursor::fetch (this=0x7fff88027bf8, thd=0x40c4530, vars=0x7fff88042418) at /home/psergey/dev-git/10.0/sql/sp_rcontext.cc:504
  #9  0x00000000009901bc in sp_instr_cfetch::execute (this=0x7fff880423e0, thd=0x40c4530, nextp=0x7fffc4052808) at /home/psergey/dev-git/10.0/sql/sp_head.cc:3869
  #10 0x000000000098a6b1 in sp_head::execute (this=0x7fff8801e058, thd=0x40c4530, merge_da_on_success=true) at /home/psergey/dev-git/10.0/sql/sp_head.cc:1368
  #11 0x000000000098c318 in sp_head::execute_procedure (this=0x7fff8801e058, thd=0x40c4530, args=0x40c8ee0) at /home/psergey/dev-git/10.0/sql/sp_head.cc:2154
  #12 0x00000000006692d6 in mysql_execute_command (thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sql_parse.cc:4692
  #13 0x000000000066d717 in mysql_parse (thd=0x40c4530, rawbuf=0x7fff88006478 "call stress_7()", length=15, parser_state=0x7fffc4053590) at /home/psergey/dev-git/10.0/sql/sql_parse.cc:6415

(gdb) p from->table
  $174 = (TABLE *) 0x7fff880651e8
(gdb) p from->orig_table
  $175 = (TABLE *) 0x7fff8802b510

from->orig_table comes from here:

  Breakpoint 11, ha_myisam::open (this=0x7fff8802bf38, name=0x7fff88028f68 "./db/stress_data_7", mode=2, test_if_locked=18) at /home/psergey/dev-git/10.0/storage/myisam/ha_myisam.cc:733
(gdb) p this->table
  $163 = (TABLE *) 0x7fff8802b510

from->table comes from here:

  Breakpoint 15, open_tmp_table (table=0x7fff880651e8) at /home/psergey/dev-git/10.0/sql/sql_select.cc:16608
(gdb) next
  Breakpoint 12, ha_heap::open (this=0x7fff880671a8, name=0x7fff88066780 "/data0/psergey/tmp/#sql_5355_0", mode=2, test_if_locked=516) at /home/psergey/dev-git/10.0/storage/heap/ha_heap.cc:93
(gdb) p this->table
  $166 = (TABLE *) 0x7fff880651e8

So, could it be possible that the original ha_myisam table is already closed when FETCH statement is running?

On one hand, SPs use "pre-locking" feature: if a statement uses SPs, the runtime will collect a list of tables it needs to lock, and then lock/unlock all of them at once..

On the other hand, CALL sp may be the only statement that's exempt from mandatory prelocking due to need to have higher concurrency for replication.. checking...

Comment by Sergei Petrunia [ 2015-07-28 ]

Debugged and verified: OPEN statement closes tables and releases them here:

  #0  tc_release_table (table=0x7fff8805aba0) at /home/psergey/dev-git/10.0/sql/table_cache.cc:400
  #1  0x0000000000600852 in close_thread_table (thd=0x40c4530, table_ptr=0x40c45f0) at /home/psergey/dev-git/10.0/sql/sql_base.cc:1056
  #2  0x0000000000600067 in close_open_tables (thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sql_base.cc:779
  #3  0x00000000006005df in close_thread_tables (thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sql_base.cc:1003
  #4  0x000000000066a9e7 in mysql_execute_command (thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sql_parse.cc:5143
  #5  0x00000000009a3ca6 in mysql_open_cursor (thd=0x40c4530, result=0x7fff88027c00, pcursor=0x7fff88027c30) at /home/psergey/dev-git/10.0/sql/sql_cursor.cc:123
  #6  0x000000000099505c in sp_cursor::open (this=0x7fff88027bf8, thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sp_rcontext.cc:456
  #7  0x000000000098feed in sp_instr_copen::exec_core (this=0x7fff8803c4d8, thd=0x40c4530, nextp=0x7fffc4052808) at /home/psergey/dev-git/10.0/sql/sp_head.cc:3788
  #8  0x000000000098e2c1 in sp_lex_keeper::reset_lex_and_exec_core (this=0x7fff8803acd8, thd=0x40c4530, nextp=0x7fffc4052808, open_tables=false, instr=0x7fff8803c4d8) at /home/psergey/dev-git/10.0/sql/sp_head.cc:2966
  #9  0x000000000098fe52 in sp_instr_copen::execute (this=0x7fff8803c4d8, thd=0x40c4530, nextp=0x7fffc4052808) at /home/psergey/dev-git/10.0/sql/sp_head.cc:3773
  #10 0x000000000098a6b1 in sp_head::execute (this=0x7fff8801e058, thd=0x40c4530, merge_da_on_success=true) at /home/psergey/dev-git/10.0/sql/sp_head.cc:1368
  #11 0x000000000098c318 in sp_head::execute_procedure (this=0x7fff8801e058, thd=0x40c4530, args=0x40c8ee0) at /home/psergey/dev-git/10.0/sql/sp_head.cc:2154
  #12 0x00000000006692d6 in mysql_execute_command (thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sql_parse.cc:4692
  #13 0x000000000066d717 in mysql_parse (thd=0x40c4530, rawbuf=0x7fff88006478 "call stress_7()", length=15, parser_state=0x7fffc4053590) at /home/psergey/dev-git/10.0/sql/sql_parse.cc:6415

when FETCH is running, it is possible that the TABLE object is evicted from the cache already.

Next points:

  • Should try to construct a predictable testcase using DEBUG_SYNC feature.
  • It is not still clear how Field objects are passed between the statements in the SP and where table and orig_table are set.
Comment by Sergei Petrunia [ 2015-07-29 ]

Because of the above, one can reliably cause a crash with these commands:

set global table_open_cache=1;
set global table_definition_cache=1;
call stress_7();

Comment by Sergei Petrunia [ 2015-07-29 ]

http://lists.askmonty.org/pipermail/commits/2015-July/008207.html

sanja, please review.

Comment by Sergei Petrunia [ 2015-12-08 ]

The fix was pushed into 10.0.20 :
```
commit b74795b00cf36b5d8d733e0056a64f2490fd51ea
Author: Sergei Petrunia <psergey@askmonty.org>
Date: Wed Jul 29 21:38:45 2015 +0300

MDEV-7040: Crash in field_conv, memcpy_field_possible, part#2

The problem was with Materialized_cursor and temporary table it uses.
...
```

Generated at Thu Feb 08 07:16:30 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.