Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-7040

Crash in field_conv, memcpy_field_possible, part#2

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.0.13
    • 10.0.20
    • Optimizer
    • None

    Description

      This issue is to track the fill fix for MDEV-6799, CSC#8034.

      Fix for MDEV-6799 made the crash go away, but it didn't fix the root of the problem. I've made an incomplete fix, because I was unable to repeat the problem. Now, we've got a testcase that can crash 10.0.13 on my machine, so it's possible to work on fixing the root cause of the problem.

      Attachments

        Issue Links

          Activity

            We've got a script that allows to repeat the problem: see CSC#8034, stress_minimal.sql.
            I was using 10.0.13, compiled with "cmake . ; make " and I got:

            ./mysqld(my_print_stacktrace+0x24)[0xac2584]
            ./mysqld(handle_fatal_signal+0x3d8)[0x6ecf68]
            /lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7f17fa45acb0]
            ./mysqld(_ZNK12Field_string4typeEv+0x1a)[0x6e605a]
            ./mysqld(_Z21memcpy_field_possibleP5FieldS0_+0x4c)[0x6e8f2c]
            ./mysqld(_Z10field_convP5FieldS0_+0x19)[0x6e9449]
            ./mysqld[0x6fe6c9]
            ./mysqld(_Z12sp_eval_exprP3THDP5FieldPP4Item+0x7e)[0x7d9b1e]
            ./mysqld(_ZN11sp_rcontext12set_variableEP3THDP5FieldPP4Item+0x17)[0x7e0cd7]
            ./mysqld(_ZN9sp_cursor24Select_fetch_into_spvars9send_dataER4ListI4ItemE+0x56)[0x7e0d56]
            ./mysqld(_ZN19Materialized_cursor5fetchEm+0x4c)[0x7ec1dc]
            ./mysqld(_ZN9sp_cursor5fetchEP3THDP4ListI11sp_variableE+0x4f)[0x7e110f]
            ./mysqld(_ZN15sp_instr_cfetch7executeEP3THDPj+0x2f)[0x7d7a3f]
            ./mysqld(_ZN7sp_head7executeEP3THDb+0x754)[0x7dac04]
            ./mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x581)[0x7dbf61]
            ./mysqld(_Z21mysql_execute_commandP3THD+0x3178)[0x5b5038]
            ./mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x169)[0x5b9bc9]
            ./mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x16f9)[0x5bb3a9]
            ./mysqld(_Z24do_handle_one_connectionP3THD+0x1e4)[0x668ae4]
            ./mysqld(handle_one_connection+0x40)[0x668b70]
            /lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f17fa452e9a]
            /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f17f93463fd]

            psergei Sergei Petrunia added a comment - We've got a script that allows to repeat the problem: see CSC#8034, stress_minimal.sql. I was using 10.0.13, compiled with "cmake . ; make " and I got: ./mysqld(my_print_stacktrace+0x24)[0xac2584] ./mysqld(handle_fatal_signal+0x3d8)[0x6ecf68] /lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7f17fa45acb0] ./mysqld(_ZNK12Field_string4typeEv+0x1a)[0x6e605a] ./mysqld(_Z21memcpy_field_possibleP5FieldS0_+0x4c)[0x6e8f2c] ./mysqld(_Z10field_convP5FieldS0_+0x19)[0x6e9449] ./mysqld[0x6fe6c9] ./mysqld(_Z12sp_eval_exprP3THDP5FieldPP4Item+0x7e)[0x7d9b1e] ./mysqld(_ZN11sp_rcontext12set_variableEP3THDP5FieldPP4Item+0x17)[0x7e0cd7] ./mysqld(_ZN9sp_cursor24Select_fetch_into_spvars9send_dataER4ListI4ItemE+0x56)[0x7e0d56] ./mysqld(_ZN19Materialized_cursor5fetchEm+0x4c)[0x7ec1dc] ./mysqld(_ZN9sp_cursor5fetchEP3THDP4ListI11sp_variableE+0x4f)[0x7e110f] ./mysqld(_ZN15sp_instr_cfetch7executeEP3THDPj+0x2f)[0x7d7a3f] ./mysqld(_ZN7sp_head7executeEP3THDb+0x754)[0x7dac04] ./mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x581)[0x7dbf61] ./mysqld(_Z21mysql_execute_commandP3THD+0x3178)[0x5b5038] ./mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x169)[0x5b9bc9] ./mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x16f9)[0x5bb3a9] ./mysqld(_Z24do_handle_one_connectionP3THD+0x1e4)[0x668ae4] ./mysqld(handle_one_connection+0x40)[0x668b70] /lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f17fa452e9a] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f17f93463fd]
            psergei Sergei Petrunia added a comment - - edited

            So we are at the same stack trace (although not looking at the same field):

            (gdb) wher
              #0  field_conv (to=0x7fff88026558, from=0x7fff88066858) at /home/psergey/dev-git/10.0/sql/field_conv.cc:865
              #1  0x000000000086f6e8 in save_field_in_field (from=0x7fff88066858, null_value=0x7fff88068787, to=0x7fff88026558, no_conversions=false) at /home/psergey/dev-git/10.0/sql/item.cc:5870
              #2  0x000000000086f958 in Item_field::save_in_field (this=0x7fff88068728, to=0x7fff88026558, no_conversions=false) at /home/psergey/dev-git/10.0/sql/item.cc:5926
              #3  0x00000000009883c6 in sp_eval_expr (thd=0x40c4530, result_field=0x7fff88026558, expr_item_ptr=0x7fffc4052460) at /home/psergey/dev-git/10.0/sql/sp_head.cc:428
              #4  0x0000000000994d7a in sp_rcontext::set_variable (this=0x7fff880253a8, thd=0x40c4530, field=0x7fff88026558, value=0x7fffc4052460) at /home/psergey/dev-git/10.0/sql/sp_rcontext.cc:377
              #5  0x00000000006359ba in sp_rcontext::set_variable (this=0x7fff880253a8, thd=0x40c4530, var_idx=0, value=0x7fffc4052460) at /home/psergey/dev-git/10.0/sql/sp_rcontext.h:201
              #6  0x00000000009953c5 in sp_cursor::Select_fetch_into_spvars::send_data (this=0x7fff88027c00, items=...) at /home/psergey/dev-git/10.0/sql/sp_rcontext.cc:553
              #7  0x00000000009a4486 in Materialized_cursor::fetch (this=0x7fff88067ef8, num_rows=1) at /home/psergey/dev-git/10.0/sql/sql_cursor.cc:339
              #8  0x0000000000995287 in sp_cursor::fetch (this=0x7fff88027bf8, thd=0x40c4530, vars=0x7fff88042418) at /home/psergey/dev-git/10.0/sql/sp_rcontext.cc:504
              #9  0x00000000009901bc in sp_instr_cfetch::execute (this=0x7fff880423e0, thd=0x40c4530, nextp=0x7fffc4052808) at /home/psergey/dev-git/10.0/sql/sp_head.cc:3869
              #10 0x000000000098a6b1 in sp_head::execute (this=0x7fff8801e058, thd=0x40c4530, merge_da_on_success=true) at /home/psergey/dev-git/10.0/sql/sp_head.cc:1368
              #11 0x000000000098c318 in sp_head::execute_procedure (this=0x7fff8801e058, thd=0x40c4530, args=0x40c8ee0) at /home/psergey/dev-git/10.0/sql/sp_head.cc:2154
              #12 0x00000000006692d6 in mysql_execute_command (thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sql_parse.cc:4692
              #13 0x000000000066d717 in mysql_parse (thd=0x40c4530, rawbuf=0x7fff88006478 "call stress_7()", length=15, parser_state=0x7fffc4053590) at /home/psergey/dev-git/10.0/sql/sql_parse.cc:6415

            (gdb) p from->table
              $174 = (TABLE *) 0x7fff880651e8
            (gdb) p from->orig_table
              $175 = (TABLE *) 0x7fff8802b510

            from->orig_table comes from here:

              Breakpoint 11, ha_myisam::open (this=0x7fff8802bf38, name=0x7fff88028f68 "./db/stress_data_7", mode=2, test_if_locked=18) at /home/psergey/dev-git/10.0/storage/myisam/ha_myisam.cc:733
            (gdb) p this->table
              $163 = (TABLE *) 0x7fff8802b510

            from->table comes from here:

              Breakpoint 15, open_tmp_table (table=0x7fff880651e8) at /home/psergey/dev-git/10.0/sql/sql_select.cc:16608
            (gdb) next
              Breakpoint 12, ha_heap::open (this=0x7fff880671a8, name=0x7fff88066780 "/data0/psergey/tmp/#sql_5355_0", mode=2, test_if_locked=516) at /home/psergey/dev-git/10.0/storage/heap/ha_heap.cc:93
            (gdb) p this->table
              $166 = (TABLE *) 0x7fff880651e8

            So, could it be possible that the original ha_myisam table is already closed when FETCH statement is running?

            On one hand, SPs use "pre-locking" feature: if a statement uses SPs, the runtime will collect a list of tables it needs to lock, and then lock/unlock all of them at once..

            On the other hand, CALL sp may be the only statement that's exempt from mandatory prelocking due to need to have higher concurrency for replication.. checking...

            psergei Sergei Petrunia added a comment - - edited So we are at the same stack trace (although not looking at the same field): (gdb) wher #0 field_conv (to=0x7fff88026558, from=0x7fff88066858) at /home/psergey/dev-git/10.0/sql/field_conv.cc:865 #1 0x000000000086f6e8 in save_field_in_field (from=0x7fff88066858, null_value=0x7fff88068787, to=0x7fff88026558, no_conversions=false) at /home/psergey/dev-git/10.0/sql/item.cc:5870 #2 0x000000000086f958 in Item_field::save_in_field (this=0x7fff88068728, to=0x7fff88026558, no_conversions=false) at /home/psergey/dev-git/10.0/sql/item.cc:5926 #3 0x00000000009883c6 in sp_eval_expr (thd=0x40c4530, result_field=0x7fff88026558, expr_item_ptr=0x7fffc4052460) at /home/psergey/dev-git/10.0/sql/sp_head.cc:428 #4 0x0000000000994d7a in sp_rcontext::set_variable (this=0x7fff880253a8, thd=0x40c4530, field=0x7fff88026558, value=0x7fffc4052460) at /home/psergey/dev-git/10.0/sql/sp_rcontext.cc:377 #5 0x00000000006359ba in sp_rcontext::set_variable (this=0x7fff880253a8, thd=0x40c4530, var_idx=0, value=0x7fffc4052460) at /home/psergey/dev-git/10.0/sql/sp_rcontext.h:201 #6 0x00000000009953c5 in sp_cursor::Select_fetch_into_spvars::send_data (this=0x7fff88027c00, items=...) at /home/psergey/dev-git/10.0/sql/sp_rcontext.cc:553 #7 0x00000000009a4486 in Materialized_cursor::fetch (this=0x7fff88067ef8, num_rows=1) at /home/psergey/dev-git/10.0/sql/sql_cursor.cc:339 #8 0x0000000000995287 in sp_cursor::fetch (this=0x7fff88027bf8, thd=0x40c4530, vars=0x7fff88042418) at /home/psergey/dev-git/10.0/sql/sp_rcontext.cc:504 #9 0x00000000009901bc in sp_instr_cfetch::execute (this=0x7fff880423e0, thd=0x40c4530, nextp=0x7fffc4052808) at /home/psergey/dev-git/10.0/sql/sp_head.cc:3869 #10 0x000000000098a6b1 in sp_head::execute (this=0x7fff8801e058, thd=0x40c4530, merge_da_on_success=true) at /home/psergey/dev-git/10.0/sql/sp_head.cc:1368 #11 0x000000000098c318 in sp_head::execute_procedure (this=0x7fff8801e058, thd=0x40c4530, args=0x40c8ee0) at /home/psergey/dev-git/10.0/sql/sp_head.cc:2154 #12 0x00000000006692d6 in mysql_execute_command (thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sql_parse.cc:4692 #13 0x000000000066d717 in mysql_parse (thd=0x40c4530, rawbuf=0x7fff88006478 "call stress_7()", length=15, parser_state=0x7fffc4053590) at /home/psergey/dev-git/10.0/sql/sql_parse.cc:6415 (gdb) p from->table $174 = (TABLE *) 0x7fff880651e8 (gdb) p from->orig_table $175 = (TABLE *) 0x7fff8802b510 from->orig_table comes from here: Breakpoint 11, ha_myisam::open (this=0x7fff8802bf38, name=0x7fff88028f68 "./db/stress_data_7", mode=2, test_if_locked=18) at /home/psergey/dev-git/10.0/storage/myisam/ha_myisam.cc:733 (gdb) p this->table $163 = (TABLE *) 0x7fff8802b510 from->table comes from here: Breakpoint 15, open_tmp_table (table=0x7fff880651e8) at /home/psergey/dev-git/10.0/sql/sql_select.cc:16608 (gdb) next Breakpoint 12, ha_heap::open (this=0x7fff880671a8, name=0x7fff88066780 "/data0/psergey/tmp/#sql_5355_0", mode=2, test_if_locked=516) at /home/psergey/dev-git/10.0/storage/heap/ha_heap.cc:93 (gdb) p this->table $166 = (TABLE *) 0x7fff880651e8 So, could it be possible that the original ha_myisam table is already closed when FETCH statement is running? On one hand, SPs use "pre-locking" feature: if a statement uses SPs, the runtime will collect a list of tables it needs to lock, and then lock/unlock all of them at once.. On the other hand, CALL sp may be the only statement that's exempt from mandatory prelocking due to need to have higher concurrency for replication.. checking...

            Debugged and verified: OPEN statement closes tables and releases them here:

              #0  tc_release_table (table=0x7fff8805aba0) at /home/psergey/dev-git/10.0/sql/table_cache.cc:400
              #1  0x0000000000600852 in close_thread_table (thd=0x40c4530, table_ptr=0x40c45f0) at /home/psergey/dev-git/10.0/sql/sql_base.cc:1056
              #2  0x0000000000600067 in close_open_tables (thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sql_base.cc:779
              #3  0x00000000006005df in close_thread_tables (thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sql_base.cc:1003
              #4  0x000000000066a9e7 in mysql_execute_command (thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sql_parse.cc:5143
              #5  0x00000000009a3ca6 in mysql_open_cursor (thd=0x40c4530, result=0x7fff88027c00, pcursor=0x7fff88027c30) at /home/psergey/dev-git/10.0/sql/sql_cursor.cc:123
              #6  0x000000000099505c in sp_cursor::open (this=0x7fff88027bf8, thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sp_rcontext.cc:456
              #7  0x000000000098feed in sp_instr_copen::exec_core (this=0x7fff8803c4d8, thd=0x40c4530, nextp=0x7fffc4052808) at /home/psergey/dev-git/10.0/sql/sp_head.cc:3788
              #8  0x000000000098e2c1 in sp_lex_keeper::reset_lex_and_exec_core (this=0x7fff8803acd8, thd=0x40c4530, nextp=0x7fffc4052808, open_tables=false, instr=0x7fff8803c4d8) at /home/psergey/dev-git/10.0/sql/sp_head.cc:2966
              #9  0x000000000098fe52 in sp_instr_copen::execute (this=0x7fff8803c4d8, thd=0x40c4530, nextp=0x7fffc4052808) at /home/psergey/dev-git/10.0/sql/sp_head.cc:3773
              #10 0x000000000098a6b1 in sp_head::execute (this=0x7fff8801e058, thd=0x40c4530, merge_da_on_success=true) at /home/psergey/dev-git/10.0/sql/sp_head.cc:1368
              #11 0x000000000098c318 in sp_head::execute_procedure (this=0x7fff8801e058, thd=0x40c4530, args=0x40c8ee0) at /home/psergey/dev-git/10.0/sql/sp_head.cc:2154
              #12 0x00000000006692d6 in mysql_execute_command (thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sql_parse.cc:4692
              #13 0x000000000066d717 in mysql_parse (thd=0x40c4530, rawbuf=0x7fff88006478 "call stress_7()", length=15, parser_state=0x7fffc4053590) at /home/psergey/dev-git/10.0/sql/sql_parse.cc:6415

            when FETCH is running, it is possible that the TABLE object is evicted from the cache already.

            Next points:

            • Should try to construct a predictable testcase using DEBUG_SYNC feature.
            • It is not still clear how Field objects are passed between the statements in the SP and where table and orig_table are set.
            psergei Sergei Petrunia added a comment - Debugged and verified: OPEN statement closes tables and releases them here: #0 tc_release_table (table=0x7fff8805aba0) at /home/psergey/dev-git/10.0/sql/table_cache.cc:400 #1 0x0000000000600852 in close_thread_table (thd=0x40c4530, table_ptr=0x40c45f0) at /home/psergey/dev-git/10.0/sql/sql_base.cc:1056 #2 0x0000000000600067 in close_open_tables (thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sql_base.cc:779 #3 0x00000000006005df in close_thread_tables (thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sql_base.cc:1003 #4 0x000000000066a9e7 in mysql_execute_command (thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sql_parse.cc:5143 #5 0x00000000009a3ca6 in mysql_open_cursor (thd=0x40c4530, result=0x7fff88027c00, pcursor=0x7fff88027c30) at /home/psergey/dev-git/10.0/sql/sql_cursor.cc:123 #6 0x000000000099505c in sp_cursor::open (this=0x7fff88027bf8, thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sp_rcontext.cc:456 #7 0x000000000098feed in sp_instr_copen::exec_core (this=0x7fff8803c4d8, thd=0x40c4530, nextp=0x7fffc4052808) at /home/psergey/dev-git/10.0/sql/sp_head.cc:3788 #8 0x000000000098e2c1 in sp_lex_keeper::reset_lex_and_exec_core (this=0x7fff8803acd8, thd=0x40c4530, nextp=0x7fffc4052808, open_tables=false, instr=0x7fff8803c4d8) at /home/psergey/dev-git/10.0/sql/sp_head.cc:2966 #9 0x000000000098fe52 in sp_instr_copen::execute (this=0x7fff8803c4d8, thd=0x40c4530, nextp=0x7fffc4052808) at /home/psergey/dev-git/10.0/sql/sp_head.cc:3773 #10 0x000000000098a6b1 in sp_head::execute (this=0x7fff8801e058, thd=0x40c4530, merge_da_on_success=true) at /home/psergey/dev-git/10.0/sql/sp_head.cc:1368 #11 0x000000000098c318 in sp_head::execute_procedure (this=0x7fff8801e058, thd=0x40c4530, args=0x40c8ee0) at /home/psergey/dev-git/10.0/sql/sp_head.cc:2154 #12 0x00000000006692d6 in mysql_execute_command (thd=0x40c4530) at /home/psergey/dev-git/10.0/sql/sql_parse.cc:4692 #13 0x000000000066d717 in mysql_parse (thd=0x40c4530, rawbuf=0x7fff88006478 "call stress_7()", length=15, parser_state=0x7fffc4053590) at /home/psergey/dev-git/10.0/sql/sql_parse.cc:6415 when FETCH is running, it is possible that the TABLE object is evicted from the cache already. Next points: Should try to construct a predictable testcase using DEBUG_SYNC feature. It is not still clear how Field objects are passed between the statements in the SP and where table and orig_table are set.

            Because of the above, one can reliably cause a crash with these commands:

            set global table_open_cache=1;
            set global table_definition_cache=1;
            call stress_7();

            psergei Sergei Petrunia added a comment - Because of the above, one can reliably cause a crash with these commands: set global table_open_cache=1; set global table_definition_cache=1; call stress_7();
            psergei Sergei Petrunia added a comment - http://lists.askmonty.org/pipermail/commits/2015-July/008207.html sanja , please review.

            The fix was pushed into 10.0.20 :
            ```
            commit b74795b00cf36b5d8d733e0056a64f2490fd51ea
            Author: Sergei Petrunia <psergey@askmonty.org>
            Date: Wed Jul 29 21:38:45 2015 +0300

            MDEV-7040: Crash in field_conv, memcpy_field_possible, part#2

            The problem was with Materialized_cursor and temporary table it uses.
            ...
            ```

            psergei Sergei Petrunia added a comment - The fix was pushed into 10.0.20 : ``` commit b74795b00cf36b5d8d733e0056a64f2490fd51ea Author: Sergei Petrunia <psergey@askmonty.org> Date: Wed Jul 29 21:38:45 2015 +0300 MDEV-7040 : Crash in field_conv, memcpy_field_possible, part#2 The problem was with Materialized_cursor and temporary table it uses. ... ```

            People

              psergei Sergei Petrunia
              psergei Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.