MariaDB Server / MDEV-20307

Server crashes with thread_stack=131072, sometimes without anything in error log

Details

    Description

      The minimum allowed value, thread_stack=131072, may no longer be enough and may need to be increased.

      # Run with 
      --mysqld=--innodb-encrypt-tables  --mysqld=--innodb-encryption-threads=4  --mysqld=--file-key-management --mysqld=--file-key-management-filename=`pwd`/std_data/keys.txt --mysqld=--plugin-load-add=file_key_management --mysqld=--thread_stack=131072
      

      --source include/have_innodb.inc
       
      --connect (con1,localhost,root,,test)
      CREATE TABLE t1 (c MEDIUMBLOB) ENGINE=InnoDB;
      REPLACE INTO t1 () VALUES (),();
       
      --connection default
      OPTIMIZE TABLE t1;
       
      # Cleanup
      --disconnect con1
      --connection default
      DROP TABLE t1;
      

      Sometimes it crashes without a core dump and with only a generic, empty crash report in the error log. Sometimes it crashes with a core dump but with nothing at all in the error log, not even signal 11 or Killed.

      Here is a stack trace from an occurrence when it crashed with a coredump:

      10.4 13f36fff -DCMAKE_BUILD_TYPE=Debug

      #0  0x000055fc3f8b6f1a in fil_space_encrypt (space=<error reading variable: Cannot access memory at address 0x7f726033c368>, offset=<error reading variable: Cannot access memory at address 0x7f726033c360>, lsn=<error reading variable: Cannot access memory at address 0x7f726033c358>, src_frame=<error reading variable: Cannot access memory at address 0x7f726033c350>, dst_frame=<error reading variable: Cannot access memory at address 0x7f726033c348>) at /data/src/10.4/storage/innobase/fil/fil0crypt.cc:719
      #1  0x000055fc3f8280bf in buf_page_encrypt (space=0x7f71d8030120, bpage=0x7f723c043868, src_frame=0x7f723cba8000 "\n\035]^") at /data/src/10.4/storage/innobase/buf/buf0buf.cc:7554
      #2  0x000055fc3f83717f in buf_flush_write_block_low (bpage=0x7f723c043868, flush_type=BUF_FLUSH_SINGLE_PAGE, sync=false) at /data/src/10.4/storage/innobase/buf/buf0flu.cc:1038
      #3  0x000055fc3f8379ff in buf_flush_page (buf_pool=0x55fc41e3da80, bpage=0x7f723c043868, flush_type=BUF_FLUSH_SINGLE_PAGE, sync=false) at /data/src/10.4/storage/innobase/buf/buf0flu.cc:1208
      #4  0x000055fc3f843457 in buf_flush_or_remove_page (buf_pool=0x55fc41e3da80, bpage=0x7f723c043868, flush=true) at /data/src/10.4/storage/innobase/buf/buf0lru.cc:528
      #5  0x000055fc3f8436ea in buf_flush_or_remove_pages (buf_pool=0x55fc41e3da80, id=55, observer=0x7f71d80116e0, first=0) at /data/src/10.4/storage/innobase/buf/buf0lru.cc:594
      #6  0x000055fc3f843893 in buf_flush_dirty_pages (buf_pool=0x55fc41e3da80, id=55, observer=0x7f71d80116e0, first=0) at /data/src/10.4/storage/innobase/buf/buf0lru.cc:686
      #7  0x000055fc3f843abf in buf_LRU_flush_or_remove_pages (id=55, observer=0x7f71d80116e0, first=0) at /data/src/10.4/storage/innobase/buf/buf0lru.cc:718
      #8  0x000055fc3f83f7df in FlushObserver::flush (this=0x7f71d80116e0) at /data/src/10.4/storage/innobase/buf/buf0flu.cc:3778
      #9  0x000055fc3f6e6c25 in row_merge_build_indexes (trx=0x7f724b3ff390, old_table=0x7f71e0060038, new_table=0x7f71d802d6f8, online=true, indexes=0x7f71d802d0a0, key_numbers=0x7f71d802d0a8, n_indexes=1, table=0x7f726035d6b0, defaults=0x0, col_map=0x7f71d802d0f8, add_autoinc=18446744073709551615, sequence=..., skip_pk_sort=true, stage=0x7f71d8030980, add_v=0x0, eval_table=0x7f726035d6b0, allow_not_null=false) at /data/src/10.4/storage/innobase/row/row0merge.cc:5048
      #10 0x000055fc3f5c93b9 in ha_innobase::inplace_alter_table (this=0x7f71e03db268, altered_table=0x7f726035d6b0, ha_alter_info=0x7f726035d620) at /data/src/10.4/storage/innobase/handler/handler0alter.cc:8385
      #11 0x000055fc3f10811d in handler::ha_inplace_alter_table (this=0x7f71e03db268, altered_table=0x7f726035d6b0, ha_alter_info=0x7f726035d620) at /data/src/10.4/sql/handler.h:4324
      #12 0x000055fc3f0fc87f in mysql_inplace_alter_table (thd=0x7f71d8000b00, table_list=0x7f71d8011db8, table=0x7f71e03da400, altered_table=0x7f726035d6b0, ha_alter_info=0x7f726035d620, inplace_supported=HA_ALTER_INPLACE_COPY_NO_LOCK, target_mdl_request=0x7f726035e480, alter_ctx=0x7f726035efb0) at /data/src/10.4/sql/sql_table.cc:7666
      #13 0x000055fc3f10339b in mysql_alter_table (thd=0x7f71d8000b00, new_db=0x55fc3fc17a70 <null_clex_str>, new_name=0x55fc3fc17a70 <null_clex_str>, create_info=0x7f726035fb80, table_list=0x7f71d8011db8, alter_info=0x7f726035fac0, order_num=0, order=0x0, ignore=false) at /data/src/10.4/sql/sql_table.cc:10028
      #14 0x000055fc3f1061c4 in mysql_recreate_table (thd=0x7f71d8000b00, table_list=0x7f71d8011db8, table_copy=false) at /data/src/10.4/sql/sql_table.cc:10871
      #15 0x000055fc3f19e090 in admin_recreate_table (thd=0x7f71d8000b00, table_list=0x7f71d8011db8) at /data/src/10.4/sql/sql_admin.cc:59
      #16 0x000055fc3f1a13ad in mysql_admin_table(THD *, TABLE_LIST *, HA_CHECK_OPT *, const char *, thr_lock_type, bool, bool, uint, int (*)(THD *, TABLE_LIST *, HA_CHECK_OPT *), struct {...}, int (*)(THD *, TABLE_LIST *, HA_CHECK_OPT *)) (thd=0x7f71d8000b00, tables=0x7f71d8011db8, check_opt=0x7f71d8005cd0, operator_name=0x55fc3fc60ab8 "optimize", lock_type=TL_WRITE, org_open_for_modify=true, repair_table_use_frm=false, extra_open_options=0, prepare_func=0x0, operator_func=(int (handler::*)(handler * const, THD *, HA_CHECK_OPT *)) 0x55fc3f35ffbe <handler::ha_optimize(THD*, st_ha_check_opt*)>, view_operator_func=0x0) at /data/src/10.4/sql/sql_admin.cc:1030
      #17 0x000055fc3f1a27b2 in Sql_cmd_optimize_table::execute (this=0x7f71d8012470, thd=0x7f71d8000b00) at /data/src/10.4/sql/sql_admin.cc:1374
      #18 0x000055fc3f02230f in mysql_execute_command (thd=0x7f71d8000b00) at /data/src/10.4/sql/sql_parse.cc:6098
      #19 0x000055fc3f02757f in mysql_parse (thd=0x7f71d8000b00, rawbuf=0x7f71d8011cf8 "OPTIMIZE TABLE t1", length=17, parser_state=0x7f72603611c0, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:7908
      #20 0x000055fc3f013828 in dispatch_command (command=COM_QUERY, thd=0x7f71d8000b00, packet=0x7f71d8008231 "", packet_length=17, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:1843
      #21 0x000055fc3f011f6e in do_command (thd=0x7f71d8000b00) at /data/src/10.4/sql/sql_parse.cc:1360
      #22 0x000055fc3f18bb02 in do_handle_one_connection (connect=0x55fc429384c0) at /data/src/10.4/sql/sql_connect.cc:1404
      #23 0x000055fc3f18b851 in handle_one_connection (arg=0x55fc429384c0) at /data/src/10.4/sql/sql_connect.cc:1306
      #24 0x00007f7262b1a4a4 in start_thread (arg=0x7f7260362700) at pthread_create.c:456
      #25 0x00007f7261062d0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
      

      Here is the ASAN crash:

      10.4 13f36fff ASAN

      ASAN:DEADLYSIGNAL
      =================================================================
      ==11699==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f3d32f6acf4 bp 0x7f3d193a0460 sp 0x7f3d193a03d0 T30)
          #0 0x7f3d32f6acf3  (/usr/lib/x86_64-linux-gnu/libasan.so.3+0x24cf3)
          #1 0x7f3d33007d05 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.3+0xc1d05)
          #2 0x55884cdba747 in mem_heap_create_block_func(mem_block_info_t*, unsigned long, char const*, unsigned int, unsigned long) /data/src/10.4/storage/innobase/mem/mem0mem.cc:269
          #3 0x55884cdbae74 in mem_heap_add_block(mem_block_info_t*, unsigned long) /data/src/10.4/storage/innobase/mem/mem0mem.cc:375
          #4 0x55884cf2eeb8 in mem_heap_alloc /data/src/10.4/storage/innobase/include/mem0mem.ic:203
          #5 0x55884cf372a5 in sel_node_create(mem_block_info_t*) /data/src/10.4/storage/innobase/row/row0sel.cc:359
          #6 0x55884ce27d81 in pars_select_list(void*, sym_node_t*) /data/src/10.4/storage/innobase/pars/pars0pars.cc:897
          #7 0x55884ce29281 in pars_update_statement(upd_node_t*, sym_node_t*, void*) /data/src/10.4/storage/innobase/pars/pars0pars.cc:1240
          #8 0x55884d2bf150 in yyparse() /dev/shm/tmp_build/storage/innobase/pars0grm.y:438
          #9 0x55884ce2bfa6 in pars_sql(pars_info_t*, char const*) /data/src/10.4/storage/innobase/pars/pars0pars.cc:2131
          #10 0x55884ce362f7 in que_eval_sql(pars_info_t*, char const*, bool, trx_t*) /data/src/10.4/storage/innobase/que/que0que.cc:1208
          #11 0x55884cee2729 in row_drop_table_for_mysql(char const*, trx_t*, enum_sql_command, bool, bool) /data/src/10.4/storage/innobase/row/row0mysql.cc:3667
          #12 0x55884cec1372 in row_merge_drop_table(trx_t*, dict_table_t*) /data/src/10.4/storage/innobase/row/row0merge.cc:4499
          #13 0x55884cccf7f7 in ha_innobase::commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool) /data/src/10.4/storage/innobase/handler/handler0alter.cc:11338
          #14 0x55884c729c97 in handler::ha_commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool) /data/src/10.4/sql/handler.cc:4575
          #15 0x55884c1ef745 in mysql_inplace_alter_table /data/src/10.4/sql/sql_table.cc:7717
          #16 0x55884c1fd5ea in mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool) /data/src/10.4/sql/sql_table.cc:10028
          #17 0x55884c202e00 in mysql_recreate_table(THD*, TABLE_LIST*, bool) /data/src/10.4/sql/sql_table.cc:10871
          #18 0x55884c360227 in admin_recreate_table /data/src/10.4/sql/sql_admin.cc:59
          #19 0x55884c36714b in mysql_admin_table /data/src/10.4/sql/sql_admin.cc:1030
          #20 0x55884c369d03 in Sql_cmd_optimize_table::execute(THD*) /data/src/10.4/sql/sql_admin.cc:1374
          #21 0x55884bfe25b8 in mysql_execute_command(THD*) /data/src/10.4/sql/sql_parse.cc:6098
          #22 0x55884bfeca3d in mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) /data/src/10.4/sql/sql_parse.cc:7908
          #23 0x55884bfc6f71 in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) /data/src/10.4/sql/sql_parse.cc:1843
          #24 0x55884bfc3e7f in do_command(THD*) /data/src/10.4/sql/sql_parse.cc:1360
          #25 0x55884c338a38 in do_handle_one_connection(CONNECT*) /data/src/10.4/sql/sql_connect.cc:1404
          #26 0x55884c3383ec in handle_one_connection /data/src/10.4/sql/sql_connect.cc:1306
          #27 0x55884d6b76d9 in pfs_spawn_thread /data/src/10.4/storage/perfschema/pfs.cc:1862
          #28 0x7f3d32d304a3 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x74a3)
          #29 0x7f3d31278d0e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xe8d0e)
       
      AddressSanitizer can not provide additional info.
      SUMMARY: AddressSanitizer: SEGV (/usr/lib/x86_64-linux-gnu/libasan.so.3+0x24cf3) 
      Thread T30 created by T0 here:
          #0 0x7f3d32f76f59 in __interceptor_pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.3+0x30f59)
          #1 0x55884d6b7ac6 in spawn_thread_v1 /data/src/10.4/storage/perfschema/pfs.cc:1912
          #2 0x55884bd2bef8 in inline_mysql_thread_create /data/src/10.4/include/mysql/psi/mysql_thread.h:1268
          #3 0x55884bd4002a in create_thread_to_handle_connection(CONNECT*) /data/src/10.4/sql/mysqld.cc:6238
          #4 0x55884bd4070d in create_new_thread(CONNECT*) /data/src/10.4/sql/mysqld.cc:6308
          #5 0x55884bd40a98 in handle_accepted_socket(st_mysql_socket, st_mysql_socket) /data/src/10.4/sql/mysqld.cc:6406
          #6 0x55884bd416ea in handle_connections_sockets() /data/src/10.4/sql/mysqld.cc:6564
          #7 0x55884bd3f8ab in mysqld_main(int, char**) /data/src/10.4/sql/mysqld.cc:5896
          #8 0x55884bd29ddf in main /data/src/10.4/sql/main.cc:25
          #9 0x7f3d311b02e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
       
      ==11699==ABORTING
      

      Also reproducible on 10.5.
      Not reproducible (with the provided test case) on 10.3.

          Activity

            marko Marko Mäkelä added a comment:

            elenst, can you try this crude patch?
            If it helps, then we could change the debug instrumentation to use heap allocation, instead of allocating at least 64KiB from the stack.

            diff --git a/storage/innobase/fil/fil0crypt.cc b/storage/innobase/fil/fil0crypt.cc
            --- a/storage/innobase/fil/fil0crypt.cc
            +++ b/storage/innobase/fil/fil0crypt.cc
            @@ -763,7 +763,7 @@ fil_space_encrypt(
             				    src_frame, zip_size, dst_frame,
             				    full_crc32);
             
            -#ifdef UNIV_DEBUG
            +#if 0
             	if (tmp) {
             		/* Verify that encrypted buffer is not corrupted */
             		dberr_t err = DB_SUCCESS;
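The heap-allocation idea in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the InnoDB code: `toy_encrypt`, `toy_decrypt` and `verify_round_trip_heap` are hypothetical names, the XOR "cipher" merely stands in for real page encryption, and the 64 KiB frame size and 131072-byte stack are the figures quoted in this report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Figures quoted in the report: the minimum thread_stack value and the
// "at least 64KiB" temporary frame used by the debug check.
constexpr std::size_t kThreadStack = 131072;    // 128 KiB
constexpr std::size_t kFrameSize   = 64 * 1024; // 64 KiB

// One stack-allocated frame of this size would consume half of the
// entire thread stack before any other call frame is counted.
static_assert(kFrameSize * 2 == kThreadStack,
              "a 64 KiB frame is half of a 128 KiB stack");

// Hypothetical stand-in for page encryption (plain XOR, NOT InnoDB code).
void toy_encrypt(const unsigned char* src, unsigned char* dst,
                 std::size_t n, unsigned char key) {
  for (std::size_t i = 0; i < n; ++i) dst[i] = src[i] ^ key;
}

// Hypothetical stand-in for page decryption (XOR is its own inverse).
void toy_decrypt(const unsigned char* src, unsigned char* dst,
                 std::size_t n, unsigned char key) {
  for (std::size_t i = 0; i < n; ++i) dst[i] = src[i] ^ key;
}

// Debug-style round-trip check whose scratch frame lives on the heap,
// so the check adds only a few bytes to the caller's stack usage.
bool verify_round_trip_heap(const unsigned char* plain,
                            const unsigned char* encrypted,
                            std::size_t n, unsigned char key) {
  assert(plain != nullptr && encrypted != nullptr);
  std::unique_ptr<unsigned char[]> scratch(new unsigned char[n]);
  toy_decrypt(encrypted, scratch.get(), n, key);
  return std::memcmp(plain, scratch.get(), n) == 0;
}
```

A caller would encrypt a frame with `toy_encrypt` and then pass the plain and encrypted frames to `verify_round_trip_heap`. The trade-off is one heap allocation per check, which may be acceptable in a debug-only code path but is still extra work on every page write.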

            elenst Elena Stepanova added a comment:

            mleich, could you test marko's patch please, if it needs testing?
            mleich Matthias Leich added a comment (edited):

            I have tried the test with the setup (see above) against
            - current 10.4
            - 10.4 commit 13f36fffeaecf316435fc497b0f3ae2a5d58d749   (10.4.8)
            in both cases compiled with debug, but never got a replay.

            elenst Elena Stepanova added a comment:

            The test case doesn't fail for me with the patch provided above, neither on debug nor on ASAN builds. It still fails readily on the current 10.4 tree without the patch.

            marko Marko Mäkelä added a comment:

            I think that the debug check that was added to fil_space_encrypt() in MDEV-9931 is totally useless and must be removed. In the encryption and mariabackup test suites, we are exercising the page decryption well enough. It is not necessary to decrypt and compare pages every time after encrypting a page.

            marko Marko Mäkelä added a comment:

            I removed the useless debug check. This did not affect release builds.

            People

              marko Marko Mäkelä
              elenst Elena Stepanova