MariaDB Server / MDEV-28736

Attempt to join a crashed node to the cluster through IST ends with another crash after wsrep_recover

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 10.6.8
    • Fix Version/s: N/A
    • Component/s: Galera, Platform FreeBSD
    • Labels: None
    • Environment: FreeBSD 13; using custom build TODO-3345

    Description

      The customer reports that an attempt to join a crashed node to the cluster through IST ends with another crash after wsrep_recover. The wsrep thread did not wait for the recovery thread to
      finish its job; instead it initiated the IST procedure and crashed again. Backtraces are below,
      and the full error log is in the case. The platform is FreeBSD.

      2022-05-15  1:55:21 96 [Note] WSREP: Starting applier thread 96
      2022-05-15  1:55:21 97 [Note] WSREP: Starting applier thread 97
      2022-05-15  1:55:21 98 [Note] WSREP: Starting applier thread 98
      2022-05-15  1:55:21 101 [Note] WSREP: Starting applier thread 101
      2022-05-15  1:55:21 99 [Note] WSREP: Starting applier thread 99
      2022-05-15  1:55:21 0 [Note] /usr/local/libexec/mariadbd: ready for connections.
      Version: '10.6.8-4-MariaDB-enterprise-log'  socket: '/tmp/mysql.sock'  port: 3306  MariaDB Enterprise Server
      2022-05-15  1:55:21 103 [Note] WSREP: Starting applier thread 103
      0x13ef71e <my_print_stacktrace+0x2e> at /usr/local/libexec/mariadbd
      mysys/my_addr_resolve.c:299(my_addr_resolve)[0xce6460]
      0x801935580 <pthread_sigmask+0x540> at /lib/libthr.so.3
      thread/thr_sig.c:0(handle_signal)[0x801934b3f]
      0x7ffffffff2d3 <__gxx_personality_v0+0x7ffffeb6ae03> at ???
      0xd9ec24 <thd_get_thread_id+0x4> at /usr/local/libexec/mariadbd
      mysys/my_addr_resolve.c:299(my_addr_resolve)[0x12b9590]
      0x12b8f9b <_Z9lock_waitP9que_thr_t+0x5b> at /usr/local/libexec/mariadbd
      0x132f9b1 <_Z23row_mysql_handle_errorsP7dberr_tP5trx_tP9que_thr_tP12trx_savept_t+0x61> at /usr/local/libexec/mariadbd
      sql/mysqld.cc:1848(unireg_abort)[0x13497a6]
      maria/ma_check.c:3055(writekeys)[0x11a7891]
      maria/ma_check.c:3159(_ma_flush_table_files_before_swap)[0xbc80e2]
      0xbd3d29 <_ZN7handler17rnd_pos_by_recordEPh+0x59> at /usr/local/libexec/mariadbd
      perfschema/pfs_stat.h:76(PFS_single_stat::aggregate(PFS_single_stat const*))[0xc97b40]
      0xc98b70 <_ZN21Update_rows_log_event11do_exec_rowEP14rpl_group_info+0x190> at /usr/local/libexec/mariadbd
      perfschema/table_events_statements.cc:239(table_events_statements_common::make_row_part_1(PFS_events_statements*, sql_digest_storage*))[0xc92f88]
      0xd351b8 <_ZN9Log_event11apply_eventEP14rpl_group_info+0x68> at /usr/local/libexec/mariadbd
      0x1189d12 <_Z18wsrep_apply_eventsP3THDP14Relay_log_infoPKvm+0x392> at /usr/local/libexec/mariadbd
      sql/sys_vars.inl:508(Sys_var_charptr_base)[0x116ba87]
      0x147df0c <_ZN5wsrep12server_state8on_applyERNS_21high_priority_serviceERKNS_9ws_handleERKNS_7ws_metaERKNS_12const_bufferE+0x3cc> at /usr/local/libexec/mariadbd
      0x1489eef <_ZN12_GLOBAL__N_18apply_cbEPvPK15wsrep_ws_handlejPK9wsrep_bufPK14wsrep_trx_metaPb+0xaf> at /usr/local/libexec/mariadbd
      0x80391b690 <wsrep_ps_free_node_stat+0x9510> at /rw_part/usr-local/lib/mysql/libgalera_enterprise_smm.so
      src/trx_handle.cpp:392(galera::TrxHandleSlave::apply(void*, wsrep_cb_status (*)(void*, wsrep_ws_handle const*, unsigned int, wsrep_buf const*, wsrep_trx_meta const*, bool*), wsrep_trx_meta const&, bool&))[0x80392143a]
      0x803969b00 <wsrep_ps_free_node_stat+0x57980> at /rw_part/usr-local/lib/mysql/libgalera_enterprise_smm.so
      0x803969441 <wsrep_ps_free_node_stat+0x572c1> at /rw_part/usr-local/lib/mysql/libgalera_enterprise_smm.so
      0x803920847 <wsrep_ps_free_node_stat+0xe6c7> at /rw_part/usr-local/lib/mysql/libgalera_enterprise_smm.so
      src/replicator_smm.cpp:538(galera::ReplicatorSMM::apply_trx(void*, galera::TrxHandleSlave&))[0x80390b001]
      0x148a63b <_ZN5wsrep18wsrep_provider_v2611run_applierEPNS_21high_priority_serviceE+0xb> at /usr/local/libexec/mariadbd
      mysys/my_addr_resolve.c:299(my_addr_resolve)[0x118a20d]
      0x117a083 <_Z15start_wsrep_THDPv+0x2e3> at /usr/local/libexec/mariadbd
      0x110c177 <pfs_spawn_thread+0xd7> at /usr/local/libexec/mariadbd
       
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x1d02204e5b): UPDATE nl_game_providers.game_bettings SET  ext_bet_status = 'Won' WHERE provider_id = '2' AND recno = '13076845542'
       
      Connection ID (thread ID): 37
      Status: NOT_KILLED
       
      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off
       
      The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
      information that should help you find out what is causing the crash.
      Core pattern: %N.core
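
When triaging a report like this, it can help to check which position the wsrep_recover step reported before the node requested IST. The following is a minimal sketch, not part of the original report: it assumes the full error log contains the usual `WSREP: Recovered position: <uuid>:<seqno>` line, and the helper name `recovered_position` and the sample line are hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches the line the server writes when started with --wsrep-recover.
# Assumption: the full error log (not shown in this report) contains such a line.
RECOVERED_RE = re.compile(
    r"WSREP: Recovered position:\s*([0-9a-fA-F-]{36}):(-?\d+)"
)


def recovered_position(log_text: str) -> Optional[Tuple[str, int]]:
    """Return (cluster UUID, seqno) from the last 'Recovered position' line, or None."""
    matches = RECOVERED_RE.findall(log_text)
    if not matches:
        return None
    uuid, seqno = matches[-1]  # the last recovery run is the relevant one
    return uuid, int(seqno)


if __name__ == "__main__":
    # Hypothetical sample line; it does not come from the customer's log.
    sample = (
        "2022-05-15  1:50:00 0 [Note] WSREP: Recovered position: "
        "00000000-0000-0000-0000-000000000000:-1\n"
    )
    print(recovered_position(sample))
```

A seqno of -1 together with the all-zero UUID would suggest that recovery did not find a usable position, in which case a joiner would normally be expected to fall back to SST instead of IST.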

        Activity

          YK Yakov Kushnirsky created issue -
          elenst Elena Stepanova made changes -
          Field Original Value New Value
          Component/s Galera [ 14918 ]
          Component/s Galera [ 10124 ]
          Component/s Platform FreeBSD [ 10139 ]
          Key MDEV-28600 MENT-1498
          Affects Version/s 10.6.8 [ 27506 ]
          Project MariaDB Server [ 10000 ] MariaDB Enterprise [ 11500 ]
          julien.fritsch Julien Fritsch made changes -
          Affects Version/s 10.6.8-4 [ 27611 ]
          julien.fritsch Julien Fritsch made changes -
          Fix Version/s 10.6 [ 24027 ]
          julien.fritsch Julien Fritsch made changes -
          Assignee Jan Lindström [ jplindst ]
          jplindst Jan Lindström (Inactive) made changes -
          Status Open [ 1 ] Needs Feedback [ 10501 ]
          hholzgra Hartmut Holzgraefe made changes -
          hholzgra Hartmut Holzgraefe made changes -
          Environment FreeBSD FreeBSD 13; using custom build TODO-3345
          YK Yakov Kushnirsky made changes -
          Status Needs Feedback [ 10501 ] Open [ 1 ]
          julien.fritsch Julien Fritsch made changes -
          Assignee Jan Lindström [ jplindst ] Yakov Kushnirsky [ JIRAUSER48657 ]
          Status Open [ 1 ] Needs Feedback [ 10501 ]
          julien.fritsch Julien Fritsch made changes -
          Assignee Yakov Kushnirsky [ JIRAUSER48657 ] Max Mether [ maxmether ]
          ralf.gebhardt Ralf Gebhardt made changes -
          Labels FreeBSD
          serg Sergei Golubchik made changes -
          Description: log excerpt wrapped in {noformat}; text otherwise unchanged
          julien.fritsch Julien Fritsch made changes -
          Component/s Platform FreeBSD [ 18806 ]
          julien.fritsch Julien Fritsch made changes -
          Labels FreeBSD
          julien.fritsch Julien Fritsch made changes -
          Assignee Max Mether [ maxmether ] Yakov Kushnirsky [ JIRAUSER48657 ]
          julien.fritsch Julien Fritsch made changes -
          Component/s Galera [ 10124 ]
          Component/s Platform FreeBSD [ 10139 ]
          Component/s Galera [ 14918 ]
          Component/s Platform FreeBSD [ 18806 ]
          Fix Version/s 10.6 [ 24028 ]
          Fix Version/s 10.6 [ 24027 ]
          Key MENT-1498 MDEV-28736
          Affects Version/s 10.6.8-4 [ 27611 ]
          Assignee Yakov Kushnirsky [ JIRAUSER48657 ]
          Project MariaDB Enterprise [ 11500 ] MariaDB Server [ 10000 ]
          julien.fritsch Julien Fritsch made changes -
          Affects Version/s 10.6.8 [ 27506 ]
          julien.fritsch Julien Fritsch made changes -
          Assignee Jan Lindström [ jplindst ]
          julien.fritsch Julien Fritsch made changes -
          Assignee Jan Lindström [ jplindst ] Yakov Kushnirsky [ JIRAUSER48657 ]
          ralf.gebhardt Ralf Gebhardt made changes -
          Fix Version/s N/A [ 14700 ]
          Fix Version/s 10.6 [ 24028 ]
          Resolution Incomplete [ 4 ]
          Status Needs Feedback [ 10501 ] Closed [ 6 ]
          mariadb-jira-automation Jira Automation (IT) made changes -
          Zendesk Related Tickets 134987

          People

            Assignee: Yakov Kushnirsky
            Reporter: Yakov Kushnirsky
            Votes: 2
            Watchers: 7
