Error 1677 Cluster failing due to sync issues



    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 10.0.38, 10.0.38-galera
    • Fix Version/s: N/A
    • Environment:


      I am running a Galera cluster with two nodes and an arbitrator.

      Log Details

      This error occurred at around 08:30 on 7 October 2019. I am attaching the full logs plus the relevant excerpts for that date range (the excerpt files are named with the suffix "relevant"). mysql1 and mysql2 are the two nodes running in the cluster.

      This is the query that was run around this time. I am not certain that it caused the crash; it may have been a coincidence, but I am mentioning it here in case it helps.

      alter table ch_newdata
          add column feedback_source enum ('mobile', 'plato') null default 'plato' comment 'Source of the feedback' after is_recomended;

      Log Snippet with mysql error (from the mysql1 file)

      191007 8:30:58 [ERROR] mysqld got signal 11 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.

      To report this bug, see https://mariadb.com/kb/en/reporting-bugs

      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.

      Server version: 10.0.38-MariaDB-wsrep
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467095 K bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.

      Thread pointer: 0x7f86b3afb008
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f87074c4cf0 thread_stack 0x48000

      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x0): is an invalid pointer
      Connection ID (thread ID): 26317960
      Status: NOT_KILLED

      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on

      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.

      We think the query pointer is invalid, but we will try to print it anyway.

      191007 08:30:58 mysqld_safe Number of processes running now: 0
      191007 08:30:58 mysqld_safe WSREP: not restarting wsrep node automatically
      191007 08:30:58 mysqld_safe mysqld from pid file /var/lib/mysql/ip-172-31-19-249.ap-south-1.compute.internal.pid ended
      191007 08:51:00 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
      191007 08:51:00 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.R8etVm' --pid-file='/var/lib/mysql/ip-172-31-19-249.ap-south-1.compute.internal-recover.pid'
      191007 8:51:00 [Note] /usr/sbin/mysqld (mysqld 10.0.38-MariaDB-wsrep) starting as process 25772 ...
      191007 08:51:04 mysqld_safe WSREP: Recovered position d7d97fae-52b8-11e9-86bb-5611caffc126:18170886
      191007 8:51:04 [Note] /usr/sbin/mysqld (mysqld 10.0.38-MariaDB-wsrep) starting as process 25814 ...
      191007 8:51:04 [Note] WSREP: Read nil XID from storage engines, skipping position init
      191007 8:51:04 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
      191007 8:51:04 [Note] WSREP: wsrep_load(): Galera 25.3.25(r3836) by Codership Oy <info@codership.com> loaded successfully.
      191007 8:51:04 [Note] WSREP: CRC-32C: using hardware acceleration.
      191007 8:51:04 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 0
      191007 8:51:04 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host =; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S
      191007 8:51:04 [Note] WSREP: GCache history reset: d7d97fae-52b8-11e9-86bb-5611caffc126:0 -> 00000000-0000-0000-0000-000000000000:-1
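
      For triage, the two WSREP position lines in the snippet above (the recovered position and the saved state) can be pulled out of a log with a short script. The following is only a minimal Python sketch, assuming the exact line formats shown in this snippet. The function name extract_wsrep_positions is hypothetical and is not part of any MariaDB or Galera tooling.

```python
import re

# Patterns assume the line formats seen in the log snippet above.
RECOVERED_RE = re.compile(r"WSREP: Recovered position ([0-9a-f-]{36}):(-?\d+)")
SAVED_RE = re.compile(
    r"WSREP: Found saved state: ([0-9a-f-]{36}):(-?\d+), safe_to_bootstrap: (\d)"
)


def extract_wsrep_positions(log_text):
    """Hypothetical helper: collect the recovered position and saved state."""
    result = {"recovered": None, "saved": None, "safe_to_bootstrap": None}
    for line in log_text.splitlines():
        match = RECOVERED_RE.search(line)
        if match:
            # (cluster state UUID, sequence number)
            result["recovered"] = (match.group(1), int(match.group(2)))
            continue
        match = SAVED_RE.search(line)
        if match:
            result["saved"] = (match.group(1), int(match.group(2)))
            result["safe_to_bootstrap"] = int(match.group(3))
    return result


# Two lines copied from the mysql1 log snippet in this report.
sample = """\
191007 08:51:04 mysqld_safe WSREP: Recovered position d7d97fae-52b8-11e9-86bb-5611caffc126:18170886
191007 8:51:04 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 0
"""

info = extract_wsrep_positions(sample)
print(info["recovered"])
print(info["saved"], info["safe_to_bootstrap"])
```

      On the two sample lines, this reports the recovered position d7d97fae-52b8-11e9-86bb-5611caffc126:18170886 and a saved state of the all-zero UUID with seqno -1 and safe_to_bootstrap 0, matching what the log shows after the restart at 08:51.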


