MariaDB Server / MDEV-30237

Unable to bootstrap cluster from crashed last survivor

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.6.11
    • Fix Version/s: None
    • Component/s: Galera
    • Environment: Ubuntu 20.04

    Description

      Hello,
      I am running a two-node MariaDB Galera cluster plus a Galera arbitrator node. Today I wanted to simulate a crash, so I powered off one of the nodes, intranet-test1.
      The database kept running fine on the other node, intranet-test2, but when I started intranet-test1 again, the working node crashed, leaving every MariaDB server instance down.

      I am sure the last survivor was intranet-test2, so I edited grastate.dat, set safe_to_bootstrap: 1, and ran galera_new_cluster, but this ended with the following error:

      {{2022-12-15 18:05:21 0 [Note] WSREP: Loading provider /usr/lib/libgalera_smm.so initial position: a6d885cf-23bd-11ed-a7b5-fb2da02f3a3d:80532
      2022-12-15 18:05:21 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
      2022-12-15 18:05:21 0 [Note] WSREP: wsrep_load(): Galera 26.4.13(rfe497aeb) by Codership Oy <info@codership.com> loaded successfully.
      2022-12-15 18:05:21 0 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
      2022-12-15 18:05:21 0 [Note] WSREP: SSL cipher list set to 'AES128-SHA256'
      2022-12-15 18:05:21 0 [Note] WSREP: Found saved state: a6d885cf-23bd-11ed-a7b5-fb2da02f3a3d:-1, safe_to_bootstrap: 1
      2022-12-15 18:05:21 0 [Note] WSREP: GCache DEBUG: opened preamble:
      Version: 2
      UUID: a6d885cf-23bd-11ed-a7b5-fb2da02f3a3d
      Seqno: -1 - -1
      Offset: -1
      Synced: 0
      2022-12-15 18:05:21 0 [Note] WSREP: Recovering GCache ring buffer: version: 2, UUID: a6d885cf-23bd-11ed-a7b5-fb2da02f3a3d, offset: -1
      2022-12-15 18:05:21 0 [Note] WSREP: GCache::RingBuffer initial scan... 0.0% ( 0/134217752 bytes) complete.
      2022-12-15 18:05:21 0 [Note] WSREP: GCache::RingBuffer initial scan...100.0% (134217752/134217752 bytes) complete.
      2022-12-15 18:05:21 0 [Note] WSREP: Recovering GCache ring buffer: found gapless sequence 15762582740729856-15762582740729856
      2022-12-15 18:05:21 0 [Note] WSREP: GCache::RingBuffer unused buffers scan... 0.0% ( 0/33310984 bytes) complete.
      2022-12-15 18:05:21 0 [Note] WSREP: Recovering GCache ring buffer: found 0/1 locked buffers
      2022-12-15 18:05:21 0 [Note] WSREP: Recovering GCache ring buffer: free space: 100906744/134217728
      2022-12-15 18:05:21 0 [Note] WSREP: GCache::RingBuffer unused buffers scan...100.0% (33310984/33310984 bytes) complete.
      2022-12-15 18:05:21 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.2.55; base_port = 4567; cert.log_conflicts = ON; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.keep_plaintext_size = 128M; gcache.mem_size = 0; gcache.name = galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.fc_single_primary = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0;
      2022-12-15 18:05:21 0 [Note] WSREP: SSL cipher list set to 'AES128-SHA256'
      2022-12-15 18:05:21 0 [Note] WSREP: Service thread queue flushed.
      2022-12-15 18:05:21 0 [Note] WSREP: ####### Assign initial position for certification: a6d885cf-23bd-11ed-a7b5-fb2da02f3a3d:80532, protocol version: -1
      munmap_chunk(): invalid pointer
      221215 18:05:21 [ERROR] mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.

      To report this bug, see https://mariadb.com/kb/en/reporting-bugs

      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.

      Server version: 10.6.11-MariaDB-1:10.6.11+maria~ubu2004
      key_buffer_size=0
      read_buffer_size=131072
      max_used_connections=0
      max_threads=153
      thread_count=0
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 336894 K bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.

      Thread pointer: 0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x49000
      Printing to addr2line failed
      /usr/sbin/mariadbd(my_print_stacktrace+0x32)[0x5612742148b2]
      /usr/sbin/mariadbd(handle_fatal_signal+0x485)[0x561273cd21a5]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7fec95407420]
      /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7fec94f0b00b]
      /lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7fec94eea859]
      /lib/x86_64-linux-gnu/libc.so.6(+0x8d26e)[0x7fec94f5526e]
      /lib/x86_64-linux-gnu/libc.so.6(+0x952fc)[0x7fec94f5d2fc]
      /lib/x86_64-linux-gnu/libc.so.6(+0x9554c)[0x7fec94f5d54c]
      /usr/lib/libgalera_smm.so(+0x1cdaec)[0x7fec94abaaec]
      /usr/lib/libgalera_smm.so(+0x1ce12e)[0x7fec94abb12e]
      /usr/lib/libgalera_smm.so(+0x1b300a)[0x7fec94aa000a]
      /usr/lib/libgalera_smm.so(+0x821c8)[0x7fec9496f1c8]
      /usr/lib/libgalera_smm.so(+0x51c52)[0x7fec9493ec52]
      /usr/sbin/mariadbd(_ZN5wsrep18wsrep_provider_v26C1ERNS_12server_stateERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESA_RKNS_8provider8servicesE+0x1ec)[0x5612742b26cc]
      /usr/sbin/mariadbd(_ZN5wsrep8provider13make_providerERNS_12server_stateERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESA_RKNS0_8servicesE+0x54)[0x5612742af494]
      /usr/sbin/mariadbd(_ZN5wsrep12server_state13load_providerERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES8_RKNS_8provider8servicesE+0x1f3)[0x56127429a753]
      /usr/sbin/mariadbd(_Z10wsrep_initv+0x193)[0x561273f8f643]
      /usr/sbin/mariadbd(_Z18wsrep_init_startupb+0x16)[0x561273f8fcf6]
      /usr/sbin/mariadbd(+0x6d5eb1)[0x5612739c1eb1]
      /usr/sbin/mariadbd(_Z11mysqld_mainiPPc+0x3fd)[0x5612739c6bbd]
      /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fec94eec083]
      /usr/sbin/mariadbd(_start+0x2e)[0x5612739bb91e]
      The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
      information that should help you find out what is causing the crash.
      Writing a core file...
      Working directory at /var/lib/mysql
      Resource Limits:
      Limit                     Soft Limit           Hard Limit           Units
      Max cpu time              unlimited            unlimited            seconds
      Max file size             unlimited            unlimited            bytes
      Max data size             unlimited            unlimited            bytes
      Max stack size            8388608              unlimited            bytes
      Max core file size        0                    unlimited            bytes
      Max resident set          unlimited            unlimited            bytes
      Max processes             7803                 7803                 processes
      Max open files            32768                32768                files
      Max locked memory         65536                65536                bytes
      Max address space         unlimited            unlimited            bytes
      Max file locks            unlimited            unlimited            locks
      Max pending signals       7803                 7803                 signals
      Max msgqueue size         819200               819200               bytes
      Max nice priority         0                    0
      Max realtime priority     0                    0
      Max realtime timeout      unlimited            unlimited            us
      Core pattern: |/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E

      Kernel version: Linux version 5.4.0-125-generic (buildd@lcy02-amd64-083) (gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)) #141-Ubuntu SMP Wed Aug 10 13:42:03 UTC 2022}}

      Why can't I bootstrap from this node?
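
For reference, the saved state that the provider reports in the log ("Found saved state: ...:-1, safe_to_bootstrap: 1") can be inspected with a few lines of Python. This is only a minimal sketch: `parse_grastate` and `summarize` are hypothetical helper names, not part of MariaDB or Galera, and the sample file content is reconstructed from the values in the log, with the `version` line being an assumption.

```python
# Sample grastate.dat content. The uuid, seqno and safe_to_bootstrap values
# are the ones reported in the log above; the version line is an assumption.
SAMPLE_GRASTATE = """\
# GALERA saved state
version: 2.1
uuid:    a6d885cf-23bd-11ed-a7b5-fb2da02f3a3d
seqno:   -1
safe_to_bootstrap: 1
"""


def parse_grastate(text):
    """Return the 'key: value' pairs of a grastate.dat file as a dict of strings."""
    state = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that contain no colon
            state[key.strip()] = value.strip()
    return state


def summarize(state):
    """Describe the saved state (hypothetical helper, not part of Galera)."""
    notes = []
    if state.get("safe_to_bootstrap") == "1":
        notes.append("marked safe to bootstrap")
    else:
        notes.append("not marked safe to bootstrap")
    if state.get("seqno") == "-1":
        notes.append("seqno is -1, so no committed position was saved")
    else:
        notes.append("saved seqno is " + state.get("seqno", "missing"))
    return "; ".join(notes)


if __name__ == "__main__":
    print(summarize(parse_grastate(SAMPLE_GRASTATE)))
```

To check a real node, the text of its grastate.dat could be passed to `parse_grastate` instead of the sample; on this setup the file should be in the data directory that the log lists as base_dir, /var/lib/mysql/.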

          People

            Assignee: Unassigned
            Reporter: Miro (miro)