MariaDB Server / MDEV-18955

WSREP: Failed to start mysqld for wsrep recovery: '2019-03-18 9:13:28 140653183248640 [Note] /usr/sbin/mysqld (mysqld 10.1.32-MariaDB) starting as process 23374

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Incomplete
    • Affects Version/s: 10.1.32
    • Fix Version/s: N/A
    • Component/s: Galera
    • Environment: MariaDB Galera 5-node cluster

    Description

      We have a 5-node Galera cluster. One node does not come up after a server reboot.

      [root@AZABNL-ID03 my.cnf.d]# galera_recovery
      WSREP: Failed to start mysqld for wsrep recovery: '2019-03-18 9:13:28 140653183248640 [Note] /usr/sbin/mysqld (mysqld 10.1.32-MariaDB) starting as process 23374 ...
      2019-03-18 9:13:28 140653183248640 [Note] Loaded 'file_key_management.so' with offset 0x7fec4f9fb000
      2019-03-18 9:13:28 140653183248640 [Note] InnoDB: Using mutexes to ref count buffer pool pages
      2019-03-18 9:13:28 140653183248640 [Note] InnoDB: The InnoDB memory heap is disabled
      2019-03-18 9:13:28 140653183248640 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      2019-03-18 9:13:28 140653183248640 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
      2019-03-18 9:13:28 140653183248640 [Note] InnoDB: Compressed tables use zlib 1.2.7
      2019-03-18 9:13:28 140653183248640 [Note] InnoDB: Using Linux native AIO
      2019-03-18 9:13:28 140653183248640 [Note] InnoDB: Using SSE crc32 instructions
      2019-03-18 9:13:28 140653183248640 [Note] InnoDB: Initializing buffer pool, size = 3.0G
      2019-03-18 9:13:28 140653183248640 [Note] InnoDB: Completed initialization of buffer pool
      2019-03-18 9:13:29 140653183248640 [Note] InnoDB: Highest supported file format is Barracuda.
      2019-03-18 9:13:29 140653183248640 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1796573045811
      2019-03-18 9:13:31 140653183248640 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
      2019-03-18 9:13:31 140653183248640 [Note] InnoDB: Starting final batch to recover 40 pages from redo log
      2019-03-18 9:13:31 140653183248640 [ERROR] InnoDB: Trying to access page number 5 in space 229934 space name DB_POLYMERS_P/honeypot_user, which is outside the tablespace bounds. Byte offset 0, len 16384 i/o type 10.
      2019-03-18 09:13:31 7fec5f062900 InnoDB: Assertion failure in thread 140653183248640 in file ha_innodb.cc line 22028
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mysqld startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
      InnoDB: about forcing recovery.
      190318 9:13:31 [ERROR] mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.

      To report this bug, see https://mariadb.com/kb/en/reporting-bugs

      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.

      Server version: 10.1.32-MariaDB
      key_buffer_size=134217728
      read_buffer_size=131072
      max_used_connections=0
      max_threads=2002
      thread_count=0
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4529069 K bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.

      Thread pointer: 0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      2019-03-18 09:13:31 7feb73bfa700 InnoDB: Assertion failure in thread 140649235982080 in file rem0rec.cc line 581
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mysqld startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
      InnoDB: about forcing recovery.
      2019-03-18 09:13:31 7feb743fb700 InnoDB: Assertion failure in thread 140649244374784 in file rem0rec.cc line 581
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mysqld startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
      InnoDB: about forcing recovery.
      /bin/galera_recovery: line 71: 23374 Aborted /usr/sbin/mysqld --user=mysql --wsrep_recover --disable-log-error'
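
      The recovery output above contains one out-of-bounds page access followed by several assertion failures. As a rough aid for triage, the sketch below pulls those facts out of such a log. The function name summarize_crash, the SAMPLE constant, and the regular expressions are illustrative assumptions modeled on the messages in this report; other server versions may word the messages differently.

```python
import re

# Patterns modeled on the InnoDB messages quoted in this report (assumption:
# the wording is stable enough for a quick triage script).
OUT_OF_BOUNDS = re.compile(
    r"Trying to access page number (\d+) in space (\d+) "
    r"space name ([^,\s]+), which is outside the tablespace bounds"
)
ASSERTION = re.compile(
    r"Assertion failure in thread \d+ in file (\S+) line (\d+)"
)

# Three lines copied from the recovery output in this report.
SAMPLE = """\
2019-03-18 9:13:31 140653183248640 [ERROR] InnoDB: Trying to access page number 5 in space 229934 space name DB_POLYMERS_P/honeypot_user, which is outside the tablespace bounds. Byte offset 0, len 16384 i/o type 10.
2019-03-18 09:13:31 7fec5f062900 InnoDB: Assertion failure in thread 140653183248640 in file ha_innodb.cc line 22028
2019-03-18 09:13:31 7feb73bfa700 InnoDB: Assertion failure in thread 140649235982080 in file rem0rec.cc line 581
"""


def summarize_crash(log_text):
    """Collect out-of-bounds page accesses and assertion sites from log text."""
    out_of_bounds = [
        {"page": int(page), "space_id": int(space), "table": name}
        for page, space, name in OUT_OF_BOUNDS.findall(log_text)
    ]
    assertions = [(src, int(line)) for src, line in ASSERTION.findall(log_text)]
    return {"out_of_bounds": out_of_bounds, "assertions": assertions}


summary = summarize_crash(SAMPLE)
for hit in summary["out_of_bounds"]:
    print("out-of-bounds page", hit["page"], "in", hit["table"])
for src, line in summary["assertions"]:
    print("assertion at", src, "line", line)
```

      Run against the full output, this would point at the single table named in the [ERROR] line (DB_POLYMERS_P/honeypot_user), which is where any further investigation of the tablespace would presumably start.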

      error log =====================>

      2019-03-17 2:18:48 139653421607680 [Note] WSREP: New cluster view: global state: 53540047-107d-11e6-8b2a-9a31eea4d5df:457370708, view# 433: Primary, number of nodes: 5, my index: 1, protocol version 3
      2019-03-17 2:18:48 139653421607680 [Warning] WSREP: Gap in state sequence. Need state transfer.
      2019-03-17 2:18:48 139653088802560 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '10.134.18.4' --datadir '/mnt/data/' --parent '6668' '' '
      2019-03-17 2:18:48 139653421607680 [Note] WSREP: Prepared SST request: rsync|10.134.18.4:4444/rsync_sst
      2019-03-17 2:18:48 139653421607680 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      2019-03-17 2:18:48 139653421607680 [Note] WSREP: REPL Protocols: 8 (3, 2)
      2019-03-17 2:18:48 139653421607680 [Note] WSREP: Assign initial position for certification: 457370708, protocol version: 3
      2019-03-17 2:18:48 139653176485632 [Note] WSREP: Service thread queue flushed.
      2019-03-17 2:18:48 139653421607680 [Note] WSREP: IST receiver addr using tcp://10.134.18.4:4568
      2019-03-17 2:18:48 139653421607680 [Note] WSREP: Prepared IST receiver, listening at: tcp://10.134.18.4:4568
      2019-03-17 2:18:48 139653118158592 [Note] WSREP: Member 1.0 (azabnl-id03) requested state transfer from 'any'. Selected 0.0 (azabir-id01)(SYNCED) as donor.
      2019-03-17 2:18:48 139653118158592 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 457370718)
      2019-03-17 2:18:48 139653421607680 [Note] WSREP: Requesting state transfer: success, donor: 0
      2019-03-17 2:18:48 139653421607680 [Note] WSREP: GCache history reset: 53540047-107d-11e6-8b2a-9a31eea4d5df:0 -> 53540047-107d-11e6-8b2a-9a31eea4d5df:457370708
      2019-03-17 2:18:49 139653126551296 [Note] WSREP: (9c263f05, 'tcp://0.0.0.0:4567') turning message relay requesting off
      Terminated
      WSREP_SST: [INFO] Joiner cleanup. rsync PID: 6711 (20190317 02:20:14.190)
      WSREP_SST: [INFO] Joiner cleanup done. (20190317 02:20:14.697)
      2019-03-17 2:20:14 139653088802560 [ERROR] WSREP: Process completed with error: wsrep_sst_rsync --role 'joiner' --address '10.134.18.4' --datadir '/mnt/data/' --parent '6668' '' : 3 (No such process)
      2019-03-17 2:20:14 139653088802560 [ERROR] WSREP: Failed to read uuid:seqno and wsrep_gtid_domain_id from joiner script.
      2019-03-17 2:20:14 139653422983424 [ERROR] WSREP: SST failed: 3 (No such process)
      2019-03-17 2:20:14 139653422983424 [ERROR] Aborting

      2019-03-17 2:20:15 139653118158592 [Warning] WSREP: 0.0 (azabir-id01): State transfer to 1.0 (azabnl-id03) failed: -255 (Unknown error 255)
      2019-03-17 2:20:15 139653118158592 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():737: Will never receive state. Need to abort.
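
      The error log above records the cluster state (uuid:seqno) the node saw just before the failed SST. A minimal sketch for extracting that position from a "New cluster view" line follows. The function name parse_cluster_view, the VIEW_LINE constant, and the regular expression are illustrative assumptions based only on the line format shown in this report.

```python
import re

# Modeled on the "New cluster view" line in this report (assumption: the
# field order and wording match other lines of the same kind).
CLUSTER_VIEW = re.compile(
    r"New cluster view: global state: ([0-9a-fA-F-]+):(-?\d+), "
    r"view# (-?\d+): ([\w-]+), number of nodes: (\d+), "
    r"my index: (-?\d+), protocol version (-?\d+)"
)

# Line copied from the error log in this report.
VIEW_LINE = (
    "2019-03-17 2:18:48 139653421607680 [Note] WSREP: New cluster view: "
    "global state: 53540047-107d-11e6-8b2a-9a31eea4d5df:457370708, "
    "view# 433: Primary, number of nodes: 5, my index: 1, protocol version 3"
)


def parse_cluster_view(line):
    """Return the fields of a cluster view line, or None if it does not match."""
    match = CLUSTER_VIEW.search(line)
    if match is None:
        return None
    uuid, seqno, view, status, nodes, index, proto = match.groups()
    return {
        "uuid": uuid,
        "seqno": int(seqno),
        "view": int(view),
        "status": status,
        "nodes": int(nodes),
        "my_index": int(index),
        "protocol": int(proto),
    }


view = parse_cluster_view(VIEW_LINE)
print(view["uuid"], view["seqno"], view["status"], view["nodes"])
```

      Comparing the seqno from this line with the position later reported by galera_recovery on the failed node is one way to judge how far behind the node is, though that comparison is not part of this report.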

          People

            Assignee: jplindst Jan Lindström (Inactive)
            Reporter: jeetupatil Jeetendra Patil