[MDEV-18955] WSREP: Failed to start mysqld for wsrep recovery: '2019-03-18 9:13:28 140653183248640 [Note] /usr/sbin/mysqld (mysqld 10.1.32-MariaDB) starting as process 23374

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Incomplete
Affects Version/s: 10.1.32
Fix Version/s: N/A
Component/s: Galera
Labels:
- galera
- need_feedback
Environment:
MariaDB Galera 5-node cluster

Description

We have a 5-node Galera cluster. One node does not come up after a server reboot.

[root@AZABNL-ID03 my.cnf.d]# galera_recovery
WSREP: Failed to start mysqld for wsrep recovery: '2019-03-18 9:13:28 140653183248640 [Note] /usr/sbin/mysqld (mysqld 10.1.32-MariaDB) starting as process 23374 ...
2019-03-18 9:13:28 140653183248640 [Note] Loaded 'file_key_management.so' with offset 0x7fec4f9fb000
2019-03-18 9:13:28 140653183248640 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2019-03-18 9:13:28 140653183248640 [Note] InnoDB: The InnoDB memory heap is disabled
2019-03-18 9:13:28 140653183248640 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2019-03-18 9:13:28 140653183248640 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2019-03-18 9:13:28 140653183248640 [Note] InnoDB: Compressed tables use zlib 1.2.7
2019-03-18 9:13:28 140653183248640 [Note] InnoDB: Using Linux native AIO
2019-03-18 9:13:28 140653183248640 [Note] InnoDB: Using SSE crc32 instructions
2019-03-18 9:13:28 140653183248640 [Note] InnoDB: Initializing buffer pool, size = 3.0G
2019-03-18 9:13:28 140653183248640 [Note] InnoDB: Completed initialization of buffer pool
2019-03-18 9:13:29 140653183248640 [Note] InnoDB: Highest supported file format is Barracuda.
2019-03-18 9:13:29 140653183248640 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1796573045811
2019-03-18 9:13:31 140653183248640 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
2019-03-18 9:13:31 140653183248640 [Note] InnoDB: Starting final batch to recover 40 pages from redo log
2019-03-18 9:13:31 140653183248640 [ERROR] InnoDB: Trying to access page number 5 in space 229934 space name DB_POLYMERS_P/honeypot_user, which is outside the tablespace bounds. Byte offset 0, len 16384 i/o type 10.
2019-03-18 09:13:31 7fec5f062900 InnoDB: Assertion failure in thread 140653183248640 in file ha_innodb.cc line 22028
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
190318 9:13:31 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.1.32-MariaDB
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=0
max_threads=2002
thread_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4529069 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
2019-03-18 09:13:31 7feb73bfa700 InnoDB: Assertion failure in thread 140649235982080 in file rem0rec.cc line 581
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
2019-03-18 09:13:31 7feb743fb700 InnoDB: Assertion failure in thread 140649244374784 in file rem0rec.cc line 581
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
/bin/galera_recovery: line 71: 23374 Aborted /usr/sbin/mysqld --user=mysql --wsrep_recover --disable-log-error'
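
For triage, the key lines of the recovery output above can be pulled out with a small script. This is only a minimal sketch and not part of the original report: the function name `summarize` and the regular expressions are assumptions based on the message formats quoted here, and other MariaDB versions may word these messages differently.

```python
import re

# Patterns are assumptions derived from the log lines quoted in this report.
ERROR_RE = re.compile(r"\[ERROR\]\s+(?P<msg>.*)")
ASSERT_RE = re.compile(
    r"Assertion failure in thread \d+ in file (?P<file>\S+) line (?P<line>\d+)"
)
PAGE_RE = re.compile(
    r"page number (?P<page>\d+) in space (?P<space>\d+) "
    r"space name (?P<name>[^,]+),"
)


def summarize(log_text):
    """Collect [ERROR] messages, assertion sites, and out-of-bounds page reads."""
    summary = {"errors": [], "assertions": [], "pages": []}
    for line in log_text.splitlines():
        err = ERROR_RE.search(line)
        if err:
            summary["errors"].append(err.group("msg"))
        asrt = ASSERT_RE.search(line)
        if asrt:
            summary["assertions"].append((asrt.group("file"), int(asrt.group("line"))))
        page = PAGE_RE.search(line)
        if page:
            summary["pages"].append({
                "page": int(page.group("page")),
                "space": int(page.group("space")),
                "name": page.group("name"),
            })
    return summary


# Sample lines copied from the galera_recovery output in this report.
sample = "\n".join([
    "2019-03-18 9:13:31 140653183248640 [ERROR] InnoDB: Trying to access page number 5 in space 229934 space name DB_POLYMERS_P/honeypot_user, which is outside the tablespace bounds. Byte offset 0, len 16384 i/o type 10.",
    "2019-03-18 09:13:31 7fec5f062900 InnoDB: Assertion failure in thread 140653183248640 in file ha_innodb.cc line 22028",
    "190318 9:13:31 [ERROR] mysqld got signal 6 ;",
    "2019-03-18 09:13:31 7feb73bfa700 InnoDB: Assertion failure in thread 140649235982080 in file rem0rec.cc line 581",
])

summary = summarize(sample)
for key, values in summary.items():
    print(key, values)
```

On the sample lines, this points at a single table (DB_POLYMERS_P/honeypot_user) and two assertion sites (ha_innodb.cc and rem0rec.cc), which is the part of the output most relevant to the failed recovery.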

Error log:

2019-03-17 2:18:48 139653421607680 [Note] WSREP: New cluster view: global state: 53540047-107d-11e6-8b2a-9a31eea4d5df:457370708, view# 433: Primary, number of nodes: 5, my index: 1, protocol version 3
2019-03-17 2:18:48 139653421607680 [Warning] WSREP: Gap in state sequence. Need state transfer.
2019-03-17 2:18:48 139653088802560 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '10.134.18.4' --datadir '/mnt/data/' --parent '6668' '' '
2019-03-17 2:18:48 139653421607680 [Note] WSREP: Prepared SST request: rsync|10.134.18.4:4444/rsync_sst
2019-03-17 2:18:48 139653421607680 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2019-03-17 2:18:48 139653421607680 [Note] WSREP: REPL Protocols: 8 (3, 2)
2019-03-17 2:18:48 139653421607680 [Note] WSREP: Assign initial position for certification: 457370708, protocol version: 3
2019-03-17 2:18:48 139653176485632 [Note] WSREP: Service thread queue flushed.
2019-03-17 2:18:48 139653421607680 [Note] WSREP: IST receiver addr using tcp://10.134.18.4:4568
2019-03-17 2:18:48 139653421607680 [Note] WSREP: Prepared IST receiver, listening at: tcp://10.134.18.4:4568
2019-03-17 2:18:48 139653118158592 [Note] WSREP: Member 1.0 (azabnl-id03) requested state transfer from 'any'. Selected 0.0 (azabir-id01)(SYNCED) as donor.
2019-03-17 2:18:48 139653118158592 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 457370718)
2019-03-17 2:18:48 139653421607680 [Note] WSREP: Requesting state transfer: success, donor: 0
2019-03-17 2:18:48 139653421607680 [Note] WSREP: GCache history reset: 53540047-107d-11e6-8b2a-9a31eea4d5df:0 -> 53540047-107d-11e6-8b2a-9a31eea4d5df:457370708
2019-03-17 2:18:49 139653126551296 [Note] WSREP: (9c263f05, 'tcp://0.0.0.0:4567') turning message relay requesting off
Terminated
WSREP_SST: [INFO] Joiner cleanup. rsync PID: 6711 (20190317 02:20:14.190)
WSREP_SST: [INFO] Joiner cleanup done. (20190317 02:20:14.697)
2019-03-17 2:20:14 139653088802560 [ERROR] WSREP: Process completed with error: wsrep_sst_rsync --role 'joiner' --address '10.134.18.4' --datadir '/mnt/data/' --parent '6668' '' : 3 (No such process)
2019-03-17 2:20:14 139653088802560 [ERROR] WSREP: Failed to read uuid:seqno and wsrep_gtid_domain_id from joiner script.
2019-03-17 2:20:14 139653422983424 [ERROR] WSREP: SST failed: 3 (No such process)
2019-03-17 2:20:14 139653422983424 [ERROR] Aborting

2019-03-17 2:20:15 139653118158592 [Warning] WSREP: 0.0 (azabir-id01): State transfer to 1.0 (azabnl-id03) failed: -255 (Unknown error 255)
2019-03-17 2:20:15 139653118158592 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():737: Will never receive state. Need to abort.
