[MDEV-10480] Unable to start Galera cluster Created: 2016-08-02  Updated: 2016-08-05  Resolved: 2016-08-05

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.1.16
Fix Version/s: 10.1.17

Type: Bug Priority: Major
Reporter: Artur Čuvašov Assignee: Nirbhay Choubey (Inactive)
Resolution: Duplicate Votes: 0
Labels: galera
Environment:

Ubuntu 16.04


Issue Links:
Duplicate
is duplicated by MDEV-10396 MariaDB does not restart after upgrad... Closed

 Description   

One of cluster nodes crashed and was unable to join the group with the following message:

2016-08-01 19:50:19 140521508780288 [Note] WSREP: Read nil XID from storage engines, skipping position init
2016-08-01 19:50:19 140521508780288 [Note] WSREP: wsrep_load(): loading provider library 'none'
2016-08-01 19:50:19 140521508780288 [Note] Using unique option prefix 'myisam-recover' is error-prone and can break in the future. Please use the full name 'myisam-recover-options' instead.
2016-08-01 19:50:19 140521508780288 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2016-08-01 19:50:19 140521508780288 [Note] InnoDB: The InnoDB memory heap is disabled
2016-08-01 19:50:19 140521508780288 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-08-01 19:50:19 140521508780288 [Note] InnoDB: Memory barrier is not used
2016-08-01 19:50:19 140521508780288 [Note] InnoDB: Compressed tables use zlib 1.2.8
2016-08-01 19:50:19 140521508780288 [Note] InnoDB: Using Linux native AIO
2016-08-01 19:50:19 140521508780288 [Note] InnoDB: Using SSE crc32 instructions
2016-08-01 19:50:19 140521508780288 [Note] InnoDB: Initializing buffer pool, size = 54.0G
2016-08-01 19:50:24 140521508780288 [Note] InnoDB: Completed initialization of buffer pool
2016-08-01 19:50:26 140521508780288 [Note] InnoDB: Highest supported file format is Barracuda.
2016-08-01 19:50:27 140521508780288 [Note] InnoDB: 128 rollback segment(s) are active.
2016-08-01 19:50:27 140521508780288 [Note] InnoDB: Waiting for purge to start
2016-08-01 19:50:28 140521508780288 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.30-76.3 started; log sequence number 5155418059026
2016-08-01 19:50:28 140457849882368 [Note] InnoDB: Dumping buffer pool(s) not yet started
2016-08-01 19:50:28 140521508780288 [Note] Plugin 'FEEDBACK' is disabled.
2016-08-01 19:50:32 140111175645440 [Note] Using unique option prefix 'myisam-recover' is error-prone and can break in the future. Please use the full name 'myisam-recover-options' instead.
2016-08-01 19:50:32 140111175645440 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2016-08-01 19:50:32 140111175645440 [Note] InnoDB: The InnoDB memory heap is disabled
2016-08-01 19:50:32 140111175645440 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-08-01 19:50:32 140111175645440 [Note] InnoDB: Memory barrier is not used
2016-08-01 19:50:32 140111175645440 [Note] InnoDB: Compressed tables use zlib 1.2.8
2016-08-01 19:50:32 140111175645440 [Note] InnoDB: Using Linux native AIO
2016-08-01 19:50:32 140111175645440 [Note] InnoDB: Using SSE crc32 instructions
2016-08-01 19:50:32 140111175645440 [Note] InnoDB: Initializing buffer pool, size = 54.0G
2016-08-01 19:50:37 140111175645440 [Note] InnoDB: Completed initialization of buffer pool
2016-08-01 19:50:39 140111175645440 [Note] InnoDB: Highest supported file format is Barracuda.
2016-08-01 19:50:40 140111175645440 [Note] InnoDB: 128 rollback segment(s) are active.
2016-08-01 19:50:40 140111175645440 [Note] InnoDB: Waiting for purge to start
2016-08-01 19:50:40 140111175645440 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.30-76.3 started; log sequence number 5155418059036
2016-08-01 19:50:40 140111175645440 [Warning] InnoDB: Skipping buffer pool dump/restore during wsrep recovery.
2016-08-01 19:50:40 140111175645440 [Note] Plugin 'FEEDBACK' is disabled.
2016-08-01 19:50:40 140111175645440 [Note] Recovering after a crash using /var/lib/mysql/mysql-bin
2016-08-01 19:50:40 140111175645440 [ERROR] Binlog file '/var/lib/mysql/mysql-bin.000025' not found in binlog index, needed for recovery. Aborting.
2016-08-01 19:50:40 140111175645440 [ERROR] Crash recovery failed. Either correct the problem (if it's, for example, out of memory error) and restart, or delete (or rename) binary log and start mysqld with --tc-heuristic-recover={commit$
2016-08-01 19:50:40 140111175645440 [ERROR] Can't init tc log
2016-08-01 19:50:40 140111175645440 [ERROR] Aborting

Removing /var/lib/mysql for a full resync did not do a trick. I also noticed that db2 (the donor) had mysql-bin logs numbered 0025-0038, however during the SST only the last log file has been transferred.

Next step was to reinitiate the cluster, however a working node db2 did not start up again after a restart:

Aug  1 21:18:24 db2 systemd[1]: Starting MariaDB database server...
Aug  1 21:18:32 db2 mysqld[17595]: WSREP: Failed to recover position: '2016-08-01 21:18:25 139645985028352 [Note] /usr/sbin/mysqld (mysqld 10.1.16-MariaDB-1~xenial) starting as process 17759 ...'
Aug  1 21:18:32 db2 systemd[1]: mariadb.service: Control process exited, code=exited status=1
Aug  1 21:18:32 db2 systemd[1]: Failed to start MariaDB database server.
Aug  1 21:18:32 db2 systemd[1]: mariadb.service: Unit entered failed state.
Aug  1 21:18:32 db2 systemd[1]: mariadb.service: Failed with result 'exit-code'.

Same message is shown if I bootstrap db1, even with backups taken 2 months ago (using xtrabackup).

When I disable wsrep_on=off, the database starts up correctly.

MariaDB 10.1.16
xtrabackup 2.3.5
galera-3 25.3.15



 Comments   
Comment by Artur Čuvašov [ 2016-08-03 ]

I tried to initiate a clean database with mysql_install_db. The server started up just fine with wsrep=off, however when I turn it "ON" I get the following in mysql-error.log:

2016-08-03 13:25:16 140330179614976 [Note] Using unique option prefix 'myisam-recover' is error-prone and can break in the future. Please use the full name 'myisam-recover-options' instead.
2016-08-03 13:25:17 140330179614976 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2016-08-03 13:25:17 140330179614976 [Note] InnoDB: The InnoDB memory heap is disabled
2016-08-03 13:25:17 140330179614976 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-08-03 13:25:17 140330179614976 [Note] InnoDB: Memory barrier is not used
2016-08-03 13:25:17 140330179614976 [Note] InnoDB: Compressed tables use zlib 1.2.8
2016-08-03 13:25:17 140330179614976 [Note] InnoDB: Using Linux native AIO
2016-08-03 13:25:17 140330179614976 [Note] InnoDB: Using SSE crc32 instructions
2016-08-03 13:25:17 140330179614976 [Note] InnoDB: Initializing buffer pool, size = 54.0G
2016-08-03 13:25:17 140330179614976 [Note] InnoDB: Completed initialization of buffer pool
2016-08-03 13:25:17 140330179614976 [Note] InnoDB: Highest supported file format is Barracuda.
2016-08-03 13:25:18 140330179614976 [Note] InnoDB: 128 rollback segment(s) are active.
2016-08-03 13:25:18 140330179614976 [Note] InnoDB: Waiting for purge to start
2016-08-03 13:25:18 140330179614976 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.30-76.3 started; log sequence number 1617053
2016-08-03 13:25:18 140330179614976 [Warning] InnoDB: Skipping buffer pool dump/restore during wsrep recovery.
2016-08-03 13:25:18 140330179614976 [Note] Plugin 'FEEDBACK' is disabled.
2016-08-03 13:25:18 140330179614976 [Note] Server socket created on IP: '0.0.0.0'.
2016-08-03 13:25:18 140330179614976 [Note] WSREP: Recovered position: 00000000-0000-0000-0000-000000000000:-1

and this in syslog:

Aug  3 13:25:16 db1 systemd[1]: Starting MariaDB database server...
Aug  3 13:25:20 db1 mysqld[571]: WSREP: Failed to recover position: '2016-08-03 13:25:16 140330179614976 [Note] /usr/sbin/mysqld (mysqld 10.1.16-MariaDB-1~xenial) starting as process 755 ...'
Aug  3 13:25:20 db1 systemd[1]: mariadb.service: Control process exited, code=exited status=1
Aug  3 13:25:20 db1 systemd[1]: Failed to start MariaDB database server.
Aug  3 13:25:20 db1 systemd[1]: mariadb.service: Unit entered failed state.
Aug  3 13:25:20 db1 systemd[1]: mariadb.service: Failed with result 'exit-code'.

my.cnf:

[mysql]
 
# CLIENT #
#port                           = 3306
socket                         = /var/lib/mysql/mysql.sock
 
[galera]
 
# GALERA #
wsrep_on=ON
 
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
wsrep_log_conflicts=1
 
## Galera Provider Configuration
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_provider_options="cert.log_conflicts=1; gcache.size=32G"
 
## Galera Cluster Configuration
wsrep_cluster_name="galera"
wsrep_cluster_address="gcomm://"
#wsrep_cluster_address="gcomm://1.2.3.4,1.2.3.5"
#wsrep_cluster_address="gcomm://db1:3307,db2:3307"
 
## Galera Synchronization Congifuration
wsrep_sst_method=rsync
#wsrep_sst_auth=user:pass
 
## Galera Node Configuration
wsrep_node_address="1.2.3.4"
wsrep_node_name="db1"
 
[mysqld]
 
port                            = 3306
 
# GENERAL #
user                           = mysql
default-storage-engine         = InnoDB
socket                         = /var/lib/mysql/mysql.sock
pid-file                       = /var/lib/mysql/mysql.pid
 
# MyISAM #
key-buffer-size                = 32M
myisam-recover                 = FORCE,BACKUP
 
# SAFETY #
max-allowed-packet             = 16M
max-connect-errors             = 1000000
skip-name-resolve
#STRICT_TRANS_TABLES
sql-mode                       = ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZERO,NO_ENGINE_SUBSTITUTION,NO_ZERO_DATE,NO_ZERO_IN_DATE,ONLY_FULL_GROUP_BY
sysdate-is-now                 = 1
innodb                         = FORCE
innodb-strict-mode             = 1
 
# DATA STORAGE #
datadir                        = /var/lib/mysql/
 
# BINARY LOGGING #
log-bin                        = /var/lib/mysql/mysql-bin
expire-logs-days               = 14
sync-binlog                    = 1
 
# CACHES AND LIMITS #
tmp-table-size                 = 32M
max-heap-table-size            = 32M
query-cache-type               = 0
query-cache-size               = 0
max-connections                = 500
thread-cache-size              = 50
open-files-limit               = 65535
table-definition-cache         = 4096
table-open-cache               = 4096
 
# INNODB #
innodb-flush-method            = O_DIRECT
innodb-log-files-in-group      = 2
innodb-log-file-size           = 512M
innodb-flush-log-at-trx-commit = 0
innodb-file-per-table          = 1
innodb-buffer-pool-size        = 54G
 
# LOGGING #
log-error                      = /var/log/mysql/mysql-error.log
log-queries-not-using-indexes  = 1
slow-query-log                 = 1
slow-query-log-file            = /var/log/mysql/mysql-slow.log
general_log                    = 1
general_log_file               = /var/log/mysql/mysql-general.log

Generated at Thu Feb 08 07:42:32 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.