Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
5.5.35-galera
-
None
-
CentOS 6.4 x86_64
Description
MariaDB will not shut down after upgrading to MariaDB Galera Cluster 5.5.35.
Logs with wsrep_debug = 1:
140216 0:25:02 [Note] /usr/sbin/mysqld: Normal shutdown
140216 0:25:02 [Note] WSREP: Stop replication
140216 0:25:02 [Note] WSREP: Provider disconnect
140216 0:25:02 [Note] WSREP: Closing send monitor...
140216 0:25:02 [Note] WSREP: Closed send monitor.
140216 0:25:02 [Note] WSREP: gcomm: terminating thread
140216 0:25:02 [Note] WSREP: gcomm: joining thread
140216 0:25:02 [Note] WSREP: gcomm: closing backend
140216 0:25:02 [Note] WSREP: view(view_id(NON_PRIM,0aaf4462-96a0-11e3-a1d0-a3367b5dbc93,178) memb {
ae5e2351-96a0-11e3-8c77-ce4c363e1cf4,0
} joined {
} left {
} partitioned {
0aaf4462-96a0-11e3-a1d0-a3367b5dbc93,0
458976e6-96a0-11e3-9a4d-d6f1862200ab,0
})
140216 0:25:02 [Note] WSREP: view((empty))
140216 0:25:02 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
140216 0:25:02 [Note] WSREP: gcomm: closed
140216 0:25:02 [Note] WSREP: Flow-control interval: [512, 512]
140216 0:25:02 [Note] WSREP: Received NON-PRIMARY.
140216 0:25:02 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 45947383)
140216 0:25:02 [Note] WSREP: Received self-leave message.
140216 0:25:02 [Note] WSREP: Flow-control interval: [512, 512]
140216 0:25:02 [Note] WSREP: Received SELF-LEAVE. Closing connection.
140216 0:25:02 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 45947383)
140216 0:25:02 [Note] WSREP: New cluster view: global state: 64e060fb-10bd-11e3-0800-8ac8783f0ec6:45947383, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 2
140216 0:25:02 [Note] WSREP: RECV thread exiting 0: Success
140216 0:25:02 [Note] WSREP: Setting wsrep_ready to 0
140216 0:25:02 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
140216 0:25:02 [Note] WSREP: New cluster view: global state: 64e060fb-10bd-11e3-0800-8ac8783f0ec6:45947383, view# -1: non-Primary, number of nodes: 0, my index: -1, protocol version 2
140216 0:25:02 [Note] WSREP: Setting wsrep_ready to 0
140216 0:25:02 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
140216 0:25:02 [Note] WSREP: applier thread exiting (code:0)
140216 0:25:02 [Note] WSREP: closing applier 2
140216 0:25:02 [Note] WSREP: wsrep running threads now: 4
140216 0:25:02 [Note] WSREP: recv_thread() joined.
140216 0:25:02 [Note] WSREP: Closing replication queue.
140216 0:25:02 [Note] WSREP: Closing slave action queue.
140216 0:25:02 [Note] WSREP: applier thread exiting (code:6)
140216 0:25:02 [Note] WSREP: closing applier 22
140216 0:25:02 [Note] WSREP: wsrep running threads now: 3
140216 0:25:02 [Note] WSREP: applier thread exiting (code:6)
140216 0:25:02 [Note] WSREP: closing applier 20
140216 0:25:02 [Note] WSREP: wsrep running threads now: 2
140216 0:25:02 [Note] WSREP: applier thread exiting (code:6)
140216 0:25:02 [Note] WSREP: closing applier 21
140216 0:25:02 [Note] WSREP: wsrep running threads now: 1
140216 0:25:03 [Note] WSREP: Before Lock_thread_count
140216 0:25:04 [Note] WSREP: waiting for client connections to close: 22
140216 0:25:04 [Note] WSREP: closing wsrep thread 22
140216 0:25:04 [Note] WSREP: closing wsrep thread 21
140216 0:25:04 [Note] WSREP: closing wsrep thread 20
140216 0:25:04 [Note] WSREP: closing wsrep thread 2
140216 0:25:04 [Note] WSREP: closing wsrep thread 1
140216 0:25:04 [Note] WSREP: WSREP rollback thread wakes for signal
140216 0:25:04 [Note] WSREP: WSREP rollback thread has empty abort queue
140216 0:25:04 [Note] WSREP: rollbacker thread exiting
140216 0:25:04 [Note] WSREP: wsrep running threads now: 0
140216 0:25:04 [Note] Event Scheduler: Purging the queue. 0 events
140216 0:25:06 [Note] closing wsrep system thread
140216 0:25:06 [Note] closing wsrep system thread
140216 0:25:06 [Note] closing wsrep system thread
140216 0:25:06 [Note] closing wsrep system thread
140216 0:25:06 [Note] closing wsrep system thread
The shutdown appears to hang at this point: nothing further is logged until I kill the process with kill -9. The cluster was upgraded from 5.5.34 with socket.checksum=1 set, and I disabled that option afterwards.
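To make the stall point easier to spot in a long error log, the shutdown sequence can be summarized with a small script. The following is only a sketch in Python, assuming the log line layout shown above (date, time, level in brackets, message); the function name summarize_shutdown is made up for illustration and is not part of MariaDB or Galera.

```python
import re

# Matches timestamped lines such as:
#   140216 0:25:02 [Note] WSREP: wsrep running threads now: 4
LINE_RE = re.compile(r"^(\d{6})\s+(\d{1,2}:\d{2}:\d{2})\s+\[(\w+)\]\s+(.*)$")
THREADS_RE = re.compile(r"wsrep running threads now: (\d+)")


def summarize_shutdown(lines):
    """Return (thread_counts, last_timestamp, last_message) for a shutdown log.

    thread_counts is the sequence of values reported by
    "wsrep running threads now: N"; the last timestamp and message
    show where logging stopped.
    """
    thread_counts = []
    last_timestamp = None
    last_message = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            # Continuation lines (e.g. view member lists) carry no timestamp.
            continue
        date, time, _level, message = match.groups()
        last_timestamp = f"{date} {time}"
        last_message = message
        found = THREADS_RE.search(message)
        if found:
            thread_counts.append(int(found.group(1)))
    return thread_counts, last_timestamp, last_message


if __name__ == "__main__":
    sample = [
        "140216 0:25:02 [Note] WSREP: wsrep running threads now: 1",
        "140216 0:25:04 [Note] WSREP: rollbacker thread exiting",
        "140216 0:25:04 [Note] WSREP: wsrep running threads now: 0",
        "140216 0:25:06 [Note] closing wsrep system thread",
    ]
    counts, stamp, message = summarize_shutdown(sample)
    print(counts, stamp, message)
```

Run against the log above, this would show the wsrep thread count reaching 0 at 0:25:04 and the last logged message being "closing wsrep system thread" at 0:25:06, i.e. the hang occurs after the wsrep threads have already exited.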
my.cnf:
[mysqld]
collation-server = utf8_unicode_ci
init-connect = 'SET NAMES utf8'
character-set-server = utf8
wsrep_cluster_address = 'gcomm://XXXX001,XXXX002,XXXX003'
wsrep_node_address = XXXX003
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_sst_method = rsync
wsrep_sst_receive_address = XXXX003
wsrep_slave_threads = 4
wsrep_log_conflicts = 1
wsrep_retry_autocommit = 3
wsrep_provider_options = gcs.fc_limit=512; gcs.fc_master_slave=YES; gcs.fc_factor=1.0; gcache.size=5G;
datadir = /var/lib/mysql
default-storage-engine = InnoDB
user = mysql
max_allowed_packet = 16M
max_connect_errors = 1000000
transaction-isolation = REPEATABLE-READ
innodb_max_dirty_pages_pct = 30
innodb_file_per_table = 1
innodb_flush_method = O_DIRECT
innodb_flush_log_at_trx_commit = 2
innodb_locks_unsafe_for_binlog = 1
innodb_autoinc_lock_mode = 2
innodb_print_all_deadlocks = 1
innodb_buffer_pool_instances = 4
innodb_buffer_pool_size = 11G
innodb_buffer_pool_populate = 1
innodb_file_format = Barracuda
innodb_thread_concurrency = 0
innodb_log_file_size = 64M
innodb_io_capacity = 300
innodb_read_io_threads = 32
innodb_write_io_threads = 32
innodb_flush_neighbor_pages = area
innodb_open_files = 600
thread_handling = pool-of-threads
thread_pool_size = 8
thread_pool_stall_limit = 500
thread_pool_max_threads = 500
thread_pool_idle_timeout = 60
extra_port = 63306
extra_max_connections = 5
tmpdir = /tmp
tmp_table_size = 32M
max_heap_table_size = 32M
query_cache_type = 0
query_cache_size = 0
max_connections = 300
thread_cache_size = 50
open_files_limit = 65535
table_definition_cache = 4096
table_open_cache = 16384
binlog_format = ROW
max_binlog_size = 100M
expire_logs_days = 1
log-bin = /var/log/mysql/mariadb-bin.log
slow_query_log = 1
long_query_time = 2
slow_query_log_file = /var/log/mysql/mariadb-slow-queries.log
general_log = 0
log-error = /var/log/mysql/mariadb-error.log
log_slow_verbosity = Query_plan
plugin-load = handlersocket.so
loose_handlersocket_port = 9998
loose_handlersocket_port_wr = 9999
loose_handlersocket_threads = 16
loose_handlersocket_threads_wr = 1
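Since socket.checksum was set during the upgrade and removed afterwards, it may be worth confirming that it is really absent from the configured provider options. A minimal Python sketch follows, assuming the semicolon-separated key=value layout of the wsrep_provider_options line above; parse_provider_options is a hypothetical helper, not a Galera API, and it only inspects the string from my.cnf, not the options the running server actually uses (those can be read with SHOW VARIABLES LIKE 'wsrep_provider_options').

```python
def parse_provider_options(value):
    """Split a wsrep_provider_options string into a dict of option -> value."""
    options = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing ';' as in the config above
        key, sep, val = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        options[key.strip()] = val.strip()
    return options


if __name__ == "__main__":
    configured = (
        "gcs.fc_limit=512; gcs.fc_master_slave=YES; "
        "gcs.fc_factor=1.0; gcache.size=5G;"
    )
    opts = parse_provider_options(configured)
    print(opts)
    print("socket.checksum set in my.cnf:", "socket.checksum" in opts)
```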