Details
Bug
Status: Open
Major
Resolution: Unresolved
10.6.11
None
None
Prod
Description
Galera Cluster does not mark a lagging node as non-primary; wsrep_local_state_comment still shows synced status. The entire cluster hangs with TOI.
We have a 3-node Galera cluster on the primary site. There is another 3-node Galera cluster on a DR site, with binlog replication running from node 1 (the master node) of the primary cluster to node 1 of the DR cluster. In wsrep_provider_options, node 1 has pc.weight set to 2, node 2 has it set to 1, and node 3 has it set to 0.
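For reference, the weighted-quorum rule documented for Galera can be sketched with the weights reported above. This only illustrates the arithmetic; the function name `has_quorum` and the node labels are made up for this sketch and are not part of any Galera API.

```python
def has_quorum(weights, partition, left_gracefully=()):
    """Return True if `partition` keeps Primary status under weighted quorum.

    weights: mapping of node name -> pc.weight in the last Primary Component
    partition: nodes that can still see each other
    left_gracefully: nodes known to have left the cluster cleanly
    """
    total = sum(weights.values())
    left = sum(weights[n] for n in left_gracefully)
    present = sum(weights[n] for n in partition)
    # Documented rule: (previous total weight - weight that left) / 2 < weight present
    return (total - left) / 2 < present


# Weights as described in this report (node labels are illustrative).
weights = {"node1": 2, "node2": 1, "node3": 0}

print(has_quorum(weights, {"node1"}))           # node1 alone: is 1.5 < 2?
print(has_quorum(weights, {"node2", "node3"}))  # nodes 2 and 3: is 1.5 < 1?
```

With these weights, node 1 alone would retain quorum, while nodes 2 and 3 together would not.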
We have observed that sometimes one of the nodes (even one with pc.weight = 1 or 0) lags behind the rest of the cluster: it shows a wsrep_last_committed value lower than the other two nodes and a high wsrep_local_recv_queue value, but it is still NOT marked as a non-Primary component. The other nodes wait on the lagging node, and all 3 nodes hang, with transactions waiting forever either on commit or on "acquiring total order isolation" (sometimes due to a TRUNCATE that is not the original offender). Surprisingly, wsrep_cluster_status is shown as Primary on all nodes, wsrep_cluster_size shows 3, wsrep_local_state_comment shows "synced" on all nodes, and all nodes report wsrep_ready=yes and wsrep_connected=yes. On the lagging node, wsrep_local_recv_queue is greater than 1 but wsrep_last_committed remains frozen. No errors are shown in the mysqld log. The issue is not resolved unless we bounce the problematic node and, in some cases, the entire cluster.
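The symptom described above (a frozen wsrep_last_committed together with a non-empty wsrep_local_recv_queue) could be checked mechanically from two status snapshots taken a few seconds apart. The following is a minimal sketch, not a supported tool: the function name `find_stalled_nodes`, the snapshot layout, and the sample numbers are all assumptions made for illustration. Only the two status variable names come from the report.

```python
def find_stalled_nodes(before, after):
    """Return the nodes whose commit position is frozen while writesets queue up.

    before/after: mapping of node name -> dict of wsrep status values,
    e.g. gathered from SHOW GLOBAL STATUS LIKE 'wsrep_%' on each node.
    """
    stalled = []
    for node, now in after.items():
        prev = before[node]
        frozen = int(now["wsrep_last_committed"]) == int(prev["wsrep_last_committed"])
        backlog = int(now["wsrep_local_recv_queue"]) > 0
        # A frozen position with an empty queue may simply mean an idle cluster,
        # so both conditions are required.
        if frozen and backlog:
            stalled.append(node)
    return sorted(stalled)


# Made-up sample values, only to show the shape of the input.
snap1 = {
    "node1": {"wsrep_last_committed": "1000", "wsrep_local_recv_queue": "0"},
    "node2": {"wsrep_last_committed": "940", "wsrep_local_recv_queue": "55"},
}
snap2 = {
    "node1": {"wsrep_last_committed": "1000", "wsrep_local_recv_queue": "0"},
    "node2": {"wsrep_last_committed": "940", "wsrep_local_recv_queue": "60"},
}
print(find_stalled_nodes(snap1, snap2))
```

In this sample only node2 is reported, because node1 has a frozen position but nothing waiting in its receive queue.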
Also, DML replication across cluster nodes (especially deletes and updates) is very slow: a delete of 10k rows takes 2 minutes and an update takes 4 minutes to sync across all the nodes. We tried higher values for evs.send_window and wsrep_slave_threads, but there was no change in performance.
All the servers involved have 4 CPUs and 32 GB RAM. RTT between nodes is under 0.3 ms:
rtt min/avg/max/mdev = 0.231/0.262/0.275/0.027 ms
rtt min/avg/max/mdev = 0.198/0.250/0.291/0.043 ms
rtt min/avg/max/mdev = 0.206/0.233/0.250/0.019 ms
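To put the reported timings next to the measured latency, the per-row cost can be worked out as below. This is back-of-the-envelope arithmetic only. It assumes the update also touched 10k rows (the report gives a row count only for the delete) and uses the average of the first ping summary. `per_row_cost` is a name invented for this sketch.

```python
def per_row_cost(rows, seconds, rtt_ms):
    """Return (milliseconds per row, equivalent number of round trips per row)."""
    ms_per_row = seconds * 1000 / rows
    return ms_per_row, ms_per_row / rtt_ms


AVG_RTT_MS = 0.262  # average from the first ping summary above

# Assumption: the update affected 10k rows, like the delete.
delete_ms, delete_rtts = per_row_cost(10_000, 2 * 60, AVG_RTT_MS)
update_ms, update_rtts = per_row_cost(10_000, 4 * 60, AVG_RTT_MS)

print(f"delete: {delete_ms:.1f} ms/row, about {delete_rtts:.0f} round trips")
print(f"update: {update_ms:.1f} ms/row, about {update_rtts:.0f} round trips")
```

Under these assumptions the delete costs about 12 ms per row and the update about 24 ms per row, which is tens of network round trips per row, so the network latency alone does not appear to account for the delay.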
my.cnf values for the 1st node (node1):

# this is only for the mysqld standalone daemon
[mysqld]
datadir=/data/mariadata/mysql
socket=/data/mariadata/mysql/mysql.sock
log_error=/data/mariadata/log/mysqld.log
lower_case_table_names = 1
log_bin_trust_function_creators = ON
max_connections=1000

binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=0
innodb_buffer_pool_size=24G

#replication
server_id=1
gtid_domain_id =21
log_bin=/data/mariadata/binlogs/mariadb-bin
relay_log=/data/mariadata/relaylogs/relay-bin
log_slave_updates=ON
expire_logs_days = 7

#
# * Galera-related settings
#
[galera]
# Mandatory settings

#Galera provider Configuration
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
# Optional settings
wsrep_slave_threads=4
wsrep_provider_options="gcache.size=500M;gcache.page_size=500M;pc.weight=2"

#Galera cluster configuration
wsrep_cluster_name="galera-dev"
wsrep_cluster_address="gcomm://ip1, ip2, ip3"

#Galera Node Configuration
wsrep_node_name="hostname1"
wsrep_node_address="ip1"

#
# Allow server to accept connections on all interfaces.
#
bind-address=0.0.0.0
#
#Galera sst configuration
wsrep_sst_method=rsync

#replication
wsrep_gtid_mode=ON
wsrep_gtid_domain_id=5

# this is only for embedded server
[embedded]
Attaching the session logs taken from Galera nodes and the mysqld log files.