[MDEV-32261] Galera Cluster does not mark lagging node as non-primary, wsrep_local_state_comment shows synced status. Entire cluster hangs with TOI. Created: 2023-09-27  Updated: 2023-09-27

Status: Open
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.6.11
Fix Version/s: None

Type: Bug Priority: Major
Reporter: PITTA NEELIMA Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Prod


Attachments: Zip Archive GaleraIssue.zip    

 Description   

Galera Cluster does not mark lagging node as non-primary, wsrep_local_state_comment shows synced status. Entire cluster hangs with TOI.

We have a 3-node Galera cluster at the primary site. There is another 3-node Galera cluster at a DR site, with binlog replication running from node 1 (the master node) of the primary cluster to node 1 of the DR cluster. In wsrep_provider_options, pc.weight is set to 2 on node 1, 1 on node 2, and 0 on node 3.
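To illustrate what these pc.weight values mean for quorum, here is a minimal Python sketch of the weighted-quorum rule as described in the Galera documentation: a component stays Primary if the sum of its members' weights is more than half of the previous Primary Component's weight, excluding nodes that left gracefully. The node names and the has_quorum helper are hypothetical and used only for illustration. This is not Galera's actual implementation.

```python
# Sketch of the weighted-quorum rule; node names are placeholders.
WEIGHTS = {"node1": 2, "node2": 1, "node3": 0}  # pc.weight values from this report

ALL_NODES = ("node1", "node2", "node3")


def has_quorum(component, previous_primary, left_gracefully=()):
    """Return True if `component` would remain the Primary Component.

    Rule: sum(component weights) > (sum(previous primary) - sum(graceful leavers)) / 2
    """
    current = sum(WEIGHTS[n] for n in component)
    previous = sum(WEIGHTS[n] for n in previous_primary)
    left = sum(WEIGHTS[n] for n in left_gracefully)
    # Multiply by 2 instead of dividing to stay in integer arithmetic.
    return 2 * current > previous - left


# Evaluate a few possible partitions of the 3-node cluster.
for part in (("node1",), ("node2", "node3"), ("node1", "node3"), ("node2",)):
    print(part, has_quorum(part, ALL_NODES))
```

With these weights, any partition containing node 1 keeps quorum and any partition without it loses quorum. The rule only considers cluster membership, so on its own it would not demote a node that is still connected but applying write-sets slowly.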

We have observed that sometimes one of the nodes (even one with pc.weight = 1 or 0) lags behind the rest of the cluster: it shows a wsrep_last_committed value lower than the other two nodes and a high wsrep_local_recv_queue value, but it is still NOT marked as non-Primary. The other nodes wait on the lagging node, and all 3 nodes hang, with transactions waiting forever either on commit or on "acquiring total order isolation" (sometimes triggered by a TRUNCATE that is not the original offender). Surprisingly, wsrep_cluster_status is shown as Primary on all nodes, wsrep_cluster_size shows 3, wsrep_local_state_comment shows "Synced" on all nodes, and all nodes report wsrep_ready=yes and wsrep_connected=yes. On the lagging node, wsrep_local_recv_queue stays above 1 while wsrep_last_committed remains frozen. No errors are shown in the mysqld log. The issue is not resolved unless we bounce the problematic node and, in some cases, the entire cluster.
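The symptom described above can be checked mechanically by comparing the status variables collected from each node (for example, from SHOW GLOBAL STATUS LIKE 'wsrep_%'). The following Python sketch shows one possible check. The find_lagging_nodes helper, the threshold, and the sample values are hypothetical and are not taken from the attached logs.

```python
def find_lagging_nodes(status_by_node, queue_threshold=1):
    """Flag nodes that report Synced but trail the cluster's highest seqno.

    status_by_node maps a node name to a dict of that node's wsrep status values.
    Returns a list of (node, seqnos_behind, recv_queue_length) tuples.
    """
    highest = max(int(s["wsrep_last_committed"]) for s in status_by_node.values())
    lagging = []
    for name, s in status_by_node.items():
        behind = highest - int(s["wsrep_last_committed"])
        queued = int(s["wsrep_local_recv_queue"])
        synced = s["wsrep_local_state_comment"] == "Synced"
        if behind > 0 and queued > queue_threshold and synced:
            lagging.append((name, behind, queued))
    return lagging


# Hypothetical sample values for illustration only.
sample = {
    "node1": {"wsrep_last_committed": "1500", "wsrep_local_recv_queue": "0",
              "wsrep_local_state_comment": "Synced"},
    "node2": {"wsrep_last_committed": "1500", "wsrep_local_recv_queue": "0",
              "wsrep_local_state_comment": "Synced"},
    "node3": {"wsrep_last_committed": "1420", "wsrep_local_recv_queue": "37",
              "wsrep_local_state_comment": "Synced"},
}
print(find_lagging_nodes(sample))
```

Running such a check twice, a few seconds apart, would also show whether wsrep_last_committed on the flagged node is frozen or merely behind.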

Also, replication of DML (especially deletes and updates) across the cluster nodes is very slow: a delete of 10k rows takes 2 minutes and an update takes 4 minutes to sync across all the nodes. We tried higher values for evs.send_window and wsrep_slave_threads, but performance did not change.

All the servers involved have 4 CPUs and 32 GB RAM. RTT between nodes is under 0.3 ms:
rtt min/avg/max/mdev = 0.231/0.262/0.275/0.027 ms
rtt min/avg/max/mdev = 0.198/0.250/0.291/0.043 ms
rtt min/avg/max/mdev = 0.206/0.233/0.250/0.019 ms

my.cnf values for the first node:

node1

# this is only for the mysqld standalone daemon
[mysqld]
datadir=/data/mariadata/mysql
socket=/data/mariadata/mysql/mysql.sock
log_error=/data/mariadata/log/mysqld.log
lower_case_table_names = 1
log_bin_trust_function_creators = ON
max_connections=1000
 
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=0
innodb_buffer_pool_size=24G
 
#replication
server_id=1
gtid_domain_id =21
log_bin=/data/mariadata/binlogs/mariadb-bin
relay_log=/data/mariadata/relaylogs/relay-bin
log_slave_updates=ON
expire_logs_days = 7
 
#
# * Galera-related settings
#
[galera]
# Mandatory settings
 
#Galera provider Configuration
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
# Optional settings
wsrep_slave_threads=4
wsrep_provider_options="gcache.size=500M;gcache.page_size=500M;pc.weight=2"
 
#Galera cluster configuration
wsrep_cluster_name="galera-dev"
wsrep_cluster_address="gcomm://ip1, ip2, ip3"
 
#Galera Node Configuration
wsrep_node_name="hostname1"
wsrep_node_address="ip1"
 
#
# Allow server to accept connections on all interfaces.
#
bind-address=0.0.0.0
#
#Galera sst configuration
wsrep_sst_method=rsync
 
#replication
wsrep_gtid_mode=ON
wsrep_gtid_domain_id=5
 
# this is only for embedded server
[embedded]

Attaching the session logs taken from Galera nodes and the mysqld log files.

