[MDEV-33166] Misleading wsrep_cluster_size value immediately after starting mariadb service
Created: 2024-01-03  Updated: 2024-01-03

Status: Open
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.6.16
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Karl Levik
Assignee: Unassigned
Resolution: Unresolved
Votes: 0
Labels: None
Environment: RHEL7

Description

I run a medium-to-large database on a 3-node MariaDB Galera cluster on RHEL7, and I have just upgraded from 10.6.15 to 10.6.16.

While upgrading, I noticed something a bit strange:

After upgrading the RPM packages, I started the mariadb service (sudo systemctl start mariadb) and immediately ran the following command:

mariadb -e "show status where variable_name in ('wsrep_ready', 'wsrep_local_state_comment', 'wsrep_connected', 'wsrep_cluster_size');"

... which gave the following output:

+---------------------------+-----------------------------------+
| Variable_name             | Value                             |
+---------------------------+-----------------------------------+
| wsrep_local_state_comment | Joining: receiving State Transfer |
| wsrep_cluster_size        | 3                                 |
| wsrep_connected           | ON                                |
| wsrep_ready               | OFF                               |
+---------------------------+-----------------------------------+

The Knowledge Base defines the wsrep_cluster_size status variable as:

Number of nodes currently in the cluster.

I believe the initial wsrep_cluster_size value of 3 is misleading. When I re-ran the command shortly afterwards, it reported wsrep_cluster_size as 2, and it stayed at 2 (the other status variables remained unchanged) until the node finished syncing and I finally got:

+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
| wsrep_cluster_size        | 3      |
| wsrep_connected           | ON     |
| wsrep_ready               | ON     |
+---------------------------+--------+

I reproduced the issue on another node when I did the same there.
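
To capture how the values change over time, the same query can be polled in a loop right after the service is started. The following is only a minimal sketch: the function names summarize_wsrep and watch_wsrep are hypothetical, and it assumes the mariadb client is called with -N -B (--skip-column-names, --batch) so that the output is tab-separated with one variable per line.

```shell
# Hypothetical helper: reads tab-separated "name<TAB>value" lines on stdin
# and prints the four wsrep values on a single line.
summarize_wsrep() {
    awk -F '\t' '
        { status[$1] = $2 }
        END {
            printf "state=%s size=%s connected=%s ready=%s\n",
                status["wsrep_local_state_comment"],
                status["wsrep_cluster_size"],
                status["wsrep_connected"],
                status["wsrep_ready"]
        }'
}

# Hypothetical helper: polls the status query from this report every
# $1 seconds (default 1) and stops once the node reports "Synced".
watch_wsrep() {
    interval="${1:-1}"
    while :; do
        line=$(mariadb -N -B -e "show status where variable_name in ('wsrep_ready', 'wsrep_local_state_comment', 'wsrep_connected', 'wsrep_cluster_size');" | summarize_wsrep)
        # Prefix each sample with a UTC timestamp so the sequence is visible.
        printf '%s %s\n' "$(date -u +%H:%M:%S)" "$line"
        case "$line" in
            *"state=Synced "*) break ;;
        esac
        sleep "$interval"
    done
}
```

Running watch_wsrep immediately after sudo systemctl start mariadb should print one timestamped line per sample, which makes it easier to see when wsrep_cluster_size changes relative to wsrep_local_state_comment.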

Perhaps this is "as designed", but the value seems misleading to me. Or maybe I am just misunderstanding the purpose of this variable.


Generated at Thu Feb 08 10:36:52 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.