[MDEV-37096] Flow control triggered in IST - Jira

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.6.20
Fix Version/s: None
Component/s: Galera
Labels: None
Environment: RHEL 8

Description

We have a Galera replication cluster with 2 DB nodes and 1 arbitrator. We use HAProxy to direct all transactions to DB node 1, which acts as the primary, while DB node 2 serves as a replica.

Scenario:

JMeter sends transactions to DB node 1 and keeps running for the whole scenario.
After JMeter has been running for 10 minutes, we shut down the DB node 2 VM.
We wait 30 minutes.
We resume the DB node 2 VM and its DB server.
DB node 2 starts recovering with IST.
We monitor the writeset queue size and flow control on DB node 2. We have gcs.fc_limit=500.
The writeset queue size keeps increasing and reaches about 1 million after about 10 minutes. No flow control is triggered during this time.
Then the writeset queue size starts decreasing (I guess the node is processing the IST).
Before DB node 2 finishes IST, we notice that it triggers flow control from time to time.
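
For reference, the kind of monitoring described in the steps above can be sketched roughly as follows. This is only an illustration, not the exact tooling used in the test: it assumes the rows come from `SHOW GLOBAL STATUS LIKE 'wsrep_%'` on DB node 2, the helper name `summarize_wsrep_status` is hypothetical, and the sample values are placeholders rather than captured output.

```python
from typing import Any, Dict, Iterable, Tuple

FC_LIMIT = 500  # gcs.fc_limit used in this scenario


def summarize_wsrep_status(
    rows: Iterable[Tuple[str, str]], fc_limit: int = FC_LIMIT
) -> Dict[str, Any]:
    """Summarize (Variable_name, Value) rows from SHOW GLOBAL STATUS LIKE 'wsrep_%'.

    Hypothetical helper: it only reads standard wsrep status variables and
    compares the receive queue length with the configured fc_limit.
    """
    status = {name: value for name, value in rows}
    queue = int(status.get("wsrep_local_recv_queue", 0))
    return {
        "state": status.get("wsrep_local_state_comment", "unknown"),
        "recv_queue": queue,
        # Number of flow control pause messages this node has sent.
        "fc_sent": int(status.get("wsrep_flow_control_sent", 0)),
        # Fraction of time replication was paused by flow control.
        "fc_paused": float(status.get("wsrep_flow_control_paused", 0.0)),
        "queue_over_limit": queue > fc_limit,
    }


# Placeholder rows (not real output) resembling the phase where the queue
# is far above fc_limit but no flow control has been sent.
sample_rows = [
    ("wsrep_local_state_comment", "Joined"),
    ("wsrep_local_recv_queue", "1000000"),
    ("wsrep_flow_control_sent", "0"),
    ("wsrep_flow_control_paused", "0.0"),
]

summary = summarize_wsrep_status(sample_rows)
print(summary)
```

Polling this periodically and logging `recv_queue` together with `fc_sent` would show whether the flow control messages start only once the queue begins to drain.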

I would like to know whether flow control being triggered during IST is normal. My concern is that the node does not trigger flow control while the writeset queue is growing and has exceeded fc_limit, yet it does trigger flow control while the queue is shrinking. I would also appreciate it if someone could explain the mechanism of Galera recovery (IST and SST) in a situation like this, as I can't find related information on the web.

People

Assignee: Unassigned

Reporter: Chow King Tak

Votes: 0

Watchers: 1

Dates

Created: 2025-06-27 09:19

Updated: 2025-06-27 09:19
