[MDEV-34920] Galera: History list length clearing after a very long query causes significant cluster stall

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.6.16
Fix Version/s: 10.6
Component/s: Galera
Labels: None

Description

Environment: 3-node Galera cluster with MaxScale read/write splitting, CentOS 7.9, MariaDB 10.6.16 Community Edition.

The database is busy (continual traffic of 1,000+ updates/second), and a very long (5+ hour) query runs daily at ~3am on a "reader" node.

When this query completes, the InnoDB history list length is usually 700,000 or higher.
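
For anyone trying to reproduce or track this, the history list length appears as a `History list length` line in the TRANSACTIONS section of `SHOW ENGINE INNODB STATUS`. Below is a minimal Python sketch that pulls the number out of that output. The function name and the sample text are illustrative assumptions, not data from the affected cluster.

```python
import re
from typing import Optional

# Matches the "History list length N" line that InnoDB prints in the
# TRANSACTIONS section of SHOW ENGINE INNODB STATUS.
_HLL_RE = re.compile(r"^History list length\s+(\d+)\s*$", re.MULTILINE)


def parse_history_list_length(status_text: str) -> Optional[int]:
    """Return the history list length, or None if the line is absent."""
    match = _HLL_RE.search(status_text)
    return int(match.group(1)) if match else None


# Illustrative sample only (hypothetical values, not from this report).
SAMPLE_STATUS = """\
------------
TRANSACTIONS
------------
Trx id counter 123456789
History list length 712345
"""

if __name__ == "__main__":
    print(parse_history_list_length(SAMPLE_STATUS))
```

Sampling this value at intervals while the long query runs, and again after it completes, would show how quickly the backlog builds and how long it takes to drain.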

Some time after the query completes, the entire cluster stops processing transactions, because the node where the query was running stops everything else it is doing while it clears out the undo data. That node stops responding to anything, so commits pile up on the writer node until the reader finishes (see the attached screenshots).
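
One way to observe this kind of stall from outside is to watch the Galera status counters `wsrep_local_recv_queue` and `wsrep_flow_control_paused` on each node. The sketch below is a hedged illustration only: it assumes the status values have already been fetched into a dictionary (for example from `SHOW GLOBAL STATUS`), the function name and both thresholds are hypothetical, and whether flow control is the actual mechanism behind this report is not confirmed.

```python
from typing import Mapping


def node_is_stalling(
    status: Mapping[str, str],
    max_recv_queue: int = 1000,        # hypothetical threshold
    max_paused_fraction: float = 0.5,  # hypothetical threshold
) -> bool:
    """Flag a node whose receive queue is long or that reports heavy flow-control pausing.

    `status` maps Galera status variable names to their string values.
    Missing variables are treated as zero.
    """
    recv_queue = int(status.get("wsrep_local_recv_queue", "0"))
    paused = float(status.get("wsrep_flow_control_paused", "0"))
    return recv_queue > max_recv_queue or paused > max_paused_fraction


if __name__ == "__main__":
    # Hypothetical values for a node that has fallen behind.
    print(node_is_stalling({"wsrep_local_recv_queue": "5000",
                            "wsrep_flow_control_paused": "0.0"}))
```

The thresholds would need tuning per cluster; the point is only to correlate the stall window with the drop in history list length.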

I've previously tinkered with the InnoDB purge settings but to no real effect.

I know there are changes in later 10.6 versions related to undo retention in busy databases, but I'm not clear on whether they would have any impact on this.

Attachments

image-2024-09-13-10-34-00-428.png (284 kB, 2024-09-13 09:34)
image-2024-09-13-10-36-31-371.png (210 kB, 2024-09-13 09:36)
image-2024-09-13-10-39-19-780.png (117 kB, 2024-09-13 09:39)

People

Assignee: Julius Goryavsky (Inactive)

Reporter: Phil Sumner

Votes: 0

Watchers: 1

Dates

Created: 2024-09-13 09:41

Updated: 2025-02-06 21:39
