[MXS-5339] Slow servers may cause OOM situations if prepared statements are used

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 21.06.17, 22.08.14, 23.02.11, 23.08.7, 24.02.3, 24.08.0
Fix Version/s: 21.06.18, 22.08.15, 23.02.12, 23.08.8, 24.02.4, 25.01.1
Component/s: Protocol, readwritesplit
Labels: None

Description

If one of the backend servers is very slow and a connection pool uses a lot of prepared statements, the slow server will collect a backlog of prepared statements that it hasn't executed.

This can happen when the primary node of the cluster is much closer to the MaxScale instance than some of the replicas are. The response to a COM_STMT_PREPARE is returned very quickly from the primary, and the client is then free to execute more commands that may be propagated to all servers. The problem is made worse by the fact that readwritesplit prefers idle servers over busy ones, which means that a backend with a long backlog may never receive any read traffic. If it did receive some, the read would force a "synchronization" of the backend, because a read on a backend with queued commands requires the queue to be executed before the read can happen.
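The starvation effect described above can be illustrated with a small toy simulation. This is only a sketch and is not taken from the MaxScale source: the `Backend` class, the `simulate` function, the tick-based timing and the speed values are all hypothetical, and the primary is left out for brevity. Each tick, one prepare is queued on every replica and one read is routed to an idle replica if there is one.

```python
COMMAND_COST = 10  # work units needed to execute one queued command


class Backend:
    """Toy model of one backend connection (hypothetical, not MaxScale code)."""

    def __init__(self, name, units_per_tick):
        self.name = name
        self.units_per_tick = units_per_tick  # how fast this server works
        self.backlog = 0                      # queued, unexecuted commands
        self.credit = 0                       # accumulated work units

    def tick(self):
        # Do one tick of work, completing as many queued commands as possible.
        self.credit += self.units_per_tick
        while self.backlog and self.credit >= COMMAND_COST:
            self.backlog -= 1
            self.credit -= COMMAND_COST
        if not self.backlog:
            self.credit = 0  # an idle server cannot bank unused time


def simulate(ticks):
    fast = Backend("fast_replica", units_per_tick=20)
    slow = Backend("slow_replica", units_per_tick=1)
    replicas = [fast, slow]
    reads = {b.name: 0 for b in replicas}

    for _ in range(ticks):
        # A prepare is propagated to every backend.
        for b in replicas:
            b.backlog += 1
        for b in replicas:
            b.tick()
        # Route one read: prefer an idle backend, else the shortest backlog.
        idle = [b for b in replicas if b.backlog == 0]
        target = idle[0] if idle else min(replicas, key=lambda b: b.backlog)
        target.backlog += 1
        reads[target.name] += 1

    return reads, slow.backlog
```

With these made-up speeds, `simulate(100)` routes all 100 reads to `fast_replica` and leaves `slow_replica` with 90 unexecuted commands, so the slow backend is never forced to catch up.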

One solution would be to close the connections to some servers if the backlog gets too long. Another would be to simply ignore the idleness of the backends and route reads to servers that are executing queries. The latter approach would solve the problem in most cases, but it would cause latency spikes and is not 100% guaranteed to solve it.
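The first option, closing a connection once its backlog gets too long, could look roughly like the sketch below. The ticket does not say how the fix was actually implemented, so everything here is an assumption for illustration: the `BackendConnection` class, the `queue_command` method and the `MAX_BACKLOG` limit are hypothetical names and do not correspond to MaxScale code or configuration parameters.

```python
MAX_BACKLOG = 50  # hypothetical limit on queued, unexecuted commands


class BackendConnection:
    """Toy backend connection that is closed when its backlog grows too long."""

    def __init__(self, name):
        self.name = name
        self.backlog = []
        self.open = True

    def queue_command(self, command, max_backlog=MAX_BACKLOG):
        """Queue a command, or close the connection if the backlog is full.

        Returns True if the command was queued, False otherwise.
        """
        if not self.open:
            return False
        if len(self.backlog) >= max_backlog:
            self.close()
            return False
        self.backlog.append(command)
        return True

    def close(self):
        # Dropping the connection also frees the memory held by the backlog.
        self.open = False
        self.backlog.clear()
```

For example, with `max_backlog=3` the first three calls to `queue_command` succeed, the fourth closes the connection and discards the queued commands, and every later call is rejected. The trade-off is that the session loses that backend instead of letting the queue grow without bound.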

People

Assignee: markus makela

Reporter: markus makela

Votes: 0

Watchers: 3

Dates

Created: 2024-10-09 08:40

Updated: 4 days ago 11:47

Resolved: 2024-10-13 07:05
