[MXS-4161] MaxScale System Diagnostics - Jira

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 22.08.2
Component/s: Core
Labels: None
Sprint: MXS-SPRINT-166

Description

MaxScale should provide a means by which it is easy to ascertain that the configuration is compatible with the resources available to MaxScale. Especially when MaxScale is running in a container, the resources (cores and memory) may be limited compared to what is available on the machine. If that is the case and the automatic configuration (in particular threads and query_classifier_cache_size) is relied upon, then MaxScale may end up using far more resources than are available, with crashes as the result.
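To illustrate the mismatch between machine and container resources, here is a minimal Python sketch that derives the usable cores and memory from cgroup v2 limit files. It assumes a cgroup v2 container, where memory.max holds a byte count or "max" and cpu.max holds "<quota> <period>". The function names are hypothetical and this is not how MaxScale itself performs detection.

```python
from typing import Optional, Tuple


def parse_memory_max(text: str) -> Optional[int]:
    """Parse cgroup v2 memory.max content: a byte count, or 'max' for no limit."""
    value = text.strip()
    return None if value == "max" else int(value)


def parse_cpu_max(text: str) -> Optional[float]:
    """Parse cgroup v2 cpu.max content '<quota> <period>' into a core count."""
    quota, period = text.split()
    return None if quota == "max" else int(quota) / int(period)


def effective_resources(host_cores: int, host_memory: int,
                        cpu_max_text: str, memory_max_text: str) -> Tuple[float, int]:
    """Return (cores, memory_bytes) that are actually usable under the limits."""
    cpu_limit = parse_cpu_max(cpu_max_text)
    mem_limit = parse_memory_max(memory_max_text)
    cores = host_cores if cpu_limit is None else min(host_cores, cpu_limit)
    memory = host_memory if mem_limit is None else min(host_memory, mem_limit)
    return cores, memory


# Illustrative numbers: a 16-core, 64 GiB host whose container is limited
# to 2 cores and 4 GiB. In a real container the two strings would be read
# from /sys/fs/cgroup/cpu.max and /sys/fs/cgroup/memory.max.
cores, memory = effective_resources(16, 64 * 2**30, "200000 100000", str(4 * 2**30))
print(f"usable cores: {cores}, usable memory: {memory // 2**30} GiB")
```

Automatic settings sized from the host values (16 cores, 64 GiB) rather than the limited values would be the situation this issue describes.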

Original description
================
MaxScale Allocated Memory Usage Estimation

While ~~MXS-3822~~ will tackle showing customers the current memory usage MaxScale is causing, we should also add an additional feature which shows customers "allocated" memory usage. This would be more of an estimate, but would give users direct feedback on how their MaxScale configuration creates the potential for memory usage.

For example, we know that the effect of query_classifier_cache_size is pretty straightforward to account for. We also know that each thread spawned by threads=N comes with its own cache. We should be able to show a worst-case estimate of memory usage based on this information.
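A rough sketch of such a worst-case estimate, in Python, might look like the following. The per-thread allowance is a placeholder assumption (this issue does not say how large the per-thread cost is), and the function names are hypothetical rather than anything MaxScale provides.

```python
GIB = 2**30
MIB = 2**20


def worst_case_allocated(cache_size: int, threads: int, per_thread_bytes: int) -> int:
    """Upper bound: the configured cache plus a fixed allowance per thread.

    per_thread_bytes is an assumed figure supplied by the caller.
    """
    return cache_size + threads * per_thread_bytes


def check_config(cache_size: int, threads: int, per_thread_bytes: int,
                 available: int):
    """Return (estimate, fits) where fits is True if the estimate is within budget."""
    estimate = worst_case_allocated(cache_size, threads, per_thread_bytes)
    return estimate, estimate <= available


# Illustrative values only: 2 GiB cache, 8 threads, 64 MiB assumed per thread,
# checked against 4 GiB of available memory.
estimate, fits = check_config(2 * GIB, 8, 64 * MIB, 4 * GIB)
print(f"{estimate / GIB:.2f} GiB allocated, fits: {fits}")
```

Reporting the estimate next to the available memory is the point: the user sees at a glance whether the configuration can exceed what the environment provides.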

Why?

MaxScale has many automatic configuration parameters (such as threads=auto or query_classifier_cache_size defaulting to 15% of detected memory). In most cases, these work well and set sensible defaults. However, in some cases other factors obstruct MaxScale's ability to properly detect underlying resources, and the default memory allocation can be overzealous. This behavior is non-obvious to most customers, and because MaxScale does not immediately use these memory allocations when it first starts up, many customers end up complaining about "memory leaks" or other "memory problems" as connections come in and MaxScale begins actively using the memory its configuration allows it to use.
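As a worked example of how the 15% default can overshoot, the Python snippet below applies it to the memory of a host and compares the result with a smaller container limit. The 64 GiB and 4 GiB figures are illustrative assumptions, and the helper name is hypothetical.

```python
GIB = 2**30

# Default fraction of detected memory, as stated in this issue (15%).
DEFAULT_CACHE_PERCENT = 15


def default_cache_size(detected_memory: int) -> int:
    """Cache size that results from taking 15% of the detected memory."""
    return detected_memory * DEFAULT_CACHE_PERCENT // 100


detected = 64 * GIB        # memory of the host (assumed figure)
container_limit = 4 * GIB  # memory actually granted (assumed figure)

cache = default_cache_size(detected)
print(f"default cache: {cache / GIB:.1f} GiB, "
      f"exceeds limit: {cache > container_limit}")
```

With these numbers the default cache alone is roughly 9.6 GiB, more than twice the 4 GiB the container may use, yet nothing fails until the cache actually fills.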

Showing customers clearly what MaxScale is configured to use will alert them immediately when something is wrong at the configuration level. Instead of assuming a leak or some other problem, customers will ask MariaDB why MaxScale expects to use so much more memory than they have allocated to the VM, node, etc., which makes for a much more productive conversation with MariaDB.

Starting Point

This request could be seen as all-encompassing, which could make it difficult to implement properly. For example, calculating expected memory usage for various filters may be impossible or extremely difficult. The same applies to estimating memory usage based on a potentially infinite number of incoming connections.

Initial feature delivery could dodge complicated situations by scoping appropriately and communicating that scoping to users.

For connection-count-based estimations, a simple starting point could be to let the customer enter that value, so customers can see the expected worst-case memory usage (excluding the impact of filters) for various concurrent connection targets and use it for planning purposes. Down the road, it may be feasible to harvest or prepopulate sensible values based on the backend servers' max_connections or on the configuration of max_routing_connections when that is enabled.
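The user-supplied connection targets could be turned into a small planning table along these lines. This is only a Python sketch: the base allocation and the per-connection cost are assumed inputs that the user or a later implementation would have to provide, and filters are deliberately left out, as scoped above.

```python
GIB = 2**30
KIB = 2**10


def estimate_for_connections(base_bytes: int, per_connection_bytes: int,
                             connections: int) -> int:
    """Worst-case memory for a given number of concurrent connections."""
    return base_bytes + per_connection_bytes * connections


def planning_table(base_bytes: int, per_connection_bytes: int, targets):
    """Return (connections, estimated_bytes) pairs for each target."""
    return [(n, estimate_for_connections(base_bytes, per_connection_bytes, n))
            for n in targets]


# Illustrative inputs: 2 GiB base allocation and an assumed 256 KiB per
# connection, evaluated for three user-entered connection targets.
for connections, total in planning_table(2 * GIB, 256 * KIB, [100, 1000, 10000]):
    print(f"{connections:>6} connections: {total / GIB:.2f} GiB")
```

Keeping the per-connection cost as an explicit input makes the scoping visible: the table is only as good as that assumed figure.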

Issue Links

relates to: MXS-3822 MaxScale Global Memory Use Indicator (Closed)

People

Assignee: Johan Wikman
Reporter: Rob Schwyzer
Votes: 0
Watchers: 5

Dates

Created: 2022-06-07 20:36
Updated: 2024-07-07 16:39
Resolved: 2022-09-22 09:14
