[MDEV-24472] Out of memory error despite system reporting 8GiB of available memory - Jira

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.4.17
Fix Version/s: None
Component/s: Server
Labels:
- galera
- innodb
Environment:
Debian 10.7: 4.19.0-11-amd64 #1 SMP Debian 4.19.146-1 (2020-09-17) x86_64 GNU/Linux

Description

Hello, this either smells like a bug, or I've been looking in the wrong places during troubleshooting.

This issue has manifested twice, once on 10.3 and once on 10.4.17. The environment is a three-node production Galera cluster, and so far the errors in question have appeared on the node to which we're sending the writes (i.e. the "master").

At the time of the failure, the entire cluster seems to stall, and apps depending on database functions start to fail as well. We monitor several MariaDB status variables (through a slightly modified https://github.com/uvoteam/mysql-monitoring), and only `Innodb_row_lock_waits` seems to fluctuate significantly while the issue is manifesting. The stall lasted between 5 and 10 minutes. The MariaDB process was not killed, the cluster self-healed without admin intervention, and the only entries in the logs that seem relevant are several repetitions of the following:

2020-12-22 10:19:26 753210 [Warning] Aborted connection 753210 to db: 'unconnected' user: 'unauthenticated' host: 'connecting host' (Out of memory.)

Our setup does involve a lot of short-lived connections from PHP scripts (peaking at 4-5K during the day).

Our monitoring service indicated that the host had about 8GiB of available memory (although this includes filesystem page cache) at the moment these lines were printed in the error.log.

MariaDB runs in a VM configured with 40GiB of RAM, which is presented as two virtual NUMA nodes (to mirror the VM host's hardware architecture, hoping to improve performance by helping the kernel schedule processes more effectively while taking memory and cache locality into consideration).
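Since the memory is split across two NUMA nodes, one thing that might be worth checking is how free memory is distributed between them, because a host-wide "available" figure does not show a per-node imbalance. The following is only a sketch: it assumes the standard Linux sysfs files `/sys/devices/system/node/node<N>/meminfo` are present, and the helper names `parse_node_meminfo` and `free_per_node_mib` are made up for illustration.

```python
import glob
import re
from pathlib import Path

# Line format of /sys/devices/system/node/node<N>/meminfo, for example:
#   "Node 0 MemFree:         1234567 kB"
# Lines without a kB value (e.g. HugePages counters) are skipped on purpose.
_LINE = re.compile(r"^Node\s+(\d+)\s+(\w+):\s+(\d+)\s*kB", re.MULTILINE)


def parse_node_meminfo(text):
    """Return {node_id: {field_name: value_in_kB}} from per-node meminfo text."""
    nodes = {}
    for node, field, kib in _LINE.findall(text):
        nodes.setdefault(int(node), {})[field] = int(kib)
    return nodes


def free_per_node_mib(text):
    """Return {node_id: MemFree in MiB}; a node missing the field reports 0."""
    parsed = parse_node_meminfo(text)
    return {node: fields.get("MemFree", 0) // 1024 for node, fields in parsed.items()}


if __name__ == "__main__":
    # Concatenate the meminfo file of every NUMA node found on this host.
    paths = sorted(glob.glob("/sys/devices/system/node/node*/meminfo"))
    text = "".join(Path(p).read_text() for p in paths)
    print(free_per_node_mib(text))
```

Running this at the moment the warnings appear would show whether one node is close to empty while the other still has free pages. `MemFree` excludes page cache, so it will read lower than an "available" figure.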

Some relevant configuration options are the following:

./my.cnf:sort_buffer_size        = 4M
./my.cnf:bulk_insert_buffer_size = 16M
./my.cnf:key_buffer_size         = 128M
./my.cnf:myisam_sort_buffer_size = 512M
./my.cnf:read_buffer_size        = 2M
./my.cnf:read_rnd_buffer_size    = 1M
./my.cnf:innodb_buffer_pool_size = 24G
./my.cnf:innodb_log_buffer_size  = 8M
./my.cnf:key_buffer              = 16M
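As a rough sanity check of these settings, the usual worst-case arithmetic (global buffers plus connections times per-connection buffers) can be sketched as below. This is an upper bound under stated assumptions, not a measurement. It assumes every connection allocates its full sort, read and read-rnd buffers at the same time, which real workloads rarely do. It leaves out `bulk_insert_buffer_size`, `myisam_sort_buffer_size`, the `key_buffer` line, thread stacks and other server overhead. The function names are hypothetical.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Global allocations, values copied from the my.cnf excerpt.
GLOBAL_BUFFERS = {
    "innodb_buffer_pool_size": 24 * GIB,
    "innodb_log_buffer_size": 8 * MIB,
    "key_buffer_size": 128 * MIB,
}

# Buffers a connection may allocate on demand (assumed all in use at once).
PER_CONNECTION_BUFFERS = {
    "sort_buffer_size": 4 * MIB,
    "read_buffer_size": 2 * MIB,
    "read_rnd_buffer_size": 1 * MIB,
}


def worst_case_bytes(connections):
    """Global buffers plus full per-connection buffers for every connection."""
    per_conn = sum(PER_CONNECTION_BUFFERS.values())
    return sum(GLOBAL_BUFFERS.values()) + connections * per_conn


def connections_that_fit(ram_bytes):
    """How many connections fit in ram_bytes under the same worst-case model."""
    headroom = max(ram_bytes - sum(GLOBAL_BUFFERS.values()), 0)
    return headroom // sum(PER_CONNECTION_BUFFERS.values())


if __name__ == "__main__":
    for n in (1000, 4000, 5000):
        print(f"{n} connections: {worst_case_bytes(n) / GIB:.1f} GiB worst case")
    print("connections fitting in 40 GiB:", connections_that_fit(40 * GIB))
```

Under this model the 4-5K connection peak could in principle ask for more than the 40GiB the VM has, so the configured buffers do not by themselves rule out a configuration cause.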

If there's any other information I can produce to determine if this is a mariadb-server bug or a configuration issue, do let me know. Thank you.

People

Assignee: Unassigned
Reporter: George Diamantopoulos
Votes: 0
Watchers: 3

Dates

Created: 2020-12-22 16:19
Updated: 2020-12-22 16:19