[MDEV-30848] Memory leak in MariaDB 10.6 Created: 2023-03-14  Updated: 2023-11-28  Resolved: 2023-11-28

Status: Closed
Project: MariaDB Server
Component/s: Server
Affects Version/s: 10.6.12
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Miroslav Lachman Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 3
Labels: crash, performance, regression
Environment:

FreeBSD 12.3-RELEASE-p10 amd64 GENERIC
Virtual Machine with 22GB of RAM + 8 vCPU


Attachments: PNG File 2023-06-30_durian_mysql_memory-month.png     PNG File Screenshot 2023-04-26 at 13.54.43.png    
Issue Links:
Relates
relates to MDEV-29988 Major performance regression with 10.... Closed
relates to MDEV-30096 Memory leak 10.3.37 and 10.6.11 Closed

 Description   

I already reported the "memory leaks" we have seen after upgrading from 10.3 to 10.6 in comments on MDEV-30096 and MDEV-29988, but it seems there is still another one.
Our servers have been running the same applications and the same workload for a long time, but since we upgraded MariaDB to 10.6 we see OOM crashes far too often, even though we have already doubled the RAM of the affected VMs.

This machine is configured so that it should use at most about 17GB of memory (if this calculation is right and if there are 100 connections):

mysql> SELECT @@key_buffer_size / 1048576,
    -> @@query_cache_size / 1048576,
    -> @@innodb_buffer_pool_size / 1048576,
    -> @@innodb_log_buffer_size / 1048576,
    -> @@max_connections,
    -> @@read_buffer_size / 1048576,
    -> @@read_rnd_buffer_size / 1048576,
    -> @@sort_buffer_size / 1048576,
    -> @@join_buffer_size / 1048576,
    -> @@binlog_cache_size / 1048576,
    -> @@thread_stack / 1048576,
    -> @@tmp_table_size / 1048576,
    -> ( @@key_buffer_size
    -> + @@query_cache_size
    -> + @@innodb_buffer_pool_size
    -> + @@innodb_log_buffer_size
    -> + @@max_connections * (
    ->  @@read_buffer_size
    ->  + @@read_rnd_buffer_size
    ->  + @@sort_buffer_size
    ->  + @@join_buffer_size
    ->  + @@binlog_cache_size
    ->  + @@thread_stack
    ->  + @@tmp_table_size )
    -> ) / (1024 * 1024 * 1024) AS MAX_MEMORY_GB\G
*************************** 1. row ***************************
        @@key_buffer_size / 1048576: 128.0000
       @@query_cache_size / 1048576: 1.0000
@@innodb_buffer_pool_size / 1048576: 1024.0000
 @@innodb_log_buffer_size / 1048576: 8.0000
                  @@max_connections: 100
       @@read_buffer_size / 1048576: 1.0000
   @@read_rnd_buffer_size / 1048576: 2.0000
       @@sort_buffer_size / 1048576: 8.0000
       @@join_buffer_size / 1048576: 2.0000
      @@binlog_cache_size / 1048576: 1.0000
           @@thread_stack / 1048576: 0.2852
         @@tmp_table_size / 1048576: 150.0000
                      MAX_MEMORY_GB: 17.1773

(Is the calculation above right?)
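For reference, the same arithmetic done by hand with the numbers from the output above:

global buffers  = 128 + 1 + 1024 + 8               = 1161 MB
per connection  = 1 + 2 + 8 + 2 + 1 + 0.285 + 150  ≈ 164.3 MB
maximum         = 1161 + 100 * 164.3               ≈ 17590 MB ≈ 17.18 GB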

But even when there are no more than 5 connections (as listed by SHOW PROCESSLIST or by the mytop tool), MariaDB eats more than 18GB and is then killed by the OOM killer.

Output of top taken 40 seconds before the OOM kill:

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
 1632 mysql        37  20    0    21G    18G select   3  44:45  62.89% /usr/local/libexec/mariadbd --defaults-extra-file=/usr/local/etc/mysql/my.cnf --basedir=/usr/local --datadir=/var/db/mysql --plugin-dir=/usr/local/lib/mysql/plugin --wsrep_on=1 --wsrep_provider=none --wsrep_on=1 --wsrep_provider=none --log-error=/var/db/mysql/mysql.err --pid-file=/var/run/mysql/mysqld.pid --socket=/var/run/mysql/mysql.sock --port=3306

If I take the maximum number of connections seen today (14) * 164MB per thread + 1161MB global, it should not take more than 3.5GB!
We have graphs in monitoring where we see connections (14 max) and queries per second (49 avg, 330 max).

We see this behavior on other machines with MariaDB 10.6.12 too. The more RAM we add to a VM, the more MariaDB uses, and it still crashes often.

What is the right way to identify what in MariaDB eats so much memory and does not release it? (There are no long-running persistent connections, only periodic tasks run from cron.)



 Comments   
Comment by Miroslav Lachman [ 2023-03-15 ]

We added another 12GB of memory (34GB total), but MariaDB ate it all again and crashed 3 times today (killed by OOM).

Mar 15 10:43:03 durian kernel: pid 1646 (mariadbd), jid 0, uid 88, was killed: out of swap space
Mar 15 11:44:11 durian kernel: pid 92152 (mariadbd), jid 0, uid 88, was killed: out of swap space
Mar 15 13:56:49 durian kernel: pid 11875 (mariadbd), jid 0, uid 88, was killed: out of swap space

When I last saw it before a crash, mariadbd was using 28GB of memory with just 2 active connections listed in the PROCESSLIST. Then it ate more memory and got killed.

It is really far beyond the configured limits.

Comment by Miroslav Lachman [ 2023-03-27 ]

Is nobody interested in this serious regression bug?

The machine now has 46GB of RAM configured, but MariaDB still crashes every day with "out of memory".

Comment by Vincent Jancso [ 2023-04-26 ]

We are also experiencing some memory leaks after upgrading from MariaDB 10.3 to 10.6.12 running on FreeBSD 12.4.

Comment by Daniel Black [ 2023-07-10 ]

Can you try enabling the performance schema and the instruments for memory (https://mariadb.com/kb/en/performance-schema-tables/)?

https://mariadb.com/kb/en/memory-is-leaking/#performance-schema
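Something along these lines (a sketch, assuming performance_schema=ON and the memory instruments enabled at server startup, e.g. performance-schema-instrument='memory/%=ON' in my.cnf) would show the largest live allocations per instrument:

SELECT event_name,
       current_count_used,
       current_number_of_bytes_used / 1048576 AS current_mb
FROM performance_schema.memory_summary_global_by_event_name
ORDER BY current_number_of_bytes_used DESC
LIMIT 20;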

Comment by Michael Widenius [ 2023-09-07 ]

It would have helped a lot to get information about where memory was used and what kind of queries were running.
Especially for the case of 2 threads using a lot of memory, having the output from the following could help in finding out what is wrong:

select * from information_schema.processlist;
show global status like "%mem%";

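On MariaDB, information_schema.PROCESSLIST also has per-connection MEMORY_USED and MAX_MEMORY_USED columns, so a more targeted variant of the first query (a sketch) would be:

select id, user, db, command, time, state,
       memory_used, max_memory_used, info
from information_schema.processlist
order by memory_used desc;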

Regarding the calculation of memory used: some of the variables can be allocated multiple times for a single query (join_buffer_size, for example, can be allocated once per table in a join), so the computed maximum is not a hard upper bound.
Still, that does not explain your problems with 2 threads.

Comment by Miroslav Lachman [ 2023-11-28 ]

I am sorry for the long time without a reply. Our problem was probably solved by the upgrade to MariaDB 10.6.13.
No more crashes from memory leaks.
