[MDEV-32644] Possible memory leak with 0 active connection , Memory utilization is high Created: 2023-11-01  Updated: 2023-12-29  Resolved: 2023-12-15

Status: Closed
Project: MariaDB Server
Component/s: Server
Affects Version/s: 10.5.22
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: Pramod Mahto Assignee: Kwangbock Lee
Resolution: Incomplete Votes: 0
Labels: triage


 Description   

With no active connections (an idle server), MariaDB 10.5.22 memory usage is still high:

Checking on the OS side, it goes up to approximately 45 GB. Around 10 GB of that has no explanation.

cat /proc/$mariadb_pid/status

VmPeak: 56917272 kB   (~54 GB)
VmSize: 55658852 kB   (~53 GB)
VmHWM:  49557112 kB   (~47 GB)
VmRSS:  48393924 kB   (~46 GB)
 
Meanwhile, "Total large memory allocated" reports 35433480192 bytes (~35 GB).
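The unexplained gap can be sketched from the figures above (a rough check; using binary GiB throughout, which is why it comes out somewhat larger than the ~10 GB estimate):

```shell
# Rough arithmetic on the reported figures: VmRSS (kB, from
# /proc/$pid/status) minus "Total large memory allocated" (bytes).
rss_kb=48393924
alloc_bytes=35433480192
awk -v rss="$rss_kb" -v alloc="$alloc_bytes" \
    'BEGIN { printf "unexplained gap: %.1f GiB\n", rss/1024/1024 - alloc/1024/1024/1024 }'
```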

While testing other versions (fresh installations, default parameters, no user databases or tables):

MariaDB 10.5.9

[root@mariadb_stand ~]# cat /proc/3240/smaps | grep -i pss |  awk '{Total+=$2} END {print Total/1024" MB"}'
 
75.8076 MB
 
Total large memory allocated 160 MB 
Buffer pool  128 MB 
 
 
After doing some transactions and a bulk insert:
 
[root@mariadb_stand ~]# cat /proc/3240/smaps | grep -i pss |  awk '{Total+=$2} END {print Total/1024" MB"}'
291.79 MB

10.5.17

[root@mariadb_stand ~]# cat /proc/2407/smaps | grep -i pss |  awk '{Total+=$2} END {print Total/1024" MB"}'
 
292.969 MB
 
Total large memory allocated  160 MB 
Buffer pool  128 MB 

10.5.22

[root@mariadb_stand ~]# cat /proc/3587/smaps | grep -i pss |  awk '{Total+=$2} END {print Total/1024" MB"}'
206.811 MB
 
Total large memory allocated 160 MB 
Buffer pool  128 MB 



 Comments   
Comment by Sergei Golubchik [ 2023-11-03 ]

Is that memory usage on startup? That is, you start the server — no connections yet — and it takes 35GB?

Comment by MikaH [ 2023-11-06 ]

What is the OS and version?

Comment by Vignesh Viswanathan [ 2023-11-08 ]

Hi,

We are observing a similar high-memory-utilization issue with MariaDB 10.5.11 deployed on Windows Server 2019 Standard with 16 GB RAM, which leads to the server becoming unusable until a reboot. This repeats every 14 days.

There are indications in the server's Application event logs of a leaking memory process consuming all available memory and causing multiple applications to crash with System.OutOfMemoryException. Microsoft's RADAR_PRE_LEAK_64 entry in the Application Log indicates mysqld to be the probable cause.

We found the below entry in the system event log :
Event ID 2004 - Windows successfully diagnosed a low virtual memory condition. The following programs consumed the most virtual memory: mysqld.exe (2968) consumed 2546798592 bytes, firefox.exe (4008) consumed 300097536 bytes, and w3wp.exe (11088) consumed 285507584 bytes.

Please let us know if any more details are required from our end, or whether we should raise a new bug for this.

Thanks,
Vignesh

Comment by Vladislav Vaintroub [ 2023-11-09 ]

VVignesh, could you use UMDH to find the cause of the memory leak? The tool is described here: https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/using-umdh-to-find-a-user-mode-memory-leak. There is a blog entry, https://planet.mysql.com/entry/?id=15423, that describes how to analyze Windows memory leaks for the mysqld.exe process. It is a 15-year-old article and some things work differently now (apparently the download of Debugging Tools for Windows), but otherwise the description is still accurate.
The _NT_SYMBOL_PATH environment variable should be set to "https://msdl.microsoft.com/download/symbols;C:\Program Files\MariaDB 10.5\bin" for symbol resolution.

Comment by Michael Widenius [ 2023-11-29 ]

mysqld.exe (2968) consumed 2546798592 bytes
This is 2.3G, not even close to 16G.
firefox and w3wp.exe together consume about 0.5G. Note that firefox can easily consume gigabytes of memory if one watches videos, reads email, has many tabs open, etc. I have seen it consume 35G of memory on my system several times.
VVignesh, what are the sizes of your page buffers? Can you please add your my.cnf or the output of "mariadbd --print-defaults" to this MDEV?

Comment by Michael Widenius [ 2023-11-29 ]

It is normal that MariaDB does not release memory back to the OS; the memory will be reused by future queries.
This is one of the drawbacks of the default OS memory allocator combined with using threads: memory becomes fragmented, and the normal way to release memory back to the OS does not always work. This can also be seen with firefox, whose process constantly grows over time (it does grow and shrink, but the 'minimum memory used' constantly increases).
That is why one should always strive to have a dedicated server running MariaDB.

If using MariaDB on Linux, one can enable jemalloc to get better control of memory allocations. jemalloc may also be better at giving memory back to the OS, thanks to less memory fragmentation. See https://mariadb.com/kb/en/using-mariadb-with-tcmalloc-or-jemalloc/ for details on how to enable it.
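A quick way to verify whether a running server actually picked up jemalloc is to check its memory maps (a sketch; the `libjemalloc` filename pattern is an assumption and varies by distro):

```shell
# has_jemalloc: report whether a maps file mentions jemalloc.
# Run it against mariadbd's maps file, e.g.:
#   has_jemalloc /proc/"$(pidof mariadbd)"/maps
has_jemalloc() {
    grep -q 'libjemalloc' "$1" && echo "jemalloc loaded" || echo "jemalloc not loaded"
}
```

However jemalloc was enabled (LD_PRELOAD or otherwise, per the KB page above), the preloaded library should appear in the process's maps.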

Comment by Michael Widenius [ 2023-11-29 ]

I did a quick check of memory usage of MariaDB on my development machine, running a debug build of MariaDB 10.6.16:

((/my/maria-10.6/sql)) ls -lh mariadbd
-rwxrwxr-x 1 monty users 250M Nov 28 19:09 mariadbd
((/my/maria-10.6/sql)) size mariadbd
text data bss dec hex filename
30028640 3788952 9348736 43166328 292aa78 mariadbd

The above numbers are big because this is a debug binary with a lot of symbols.
However, this should not have any impact on memory allocated by MariaDB:

MariaDB [test]> show global status like "Innodb_buffer_pool_bytes_data";
+-------------------------------+---------+
| Variable_name                 | Value   |
+-------------------------------+---------+
| Innodb_buffer_pool_bytes_data | 5767168 |
+-------------------------------+---------+
MariaDB [test]> show global variables like "thread_stack";
+---------------+--------+
| Variable_name | Value  |
+---------------+--------+
| thread_stack  | 299008 |
+---------------+--------+
MariaDB [test]> show global status like "Memory_used";
+---------------+-----------+
| Variable_name | Value     |
+---------------+-----------+
| Memory_used   | 332823400 |
+---------------+-----------+
MariaDB [test]> show global variables like "innodb_buffer_pool_size";
+-------------------------+-----------+
| Variable_name           | Value     |
+-------------------------+-----------+
| innodb_buffer_pool_size | 134217728 |
+-------------------------+-----------+
MariaDB [test]> show global variables like "%buffer_size";
+----------------------------+-----------+
| Variable_name              | Value     |
+----------------------------+-----------+
| aria_pagecache_buffer_size | 8388608   |
| key_buffer_size            | 16777216  |
| s3_pagecache_buffer_size   | 134217728 |
+----------------------------+-----------+
MariaDB [test]> select 134217728+16777216+8388608+134217728;
+--------------------------------------+
| 134217728+16777216+8388608+134217728 |
+--------------------------------------+
|                            293601280 |
+--------------------------------------+
MariaDB [test]> select (134217728+16777216+8388608+134217728)/1024/1024;
+--------------------------------------------------+
| (134217728+16777216+8388608+134217728)/1024/1024 |
+--------------------------------------------------+
|                                     280.00000000 |
+--------------------------------------------------+

This shows that most of the allocated memory is assigned to storage engine buffers.
Each thread uses 292K of stack. After startup we have created 13 threads, temporarily
taking up 13*292K ≈ 3.7M of memory.
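That stack arithmetic can be checked directly (thread_stack is 299008 bytes, per the variable shown above):

```shell
# 13 threads, each with a 299008-byte (292K) stack
awk 'BEGIN { printf "%.1f MB\n", 13 * 299008 / 1024 / 1024 }'
```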

ps shows 1205M of memory usage for the mariadbd process. Some of this is virtual mapped
memory for the mariadbd binary and the shared libraries it uses, which makes the number a bit bigger:

cat /proc/34213/status | grep Vm
VmPeak: 1233876 kB
VmSize: 1168552 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 125996 kB
VmRSS: 109876 kB
VmData: 286648 kB
VmStk: 136 kB
VmExe: 29328 kB
VmLib: 22280 kB
VmPTE: 492 kB
VmSwap: 0 kB

VmData roughly matches the numbers above.

I did a quick search on Google:
https://stackoverflow.com/questions/17174645/vmsize-physical-memory-swap

"VmSize is the "address space" that the process has in use: the number
of available addresses. These addresses do not have to have any
physical memory attached to them. (Attached physical memory is the RSS
figure.)"

The fact that we are using threads may cause VmSize to be high, but as
seen above, this has nothing to do with memory allocation.

((/my/maria-10.6/sql)) cat /proc/35413/smaps | grep -i pss | awk '{Total+=$2} END {print Total/1024" MB"}'
100.49 MB

top shows when mariadbd is running:
MiB Mem : 257398.5 total,   3290.7 free,  38741.6 used, 228462.2 buff/cache
MiB Swap: 257431.0 total, 257408.3 free,     22.8 used. 218656.9 avail Mem

without mariadbd:
MiB Mem : 257398.5 total,   3394.4 free,  38638.0 used, 228462.1 buff/cache
MiB Swap: 257431.0 total, 257408.3 free,     22.8 used. 218760.5 avail Mem

Restarted:
MiB Mem : 257398.5 total,   3304.1 free,  38729.4 used, 228457.4 buff/cache
MiB Swap: 257431.0 total, 257408.3 free,     22.8 used. 218669.0 avail Mem

This shows a 100M difference (a bit too small; it should be closer to 200M, but still
nothing in the gigabyte range).

The main point is that I cannot see any excessive memory usage on my machine, so there is no way to repeat this.
(Note that VmSize is not allocated memory!)

Please include full configuration of your MariaDB server so that we can analyze what is happening.
If possible, please add monitoring of memory usage of both the server and mariadbd during the period of memory usage 'spikes'.
Knowing what causes memory usage will help find a solution for the problem.

When monitoring MariaDB, running the following queries every 15 seconds or every minute would probably help locate whether the problem is in MariaDB:
show global status like "Memory_used";
select * from information_schema.processlist;

In the processlist output, the MAX_MEMORY_USED column in particular should be helpful.
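A minimal polling skeleton for this could look as follows (a sketch; the `poll` stub keeps it self-contained, and in a real deployment it would be replaced by the actual client call, e.g. `mariadb -e 'SHOW GLOBAL STATUS LIKE "Memory_used"'`, with the sleep uncommented):

```shell
# Sample Memory_used on a fixed interval and timestamp each reading.
# poll is a stub standing in for the real query; swap it out as needed.
poll() { echo "Memory_used 332823400"; }
for i in 1 2 3; do                  # 'while true' in a real run
    printf '%s %s\n' "$(date -u +%FT%TZ)" "$(poll)"
    # sleep 15
done
```

Appending the output to a file gives a timeline that can be correlated with the OS-level memory spikes.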

Comment by Michael Widenius [ 2023-11-30 ]

If possible, you can try disabling transparent hugepages, as they may prevent malloc from releasing free memory back to the system:

echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/enabled

See https://blog.nelhage.com/post/transparent-hugepages/ for more information

Generated at Thu Feb 08 10:32:54 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.