Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version: 10.1.21
OS: CentOS 7
Physical/Virtual: Reproducible on both VMs and physical hardware, but primarily on physical. 1TB memory, dual 12-core CPUs.
Description
I have a 720GB database, an 808GB InnoDB buffer pool (innodb_buffer_pool_size), 1TB of system memory and 10 tables. As far as I can tell we shouldn't be running out of memory; however, the mysqld process steadily increases in memory consumption until the OOM killer rightfully claims it (we now restart it before that happens). After finding some relevant memory leak fixes released in 10.1.21, we upgraded to that from 10.1.13, but unfortunately it didn't help.
Running mysqltuner.pl (https://github.com/major/MySQLTuner-perl) seems to agree that we shouldn’t be using more than 809 GB.
mysqltuner snippet:
[--] Total buffers: 806.4G global + 2.9M per thread (1000 max threads)
[OK] Maximum reached memory usage: 808.2G (80.21% of installed RAM)
[OK] Maximum possible memory usage: 809.2G (80.31% of installed RAM)
At the time this data was gathered, the mysqld process was at 980GB (gathered with `ps -eo pid,comm,rss`), leaving the available system memory at 17GB.
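To make the gap concrete, here is a rough sketch of the arithmetic behind the mysqltuner figures. It assumes mysqltuner's G and M are binary units and treats the 980GB RSS reading as the same unit; the function names are made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

def max_possible_bytes(global_buffers_gib, per_thread_mib, max_threads):
    """Global buffers plus per-thread buffers for every allowed thread."""
    return global_buffers_gib * GIB + per_thread_mib * MIB * max_threads

def unaccounted_gib(rss_gib, expected_bytes):
    """Memory held by the process beyond the configured ceiling."""
    return rss_gib - expected_bytes / GIB

# Figures taken from the mysqltuner snippet and the ps reading above.
expected = max_possible_bytes(806.4, 2.9, 1000)
print(f"expected ceiling: {expected / GIB:.1f}G")
print(f"unaccounted:      {unaccounted_gib(980, expected):.1f}G")
```

Under those assumptions the ceiling comes out at about 809.2G, matching mysqltuner, which leaves roughly 170G of resident memory that the configured buffers do not explain.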
I can consistently reproduce this with similar workload on a scaled down test instance.
I have attached a binlog from my test instance, show variables, and a few grafana graphs.
Thanks for taking a look!
Disclaimer: this is only a rough concept that obviously requires testing in a non-production environment.
Disclaimer 2: This is general advice from a community user. I don't have much time, but I feel your pain enough to offer this on an as-is basis.
I'd look at using gdb to attach to the running process and print out information, along the lines of http://www.outofcore.com/2011/06/scripted-debug-using-gdb/.
So the points to trap would be (from source mysys/my_malloc.c):
https://github.com/MariaDB/server/blob/a14638581b4c8ef175e68dccff07967d819b3b7e/mysys/my_malloc.c#L130
Ideally, log the returned pointer, flags and size from the malloc/realloc calls (using ``print pointer, my_flags, size``), the oldptr from the realloc call, and the ptr argument from the free call. At each of these breakpoints, also log the backtrace of the current thread. Hopefully the output from this, with a bit of analysis, can identify the backtrace where memory is allocated but not freed.
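The "bit of analysis" could look something like the sketch below. The log format is purely hypothetical (one event per line: `malloc <ptr> <size> <backtrace-key>`, `realloc <oldptr> <newptr> <size> <backtrace-key>`, `free <ptr>`); a real gdb trace would need to be massaged into that shape first, with each backtrace collapsed into a single token.

```python
from collections import defaultdict

NULL = "0x0"

def find_unfreed(log_lines):
    """Return {backtrace_key: bytes still allocated} for a trace log."""
    live = {}  # pointer -> (size, backtrace_key)
    for line in log_lines:
        parts = line.split()
        if not parts:
            continue
        op = parts[0]
        if op == "malloc":
            _, ptr, size, bt = parts
            if ptr != NULL:  # a failed allocation holds no memory
                live[ptr] = (int(size), bt)
        elif op == "realloc":
            _, old, new, size, bt = parts
            if new == NULL:  # a failed realloc leaves the old block in place
                continue
            live.pop(old, None)
            live[new] = (int(size), bt)
        elif op == "free":
            live.pop(parts[1], None)
    totals = defaultdict(int)
    for size, bt in live.values():
        totals[bt] += size
    return dict(totals)

# Made-up sample: the second allocation is freed, the first is not.
sample = [
    "malloc 0x1000 64 handler::open",
    "malloc 0x2000 32 Item::new",
    "realloc 0x1000 0x3000 128 handler::open",
    "free 0x2000",
]
print(find_unfreed(sample))
```

Sorting the result by byte count should surface the call paths worth reading first. Allocations that are still legitimately in use at the end of the trace will show up too, so a steadily growing total across several traces is a stronger signal than any single run.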
gdb can attach to a process (using -p) and there is a batch mode for the script.
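As a small illustration of that invocation, the sketch below builds the gdb command line using the standard `-p`, `-batch` and `-x` options. The PID and the script name `trace_alloc.gdb` are placeholders, and the script's contents are not shown here.

```python
import shlex

def gdb_batch_command(pid, script_path):
    """Build the argv for attaching gdb to a PID and running a command file."""
    return ["gdb", "-p", str(pid), "-batch", "-x", script_path]

# Placeholder values; substitute the real mysqld PID and script path.
cmd = gdb_batch_command(12345, "trace_alloc.gdb")
print(shlex.join(cmd))
```

The list can be passed directly to `subprocess.run` with stdout redirected to a file. Keep in mind that attaching gdb stops the process at every breakpoint, so this belongs on the scaled-down test instance rather than production.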
I feel your pain about hard-to-debug memory leaks (mine was my second bug report, MDEV-4974). I wish I'd thought of this at that time (but I didn't know enough then).
PS. If you do write the gdb script, please include it for someone else.