[MDEV-15344] Huge memory usage on Maria 10.2.x cPanel Created: 2018-02-18 Updated: 2021-01-11
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Debug |
| Affects Version/s: | 10.2.13 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Neso | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 11 |
| Labels: | None |
| Environment: | CentOS 7.4 with CloudLinux - cPanel normal setup |
| Attachments: | |
| Issue Links: | |
| Description |
Hi, after we upgraded our servers to the latest 10.2 version, it started to use all RAM in a very short time on some servers. On the older release on the same servers there were never problems; even servers with 256GB of RAM have problems now, which we never expected. For example, we moved a server from 64 to 128GB of RAM and now RAM usage is over 100GB, where on 64GB it was about 50% with all the same sites and the same setup, only a different release of MariaDB. The only similar report I found is this: Any help or advice on what to do would be appreciated, because we need to restart MySQL on these servers every several days. The my.cnf file is attached. Again, this is a normal cPanel setup on CentOS 7.4, and all the memory problems started after the upgrade to 10.2; we never had this kind of problem with any older version of MariaDB. |
| Comments |
| Comment by Elena Stepanova [ 2018-02-18 ] |
| Which version did you upgrade from? |
| Comment by Neso [ 2018-02-18 ] |
| > Which version did you upgrade from? From a 10.1.x release. |
| Comment by Elena Stepanova [ 2018-02-18 ] |
| Thanks for the answers. I assume you have run mysql_upgrade after upgrading from 10.1 to 10.2? By workflow on the server I meant the MariaDB server, that is, what kind of statements are being executed: are those mostly writes or reads, InnoDB or MyISAM, direct or prepared statements or stored procedures, etc. Usually when a regression occurs, it is bound to some particular kind(s) of SQL. But this question is mainly relevant when we are talking about really short periods of time, minutes, hours at most. Of course, if it's days, there can be anything. svoj, would you be able to work with Neso to get access to a box and see what's happening there? |
| Comment by Neso [ 2018-02-18 ] |
| Yes, the upgrade was done via the cPanel standard upgrade process from 10.1.x to 10.2.x. We did try some tweaks to reduce memory usage on some servers and it did help; they work most of the time without needing a restart, but usage is still extremely high, regardless of how much memory the server has. Here is a screenshot from Munin: https://s3.amazonaws.com/upload.screenshot.co/e14a4a767c Thanks for the help. |
| Comment by Eri R Bastos [ 2018-04-17 ] |
| We are seeing something similar. Here is an example of a single physical server running 6 instances of MariaDB. The first 5 are 10.2 and the last one is 10.1. In our case we are using MyISAM as the engine and we have about 2 million tables per instance. We need to restart the daemons every few days to keep them from being OOM-killed. I have been trying to create a synthetic way to reproduce the problem for a few days now, but no luck yet. Not sure if this is related, but it looks like 10.2 moved away from jemalloc: Side note: we do not use cPanel. |
| Comment by Tom Parrott [ 2018-07-02 ] |
| I too am experiencing the same issues on CentOS 6.9 with MariaDB 10.2.15 and 10.2.16. |
| Comment by Tom Parrott [ 2018-07-02 ] |
| Looks like it's worse on servers with lots of tables... perhaps an InnoDB data dictionary leak? |
| Comment by Tom Parrott [ 2018-07-02 ] |
| I'm trying it with jemalloc to see if it helps: |
| Comment by Steven Irwin [ 2018-07-29 ] |
| Same problem here. Mine is a GoDaddy VPS on CentOS 6 with around 10 WordPress sites. |
| Comment by Tom Parrott [ 2018-07-29 ] |
| Installing jemalloc from the MariaDB repo, and then adding this to server.cnf, fixed it: [mysqld_safe] |
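The comment above kept only the section header from server.cnf; a minimal sketch of the setting being described (assuming the jemalloc package installs the library at /usr/lib64/libjemalloc.so.1, the usual path on CentOS x86_64) is:

```ini
# server.cnf: only honored when the server is started via mysqld_safe,
# not when systemd launches mysqld directly.
[mysqld_safe]
malloc-lib=/usr/lib64/libjemalloc.so.1
```

After a restart, SHOW GLOBAL VARIABLES LIKE 'version_malloc_library' should report a jemalloc version string instead of 'system' if the setting took effect.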
| Comment by Steven Irwin [ 2018-07-29 ] |
| Unfortunately it didn't fix it for me. |
| Comment by Tom Parrott [ 2018-07-31 ] |
| Could you post the output of free -m both after initially restarting and just before it uses all memory? When I had problems before, I found that the cached memory was very low, whereas switching to jemalloc improved the situation. |
| Comment by Nikolas Hermann [ 2018-07-31 ] |
| I had the same problem after upgrading from 10.1.x to 10.2.x (Galera, InnoDB only). @Steven Irwin: could you verify that jemalloc is actually being used? If it is not, there is probably an init system ignoring your [mysqld_safe] options (systemd...); check the mariadb-service-convert script. |
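A quick way to run that verification (a sketch; the server variable is available in the 10.1/10.2 versions discussed here) is to ask the server which allocator it loaded at startup:

```sql
-- Reports 'system' when glibc's ptmalloc is in use; reports a
-- jemalloc version string when the preload/malloc-lib setting took effect.
SHOW GLOBAL VARIABLES LIKE 'version_malloc_library';
```

Alternatively, grep jemalloc /proc/$(pidof mysqld)/maps from a shell shows whether the shared object is actually mapped into the running server process.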
| Comment by Steven Irwin [ 2018-07-31 ] |
| Startup log: This is a production web server with moderate traffic levels. The memory leak is slower since I removed the clamav and spamassassin services, as I noticed both were faulting as memory decreased. Being a production server, it is not easy to debug. I did notice that log_warnings was set to 2 by default after the upgrade and there were quite a few aborted connections. Setting the log level to 1 stopped these (as suggested by another post; I suspect they are still happening, but as this is a VPS with the database colocated I am not sure why). I will keep looking at logs and testing. |
| Comment by Steven Irwin [ 2018-07-31 ] |
| This is free -m after reboot: total used free shared buffers cached I will post again when memory is depleted. |
| Comment by Evgenij [ 2018-09-06 ] |
| Hello. I ran into the same problem. I have a server based on CloudLinux 6 (which is CentOS 6). I switched from MySQL 5.6 to MariaDB 10.2.17 and then noticed increased consumption of RAM. I found a fairly simple way to reproduce the problem: if the size of the database exceeds the amount of RAM, it is enough to run the mysqltuner.pl script 3-5 times, and the amount of memory consumed becomes much larger than the limits specified in my.cnf. For example, the output of mysqltuner.pl looks like this: However, if you run the script several times, the consumption of RAM increases to 1.1 gigabytes (with a limit of 562 megabytes). If you set the malloc-lib parameter to jemalloc, the memory consumption does not exceed 300-400 megabytes, even if I run the tests more than 10 times. I hope this information helps. If you need any additional information, please let us know. |
| Comment by Dmitri [ 2018-12-11 ] |
| Same problem here; when will this be resolved? |
| Comment by YURII KANTONISTOV [ 2018-12-15 ] |
| Reproducible on a CentOS 7.4 + MariaDB 10.2.19 lab server, same recipe as in Evgenij's comment: > I found a fairly simple way to reproduce the problem. If the size of the database exceeds the amount of RAM, then it is /etc/my.cnf: After a few mysqltuner.pl runs like this: -bash-4.2$ perl mysqltuner.pl -host localhost -user mysql -pass xxxx ===> top: -bash-4.2$ free -gh MariaDB [performance_schema]> SELECT ( @@key_buffer_size + @@query_cache_size + @@tmp_table_size + @@innodb_buffer_pool_size + @@innodb_log_buffer_size + More logs etc. can be provided if somebody from the MariaDB team investigates it now. |
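The SELECT above is truncated in this export; it resembles the usual mysqltuner-style upper-bound estimate of mysqld memory. A complete sketch (my reconstruction of a commonly used query, not necessarily the exact statement that was run) is:

```sql
-- Rough upper bound: global buffers plus per-connection buffers
-- multiplied by max_connections. The real peak can exceed this when
-- the allocator fragments, which is what this ticket is about.
SELECT ( @@key_buffer_size + @@query_cache_size + @@tmp_table_size
       + @@innodb_buffer_pool_size + @@innodb_log_buffer_size
       + @@max_connections * ( @@read_buffer_size + @@read_rnd_buffer_size
                             + @@sort_buffer_size + @@join_buffer_size
                             + @@thread_stack )
       ) / 1024 / 1024 AS max_memory_MB;
```

Comparing this figure with the RSS/VIRT seen in top is what the reporters above are doing when they say consumption exceeds the configured limits.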
| Comment by tawool [ 2018-12-21 ] |
| I upgraded from 10.1 to 10.2 and saw this phenomenon. Fixed by changing to jemalloc: vi /etc/systemd/system/mariadb.service.d/migrated-from-my.cnf-settings.conf systemctl stop mariadb |
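The contents of the edited drop-in were lost in this export; the override that is typically used (a sketch; the filename and the library path are assumptions and vary by distribution and jemalloc version) is:

```ini
# /etc/systemd/system/mariadb.service.d/jemalloc.conf (hypothetical path)
[Service]
Environment="LD_PRELOAD=/usr/lib64/libjemalloc.so.1"
```

followed by systemctl daemon-reload and systemctl restart mariadb. Under systemd, [mysqld_safe] options such as malloc-lib are ignored, which is why the LD_PRELOAD route is needed here.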
| Comment by Sergei Golubchik [ 2018-12-22 ] |
| tawool, you must've hit |
| Comment by Frank [ 2018-12-22 ] |
| Same issue after upgrading a Galera Cluster from 10.1.37 to 10.2.19 on CentOS 7.5. Adding jemalloc 3.6.0 back on all nodes fixed the issue. Running stable for 2 weeks now. /etc/systemd/system/mariadb.service.d/jemalloc.conf |
| Comment by tawool [ 2018-12-24 ] |
| serg, we do not use TokuDB. Only InnoDB is in use. It was okay before upgrading to 10.2. |
| Comment by Evgenij [ 2018-12-24 ] |
| serg, thanks for the info, but we don't use TokuDB either. All our servers use only InnoDB. |
| Comment by YURII KANTONISTOV [ 2018-12-24 ] |
| "InnoDB only" here as well. |
| Comment by Kris Shannon [ 2019-01-04 ] |
| No TokuDB. Using MariaDB-server-10.2.21-1.el7.centos.x86_64 from http://yum.mariadb.org/10.2/centos7-amd64 Running under CloudLinux 7 with cPanel. We added the systemd LD_PRELOAD=/usr/lib64/libjemalloc.so.1 snippet and the leak seems to have disappeared. |
| Comment by Sergei Golubchik [ 2019-01-04 ] |
| Two thoughts. "We don't use TokuDB" does not necessarily mean that no TokuDB is installed. Is it visible in SHOW PLUGINS? Is it present in select * from mysql.plugin? Is the MariaDB-10.2.18-centos73-x86_64-tokudb-engine package installed? If TokuDB is truly not present at all, then it might be that just the pattern of allocations and deallocations that MariaDB uses (in your application) causes huge memory fragmentation with ptmalloc (which is what glibc uses), and jemalloc, simply by being a different memory allocator, doesn't exhibit this behavior. In that case this behavior will be very difficult to reproduce and might depend on the environment, OS, load, and whatnot. |
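The three checks Sergei suggests, written out as statements (a sketch; the package name is taken from his comment):

```sql
-- 1. Is TokuDB loaded at all?
SHOW PLUGINS;
-- 2. Was it ever explicitly installed via INSTALL SONAME / INSTALL PLUGIN?
SELECT * FROM mysql.plugin;
```

and from a shell, rpm -qa | grep -i tokudb tells whether the tokudb-engine package is present on an RPM-based system.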
| Comment by Evgenij [ 2019-01-04 ] |
| I don't have TokuDB. The server uses CloudLinux 6 (based on CentOS 6). |
| Comment by YURII KANTONISTOV [ 2019-01-04 ] |
| No TokuDB in the list of plugins. The worst part in our case is that /usr/lib64/libjemalloc.so.1 does not help: VIRT memory grew over two weeks from 20GB to 44GB. |
| Comment by tawool [ 2019-01-07 ] |
| "show plugins;" shows no TokuDB, and "yum list *tokudb*" shows only the installable (not installed) version. |
| Comment by tawool [ 2019-01-07 ] |
| ykantoni, what does show variables like 'version_malloc_library'; return? Is the result as shown below? version_malloc_library : jemalloc 3.6.0.0-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx |
| Comment by YURII KANTONISTOV [ 2019-01-07 ] |
| |
| Comment by YURII KANTONISTOV [ 2019-02-14 ] |
| Any idea how to fix this and avoid unexpected OOM kill events? jemalloc 3.6.0-0-g46c0af68bd248b04df75e4f92d5fb804c3d75340 on a machine with 24GB of RAM and 23GB of swap, with 19G given to InnoDB in my.cnf, no TokuDB. But the mysqld process (pid=1952) ate 44+ GB of virtual memory. |
| Comment by Michael Caplan [ 2019-03-09 ] |
| Running into a similar (maybe identical?) issue: MDEV-18866. No TokuDB. Hesitant to try jemalloc (results seem to vary). The issue became much more aggressive after adding fulltext search to numerous tables. Should my issue be closed in favour of this one? |
| Comment by Sergey Vojtovich [ 2019-03-14 ] |
| michaelcaplan, your issue is most probably different: here we have complaints mostly about CentOS, and sometimes even no InnoDB, while what you had was uneven allocation distribution between NUMA nodes and a potential memory leak in InnoDB FTS. |
| Comment by Evgenij [ 2019-04-04 ] |
| Hey. Has anyone found a solution other than using jemalloc? Could you tell me if you are using QEMU-based virtual machines? Thank you in advance. |
| Comment by Sergei Golubchik [ 2020-12-17 ] |
| Everybody here only complained about 10.2. Could it be that 10.3 is not affected? |
| Comment by YURII KANTONISTOV [ 2021-01-11 ] |
| > Could it be that 10.3 is not affected? Our DB server was upgraded 10.2.21 => 10.4.17, and to me the memory consumption pattern looks very much the same. Server version: 10.4.17-MariaDB MariaDB Server. One customer particularly struggles with this issue on MariaDB 10.2.26. They collected a set of mysqld process maps, status, and list of open files a few hours before an OOM kill and soon after the autorestart; see the attached before_after_oom_kill_maps_lsof_status.zip |
| Comment by YURII KANTONISTOV [ 2021-01-11 ] |
| Oops, forgot to attach the system log with the OOM kill event. |