[MDEV-11309] MariaDB server restarts every night (at least, sometimes more) for no apparent reason Created: 2016-11-18  Updated: 2017-01-16  Resolved: 2017-01-16

Status: Closed
Project: MariaDB Server
Component/s: OTHER
Affects Version/s: 10.1.19
Fix Version/s: 10.1.21

Type: Bug Priority: Major
Reporter: Rafael Gallastegui Assignee: Sergei Golubchik
Resolution: Fixed Votes: 0
Labels: None
Environment:

CentOS 6.8 VM 16GB memory.


Attachments: File my.cnf     File oom.txt.gz    
Issue Links:
Relates
relates to MDEV-9764 MariaDB does not limit memory used fo... Closed

 Description   

Every night (sometimes even during the day), we get a restart of the MariaDB server (the master in a master/slave replication pair). The only messages in the log are:

161117 22:46:39 mysqld_safe Number of processes running now: 0
161117 22:46:39 mysqld_safe mysqld restarted
2016-11-17 22:46:40 139737201940512 [Note] /usr/sbin/mysqld (mysqld 10.1.19-MariaDB) starting as process 10589 ...
2016-11-17 22:46:41 139737201940512 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2016-11-17 22:46:41 139737201940512 [Note] InnoDB: The InnoDB memory heap is disabled

followed by the usual start sequence. Nothing before it indicates errors or anything similar.



 Comments   
Comment by Elena Stepanova [ 2016-11-18 ]

Please check your system logs for messages from the OOM killer. If the mysqld process disappears without saying anything at all in the log, it's almost always caused by OOM.
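For anyone hitting the same symptom, the check can be scripted; the sketch below assumes the typical CentOS 6 syslog location (/var/log/messages), and `oom_hits` is a hypothetical helper name, not an existing tool:

```shell
# Minimal sketch: filter a log stream for kernel OOM-killer messages.
# /var/log/messages is the usual syslog destination on CentOS 6;
# adjust the path for your system.
oom_hits() {
    grep -i -E 'out of memory|oom-killer|killed process' "$@"
}

# Typical usage:
#   oom_hits /var/log/messages
#   dmesg | oom_hits
```

If either command prints lines mentioning mysqld being killed, the restarts are the kernel reclaiming memory, not a server crash.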

Comment by Rafael Gallastegui [ 2016-11-20 ]

Elena,

Thanks for the quick response. That was the issue. I'm investigating which settings to change to reduce memory usage. I guess this JIRA can be closed based on that.

Comment by Elena Stepanova [ 2016-11-20 ]

Thanks for the update. I'm closing it for now, but please comment to re-open if during your investigation you find out that MariaDB might have a memory leak.

Comment by Rafael Gallastegui [ 2016-12-01 ]

Hi,

I'm attaching the output of a tool that runs the following every minute:

#!/bin/sh
# Appends a snapshot of memory usage and MariaDB state on each run
OUT_FILE=/root/oom.out
 
/bin/date >> ${OUT_FILE}
/usr/bin/free >> ${OUT_FILE}
ps aux | grep mysqld | grep -v grep >> ${OUT_FILE}
mysql -uroot -p<password> -e "SHOW FULL PROCESSLIST" >> ${OUT_FILE}
mysql -uroot -p<password> -e "SHOW ENGINE INNODB STATUS\G" >> ${OUT_FILE}

The file contains the data captured between two crashes due to OOM.

Reducing innodb_buffer_pool_size from 12G to 10G helped a little, but I'm still running out of memory.
I see the process's memory grow until "ps aux" reports a footprint larger than main memory (16GB on our server).
We have about 221354 tables on the server, most of them used by 3 warehouse databases (and the number keeps growing).
I have modified the /etc/security/limits.conf file to add these lines:

# For MariaDB
*               soft    nofile          1024000
*               hard    nofile          1024000

(I figure our problem might be related to the number of tables; that's why I mention all those changes.)

This server is the master to a replication slave that has 32GB of memory (I know replication is running into issues, but I was waiting for the master to stabilize first. I don't know whether the slave's replication process might affect this server).

Thanks for the help!

Comment by Rafael Gallastegui [ 2016-12-05 ]

Another clue:

  • I have managed to reduce the frequency of OOM crashes by running "FLUSH TABLES" every 30 minutes.

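For reference, such a workaround can be automated with cron; this crontab entry is a sketch (the -p<password> placeholder is kept from the monitoring script above):

```
# Run FLUSH TABLES every 30 minutes to release cached table handlers
*/30 * * * * mysql -uroot -p<password> -e 'FLUSH TABLES'
```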
Comment by Elena Stepanova [ 2016-12-05 ]

In this case, maybe you could try reducing table_open_cache, which is currently set to the quite large value of 150000.
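A hedged sketch of what that change might look like in my.cnf; the 50000 figure is only an example starting point, not a recommendation:

```
[mysqld]
# Each cached open table costs memory; with ~220k tables a smaller
# cache trades table-reopen overhead for a bounded memory footprint.
table_open_cache = 50000
```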

Comment by Rafael Gallastegui [ 2016-12-06 ]

I have reduced the table_open_cache to 50000 and have been monitoring the system since yesterday.

Is there a way to calculate ahead of time how much memory will MariaDB use as a maximum? I mean, it's kind of hard to know what value to use for some parameters, including table_open_cache, just by trial and error, especially in a production environment. table_open_cache could take days to crawl up until the system runs out of memory. I guess I'm concerned that I report to my management that the issue has been solved and find out later on that I get an OOM because another parameter pushed MariaDB to use more memory.

Comment by Rafael Gallastegui [ 2016-12-06 ]

By the way, we appreciate the time you've spent helping us!

Comment by Sergei Golubchik [ 2017-01-16 ]

In MDEV-9764 we have now implemented a way to limit MariaDB memory usage per connection. Together with the other limits (for global buffers) and a limit on the maximum number of connections, this allows you to cap all memory used by MariaDB Server.
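For future readers, a sketch of what such a configuration might look like; the variable name max_session_mem_used is my understanding of what MDEV-9764 introduced in 10.1.21, and all figures are illustrative assumptions:

```
[mysqld]
# Cap per-connection memory (variable from MDEV-9764; the name and
# values here are illustrative assumptions, not tested settings)
max_session_mem_used = 100M
max_connections      = 150
```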

Generated at Thu Feb 08 07:48:57 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.