[MDEV-18866] Chewing Through Swap Created: 2019-03-09  Updated: 2022-07-08

Status: Open
Project: MariaDB Server
Component/s: OTHER
Affects Version/s: 10.2.14
Fix Version/s: 10.2

Type: Bug Priority: Major
Reporter: Michael Caplan Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Linux 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux


Attachments: PNG File Selection_144.png     PNG File Selection_145.png     PNG File available-mem.png     PNG File feeding-swap.png     Text File globalstatus.txt     Text File globalvars.txt     Text File innodbstatus.txt     File my.cnf     PNG File swap.png    
Issue Links:
Relates
relates to MDEV-6319 MariaDB 10.x gets killed by CentOS oo... Closed
relates to MDEV-15344 Huge memory usage on Maria 10.2.x cPanel Open

 Description   

I'm running 10.2.14, with roughly 300GB of data (1000K +/- tables). 95% of the tables are InnoDB. I have 64GB RAM, with the InnoDB buffer pool size set to 46GB (full my.cnf attached). The OS is Ubuntu 16.04.4. This is a dedicated MariaDB server.

I have swappiness set to 0.
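As a sanity check, the value actually in effect can be read back from procfs. This is a hypothetical sketch (Linux only; the `99-swappiness.conf` filename is an arbitrary choice):

```shell
# Confirm the swappiness value currently in effect (Linux only)
if [ -r /proc/sys/vm/swappiness ]; then
  cat /proc/sys/vm/swappiness   # prints the runtime value, e.g. 0
fi

# To pin it persistently (root required), something like:
#   echo 'vm.swappiness = 0' | sudo tee /etc/sysctl.d/99-swappiness.conf
#   sudo sysctl --system
```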

I'm suffering from MariaDB 10.2.14 ripping through allocated memory and overflowing into swap. innodb_numa_interleave=1 did not solve the problem. Actually, the problem has become even more aggressive recently: ever since enabling full-text search on thousands of tables, MariaDB has become far more aggressive in over-consuming memory. I'm faced with either hitting the OOM Killer, feeding it by creating more swap, or restarting MariaDB every other day. Of course, all three scenarios are unsustainable.

Not sure if this is related to the indicated issues, or where to take this issue from here.

As you can see, on Mar 7 I grabbed the following numa_maps summary to see how unbalanced memory might be when swap was eaten up:

N0        :      7861090 ( 29.99 GB)
N1        :      5629622 ( 21.48 GB)
active    :     10546055 ( 40.23 GB)
anon      :     13441950 ( 51.28 GB)
dirty     :     13176616 ( 50.26 GB)
kernelpagesize_kB:         3628 (  0.01 GB)
mapmax    :          244 (  0.00 GB)
mapped    :        66876 (  0.26 GB)
swapcache :       265334 (  1.01 GB)
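A summary like the one above can be produced by totalling the `key=value` fields of `/proc/<pid>/numa_maps`. The sketch below is hypothetical (the `summarise_numa` helper is mine, and it assumes 4 KiB pages when converting counts to GB, matching the figures above):

```shell
# Sum every key=value field across all numa_maps lines and print totals.
# On a live system: summarise_numa /proc/$(pidof mysqld)/numa_maps
summarise_numa() {
  awk '
    {
      for (i = 1; i <= NF; i++)
        # fields look like N0=12345, anon=678, dirty=90, ...
        if (split($i, kv, "=") == 2)
          sum[kv[1]] += kv[2]
    }
    END {
      for (k in sum)
        # assume 4 KiB pages; divide by 2^30 to get the GB figures above
        printf "%-10s: %12d (%6.2f GB)\n", k, sum[k], sum[k] * 4096 / 1073741824
    }
  ' "$@"
}

# Example with a fabricated numa_maps line:
printf '7f0000000000 default anon=1024 dirty=512 N0=768 N1=256\n' | summarise_numa
```

Note that `kernelpagesize_kB` fields are already in KiB, so the GB column is only meaningful for the page-count keys (N0, N1, anon, dirty, etc.).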

Thanks,

Mike



 Comments   
Comment by Robert Bindar [ 2019-03-13 ]

Hi michaelcaplan!

Is the numa_maps summary taken before or after the innodb_numa_interleave option was set?
If it is before, do you have any summary taken after the option was set?

Thanks,
Robert

Comment by Michael Caplan [ 2019-03-13 ]

Hi Robert,

That would be after.

Comment by Robert Bindar [ 2019-03-13 ]

Excellent! And just to be sure, was the summary taken after you enabled full-text search on the tables?

Can you try running the server using numactl and getting a numa_maps dump after that? (numactl --interleave=all mysqld)
It may fix the problem.
Sergey suggested this as a way to interleave the entire memory allocated by the server process; InnoDB's own policy may only interleave the memory allocated for the buffer pool.

Comment by Sergey Vojtovich [ 2019-03-13 ]

Note that it is supported by mysqld_safe --numa-interleave if you're using it. There should be an example in the MariaDB systemd service file as well.

Comment by Michael Caplan [ 2019-03-13 ]

Thanks folks.

Yes, the summary was taken after enabling fulltext search.

To be sure, looking at this knowledge base article, the recommendation is then, in a drop-in .conf file, to redefine ExecStart as follows:

[Service]
 
ExecStart=/usr/bin/numactl --interleave=all /usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION
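For reference, a sketch of how such a drop-in could be installed (the unit name `mariadb.service` and the file name `numa.conf` are assumptions; adjust for your distribution). Note that systemd requires an empty `ExecStart=` line to clear the packaged value before redefining it:

```shell
# Install a drop-in override for the MariaDB unit (root required)
sudo mkdir -p /etc/systemd/system/mariadb.service.d
sudo tee /etc/systemd/system/mariadb.service.d/numa.conf >/dev/null <<'EOF'
[Service]
# Clear the packaged ExecStart before redefining it
ExecStart=
ExecStart=/usr/bin/numactl --interleave=all /usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION
EOF
sudo systemctl daemon-reload
sudo systemctl restart mariadb
```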

Comment by Robert Bindar [ 2019-03-14 ]

Either that or what Sergey and I proposed above should do. Make sure you get a numa_maps dump and post it here. Thanks a lot, Michael.

Comment by Robert Bindar [ 2019-03-25 ]

Hi michaelcaplan! Any news on this? Did any of the above tricks solve your issue?

Comment by Sergey Vojtovich [ 2019-04-30 ]

Closing as no feedback has been provided in over a month. Feel free to reopen if the problem is still reproducible.

Comment by Michael Caplan [ 2019-04-30 ]

Sorry for not providing updates on this in some time. Just over a month ago, we dropped all our FTS indexes. This radically stabilised our swap usage. That said, we are not in the clear.

I have two identical servers (one master, one slave). Slave is running with numactl, master is not.

The slave running with numactl was behaving quite well until around a week ago. I'm not sure what triggered the daily dipping into swap:

The master running without numactl has been steadily chewing through swap.

Both do have `innodb_numa_interleave=1` set.

Generated at Thu Feb 08 08:47:19 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.