  MariaDB Server / MDEV-29097

10.8.3 seems to be using a lot more swap memory, always increasing (every time mariabackup runs daily)

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not a Bug
    • Affects Version/s: 10.8.3
    • Fix Version/s: N/A
    • Component/s: Server
    • Labels: None

    Description

      I don't remember having this issue with 10.5, as although it was using a good amount of swap memory, I never had to do an emergency database restart because of it.

      For the past few days I have been receiving alerts about too much swap being used on the server, and it has been getting worse. I had to restart MariaDB yesterday, because swap was getting full.

      Before the restart, this was the usage:

      Resident RAM:
      mariadbd – 89387556 (85.25 GB)

      Swap:
      mariadbd – 26975616 (25.73 GB)

      And by the way, my server's sysctl has this config:

      vm.swappiness=1
      (which tells the kernel to swap only when absolutely necessary; note that even vm.swappiness=0 does not fully disable swap on recent kernels)
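For reference, this is how such a setting is typically inspected and persisted (a generic sysctl sketch, not taken from this report; the drop-in file name is arbitrary):

```shell
# Check the current value
sysctl vm.swappiness

# Apply at runtime (requires root)
sysctl -w vm.swappiness=1

# Persist across reboots via a sysctl drop-in file
echo 'vm.swappiness = 1' > /etc/sysctl.d/99-swappiness.conf
sysctl --system
```

Note that on RHEL 8 with cgroup v1, per-cgroup swappiness can override this global value, which is what the Red Hat solution linked later in this ticket is about.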

      My innodb_buffer_pool_size is 80G.

      Are you aware of any reason so much Swap would be used by MariaDB?

      There was plenty of resident free RAM that could be used instead.
      I understand that some Swap can be used, but I don't understand why so much Swap, instead of resident RAM.

      Since the restart yesterday, it's already using 2GB swap, and increasing.
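To see which processes actually hold the swap, the per-process VmSwap counter in /proc can be summed (a diagnostic sketch using standard Linux procfs fields, not part of the original report):

```shell
# Swap usage in kB per process, largest first,
# read from the VmSwap field of /proc/<pid>/status
for f in /proc/[0-9]*/status; do
    awk '/^Name:/ {name=$2} /^VmSwap:/ {print $2 "\t" name}' "$f"
done 2>/dev/null | sort -rn | head
```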

      Thank you.

      Attachments

        1. screenshot-1.png (14 kB)
        2. screenshot-2.png (18 kB)
        3. screenshot-3.png (15 kB)
        4. screenshot-4.png (28 kB)
        5. screenshot-5.png (88 kB)
        6. screenshot-6.png (24 kB)
        7. screenshot-7.png (16 kB)

        Activity

          nunop Nuno added a comment -

          Cheers!

          Yeah, based on this link (from one of my previous replies) - https://access.redhat.com/solutions/6785021

          They say that the "right/best" thing to do is to start using cgroup v2.
          Only later, in the second link I sent, do they mention the new sysctl option.

          But yeah, I agree that this is a bug/issue with Red Hat, and not MariaDB, so I'm happy with you not having to document anything, as it is an OS issue, and quite specific.
          Eventually they should make cgroup v2 the default on new versions of RHEL, so...

          Thanks!


          marko Marko Mäkelä added a comment -

          nunop:

          The strange thing to me is that swap increases a lot while "rsync" is running

          Adding some calls to posix_fadvise() could help the Linux kernel to avoid polluting the file system cache with large files that are not going to be accessed any time soon. I encountered https://bugzilla.redhat.com/show_bug.cgi?id=841076 but did not check the current rsync source code.

          There might also be an option (some LD_PRELOAD library "shim" similar to libeatmydata.so) that would inject some posix_fadvise() calls at suitable places. Yet another option might be to patch rsync to use O_DIRECT, but that would require all file accesses and memory buffers to be aligned with the underlying physical block size (typically 512 or 4096 bytes).
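The LD_PRELOAD shim described above exists as the third-party `nocache` tool, which intercepts open()/close() and issues posix_fadvise(POSIX_FADV_DONTNEED). The sketch below is illustrative only; it assumes GNU coreutils dd and a `nocache` installation, and the paths are made up:

```shell
# Run rsync without polluting the page cache
# (assumes the third-party 'nocache' LD_PRELOAD shim is installed)
nocache rsync -a /srv/backup/ /mnt/offsite/

# With GNU dd alone, already-cached pages of a file can be dropped
# after the fact: iflag=nocache with count=0 advises the kernel to
# drop the cache for the whole file
dd if=/srv/backup/large.ibd iflag=nocache count=0 2>/dev/null
```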


          Richard Richard Stracke added a comment -

          Another idea:

          AnonHugePages (transparent hugepages) are supposed to benefit applications without any configuration, and transparent hugepages are enabled by default:
          https://access.redhat.com/solutions/46111

          "[THP is] intended to bring hugepage support automatically to applications, without requiring custom configuration. Transparent hugepage support works by scanning memory mappings in the background (via the "khugepaged" kernel thread), attempting to find or create (by moving memory around) contiguous 2MB ranges of 4KB mappings, that can be replaced with a single hugepage."

          But this can sometimes backfire:

          "If an application maps a large range but only touches the first few bytes, it would traditionally consume only a single 4KB page of physical memory. With THP enabled, khugepaged can come and extend that 4KB page into a 2MB page, effectively bloating memory usage by 512x. (An example reproducer on this bug report actually demonstrates the 512x worst case!)"

          https://blog.nelhage.com/post/transparent-hugepages/
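Whether THP is in play on a given host can be checked directly (a diagnostic sketch; the paths are the standard Linux sysfs/procfs locations):

```shell
# Current THP policy; the bracketed word is the active mode,
# e.g. "[always] madvise never"
cat /sys/kernel/mm/transparent_hugepage/enabled 2>/dev/null

# How much anonymous memory is currently backed by huge pages
grep AnonHugePages /proc/meminfo

# To restrict THP to madvise()-requesting applications at runtime
# (root required, not persistent across reboots):
#   echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
```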

          nunop Nuno added a comment -

          Guys,
          This Issue can probably be closed.

          Since I'm using vm.force_cgroup_v2_swappiness=1 (added in the latest version of RHEL 8 / AlmaLinux 8), this is no longer an issue for me.

          It does eventually get to a lot of RAM used, but at least it takes months to get there, rather than once every 1-2 weeks!
          But also, I'm likely "overusing" the RAM available anyway (in terms of calculated max possible RAM used), and the server has a lot more than just MariaDB, so it's likely not MariaDB's fault here.

          As I said, with the sysctl option above I'm no longer having this issue, so I'm happy!

          Thank you very much!
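For anyone hitting the same symptom: vm.force_cgroup_v2_swappiness is a RHEL-specific kernel knob (not in mainline Linux) that, per the Red Hat solution linked earlier, makes cgroup-v1 memory cgroups honor the global vm.swappiness value instead of their own default of 60. A persistent-configuration sketch (the drop-in file name is arbitrary):

```shell
# /etc/sysctl.d/99-cgroup-swappiness.conf
# Requires a RHEL 8 / AlmaLinux 8 kernel that ships this knob.
vm.swappiness = 1
vm.force_cgroup_v2_swappiness = 1
```

Apply with `sysctl --system`; on kernels without the knob, the second line is rejected with an "unknown key" error.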


          marko Marko Mäkelä added a comment -

          nunop, this ticket has already been closed as "not a (MariaDB) bug". Thank you for your update.


          People

            Assignee: danblack Daniel Black
            Reporter: nunop Nuno
            Votes: 0
            Watchers: 6

