  MariaDB Server / MDEV-29097

10.8.3 seems to be using a lot more swap memory, always increasing (every time mariabackup runs daily)

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not a Bug
    • Affects Version/s: 10.8.3
    • Fix Version/s: N/A
    • Component/s: Server
    • Labels: None

    Description

      I don't remember having this issue with 10.5, as although it was using a good amount of swap memory, I never had to do an emergency database restart because of it.

      For the past few days I have been receiving alerts about too much swap being used on the server, and it has been getting worse. I had to restart MariaDB yesterday, because swap was getting full.

      Before the restart, this was the usage:

      Resident RAM:
      mariadbd – 89387556 (85.25 GB)

      Swap:
      mariadbd – 26975616 (25.73 GB)

      And by the way, my server's sysctl has this config:

      vm.swappiness=1
      (which tells the kernel to swap only when absolutely necessary; note that even vm.swappiness=0 does not fully disable swap on recent kernels)
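For reference, this is how such a setting is typically inspected and persisted (a generic sysctl sketch, not taken from this report; the drop-in file name is arbitrary):

```shell
# Check the current value
sysctl vm.swappiness

# Apply at runtime (requires root)
sysctl -w vm.swappiness=1

# Persist across reboots via a sysctl drop-in file
echo 'vm.swappiness = 1' > /etc/sysctl.d/99-swappiness.conf
sysctl --system
```

Note that on RHEL 8 with cgroup v1, per-cgroup swappiness can override this global value, which is what the Red Hat solution linked later in this ticket is about.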

      My innodb_buffer_pool_size is 80G.

      Are you aware of any reason so much Swap would be used by MariaDB?

      There was plenty of resident free RAM that could be used instead.
      I understand that some Swap can be used, but I don't understand why so much Swap, instead of resident RAM.

      Since the restart yesterday, it's already using 2GB swap, and increasing.
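To see which processes actually hold the swap, the per-process VmSwap counter in /proc can be summed (a diagnostic sketch using standard Linux procfs fields, not part of the original report):

```shell
# Swap usage in kB per process, largest first,
# read from the VmSwap field of /proc/<pid>/status
for f in /proc/[0-9]*/status; do
    awk '/^Name:/ {name=$2} /^VmSwap:/ {print $2 "\t" name}' "$f"
done 2>/dev/null | sort -rn | head
```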

      Thank you.

      Attachments

        1. screenshot-1.png (14 kB)
        2. screenshot-2.png (18 kB)
        3. screenshot-3.png (15 kB)
        4. screenshot-4.png (28 kB)
        5. screenshot-5.png (88 kB)
        6. screenshot-6.png (24 kB)
        7. screenshot-7.png (16 kB)

        Activity

          nunop Nuno added a comment -

          Cheers!

          Yeah, based on this link (from one of my previous replies) - https://access.redhat.com/solutions/6785021

          They say that the "right/best" thing to do is to start using cgroup v2.
          Only later, in the second link I sent, do they mention the new sysctl option.

          But yeah, I agree that this is a bug/issue with Red Hat, and not MariaDB, so I'm happy with you not having to document anything, as it is an OS issue, and quite specific.
          Eventually they should make cgroup v2 the default on new versions of RHEL, so...

          Thanks!


          marko Marko Mäkelä added a comment -

          nunop:

          The strange thing to me is that swap increases a lot while "rsync" is running

          Adding some calls to posix_fadvise() could help the Linux kernel to avoid polluting the file system cache with large files that are not going to be accessed any time soon. I encountered https://bugzilla.redhat.com/show_bug.cgi?id=841076 but did not check the current rsync source code.

          There might also be an option (some LD_PRELOAD library "shim" similar to libeatmydata.so) that would inject some posix_fadvise() calls at suitable places. Yet another option might be to patch rsync to use O_DIRECT, but that would require all file accesses and memory buffers to be aligned with the underlying physical block size (typically 512 or 4096 bytes).
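The LD_PRELOAD shim described above exists as the third-party `nocache` tool, which intercepts open()/close() and issues posix_fadvise(POSIX_FADV_DONTNEED). The sketch below is illustrative only; it assumes GNU coreutils dd and a `nocache` installation, and the paths are made up:

```shell
# Run rsync without polluting the page cache
# (assumes the third-party 'nocache' LD_PRELOAD shim is installed)
nocache rsync -a /srv/backup/ /mnt/offsite/

# With GNU dd alone, already-cached pages of a file can be dropped
# after the fact: iflag=nocache with count=0 advises the kernel to
# drop the cache for the whole file
dd if=/srv/backup/large.ibd iflag=nocache count=0 2>/dev/null
```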


          Richard Richard Stracke added a comment -

          Another idea:

          AnonHugePages (transparent hugepages) are supposed to benefit applications without any configuration, and transparent hugepages are enabled by default:
          https://access.redhat.com/solutions/46111

          "[THP is] intended to bring hugepage support automatically to applications, without requiring custom configuration. Transparent hugepage support works by scanning memory mappings in the background (via the "khugepaged" kernel thread), attempting to find or create (by moving memory around) contiguous 2MB ranges of 4KB mappings, that can be replaced with a single hugepage."

          But this can sometimes backfire:

          "If an application maps a large range but only touches the first few bytes, it would traditionally consume only a single 4KB page of physical memory. With THP enabled, khugepaged can come and extend that 4KB page into a 2MB page, effectively bloating memory usage by 512x. (An example reproducer on this bug report actually demonstrates the 512x worst case!)"

          https://blog.nelhage.com/post/transparent-hugepages/
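Whether THP is in play on a given host can be checked directly (a diagnostic sketch; the paths are the standard Linux sysfs/procfs locations):

```shell
# Current THP policy; the bracketed word is the active mode,
# e.g. "[always] madvise never"
cat /sys/kernel/mm/transparent_hugepage/enabled 2>/dev/null

# How much anonymous memory is currently backed by huge pages
grep AnonHugePages /proc/meminfo

# To restrict THP to madvise()-requesting applications at runtime
# (root required, not persistent across reboots):
#   echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
```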

          nunop Nuno added a comment -

          Guys,
          This Issue can probably be closed.

          Since I'm using vm.force_cgroup_v2_swappiness=1 (added in the latest version of RHEL 8 / AlmaLinux 8), this is no longer an issue for me.

          It does eventually get to a lot of RAM used, but at least it takes months to get there, rather than once every 1-2 weeks!
          But also, I'm likely "overusing" the RAM available anyway (in terms of calculated max possible RAM used), and the server has a lot more than just MariaDB, so it's likely not MariaDB's fault here.

          As I said, with the sysctl option above I'm no longer having this issue, so I'm happy!

          Thank you very much!
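For anyone hitting the same symptom: vm.force_cgroup_v2_swappiness is a RHEL-specific kernel knob (not in mainline Linux) that, per the Red Hat solution linked earlier, makes cgroup-v1 memory cgroups honor the global vm.swappiness value instead of their own default of 60. A persistent-configuration sketch (the drop-in file name is arbitrary):

```shell
# /etc/sysctl.d/99-cgroup-swappiness.conf
# Requires a RHEL 8 / AlmaLinux 8 kernel that ships this knob.
vm.swappiness = 1
vm.force_cgroup_v2_swappiness = 1
```

Apply with `sysctl --system`; on kernels without the knob, the second line is rejected with an "unknown key" error.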


          marko Marko Mäkelä added a comment -

          nunop, this ticket has already been closed as "not a (MariaDB) bug". Thank you for your update.


          People

            Assignee: danblack Daniel Black
            Reporter: nunop Nuno
            Votes: 0
            Watchers: 6

