MariaDB Server / MDEV-36197

Implement Buffer Pool Auto-Scaling Based on RAM Availability

Description
Enable the buffer pool to adjust innodb_buffer_pool_size based on available RAM, targeting roughly 75-80% of RAM by default, with configurable scaling factors and limits for stability. See https://jira.mariadb.org/browse/MDEV-36194 for more information on the use case.

Tasks

1. Implement periodic RAM detection (e.g., via syscalls or OS notifications).
2. Add system variables:
   • buffer_pool_auto_scale_method = {OFF, RAM_percentage} (default: OFF)
   • buffer_pool_auto_scale_factor = N (default: 0.75, range: 0.1–1.0)
   • buffer_pool_min_size (default: 128MB) and buffer_pool_max_size (default: unlimited)
3. Set innodb_buffer_pool_size = RAM * buffer_pool_auto_scale_factor, respecting the limits (a sketch of tasks 1 and 3 follows this list).
4. Reject manual SET GLOBAL innodb_buffer_pool_size when auto-scaling is active.
5. Make the polling frequency configurable (e.g., once per second or once per minute), defaulting to once per minute.
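A minimal sketch of how tasks 1 and 3 might fit together, assuming a Linux host. The struct, the function names, and the use of sysinfo(2) are illustrative assumptions for exposition, not actual MariaDB/InnoDB identifiers; a real implementation would run this from a timer at the configured polling frequency.

{code:cpp}
// Sketch only: periodic RAM detection and target size computation.
// Assumes Linux; all names are illustrative, not server identifiers.
#include <sys/sysinfo.h>
#include <algorithm>
#include <cstdint>

// Hypothetical settings mirroring the proposed system variables.
struct auto_scale_settings {
  double factor = 0.75;              // buffer_pool_auto_scale_factor
  uint64_t min_size = 128ULL << 20;  // buffer_pool_min_size (128MB)
  uint64_t max_size = UINT64_MAX;    // buffer_pool_max_size ("unlimited")
};

// Detect total RAM via sysinfo(2); a real implementation might instead
// read /proc/meminfo or subscribe to OS notifications.
static uint64_t detect_total_ram()
{
  struct sysinfo si;
  if (sysinfo(&si) != 0)
    return 0;                        // detection failed; skip this cycle
  return uint64_t(si.totalram) * si.mem_unit;
}

// Compute the new innodb_buffer_pool_size, respecting the limits
// (assumes min_size <= max_size, enforced by option validation).
static uint64_t target_buffer_pool_size(const auto_scale_settings &s)
{
  const uint64_t ram = detect_total_ram();
  if (ram == 0)
    return 0;
  const uint64_t target = uint64_t(double(ram) * s.factor);
  return std::clamp(target, s.min_size, s.max_size);
}
{code}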

Acceptance Criteria

1. buffer_pool_auto_scale_method=RAM_percentage activates RAM-based scaling.
2. Example: with 16GB RAM and buffer_pool_auto_scale_factor=0.75, innodb_buffer_pool_size becomes 12GB.
3. buffer_pool_auto_scale_factor is configurable in the range 0.1–1.0.
4. The computed size respects buffer_pool_min_size and buffer_pool_max_size.
5. Manual SET GLOBAL innodb_buffer_pool_size fails with an error while auto-scaling is active.
6. The auto-scaling system variables themselves remain manually configurable.

Activity

Marko Mäkelä added a comment:

I think that this depends on MDEV-29445, which is currently under review. The current implementation includes a fix of MDEV-34863, which revises the handling of Linux memory pressure events. With that revision, the interface would be disabled by default. There would be a parameter specifying the minimum innodb_buffer_pool_size that a memory pressure event could shrink the buffer pool to, and no mechanism for setting the buffer pool size back.
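The Linux memory pressure events mentioned here come from the kernel's PSI (pressure stall information) trigger interface. Below is a minimal standalone sketch of monitoring it, separate from any MariaDB code; the 150ms-stall-in-1s-window trigger threshold is an arbitrary illustration (see Documentation/accounting/psi.rst in the kernel sources).

{code:cpp}
// Sketch: register a PSI memory pressure trigger and wait for events.
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>

int main()
{
  int fd = open("/proc/pressure/memory", O_RDWR | O_NONBLOCK);
  if (fd < 0)
    return 1;                    // PSI unavailable (e.g., pre-5.2 kernel)

  // Trigger: notify when "some" tasks stall >= 150ms within a 1s window.
  const char trig[] = "some 150000 1000000";
  if (write(fd, trig, strlen(trig) + 1) < 0)
    return 1;

  struct pollfd pfd = { fd, POLLPRI, 0 };
  while (poll(&pfd, 1, -1) > 0) {
    if (pfd.revents & POLLERR)
      break;                     // the monitored file went away
    if (pfd.revents & POLLPRI)
      fprintf(stderr, "memory pressure event: consider shrinking\n");
  }
  close(fd);
  return 0;
}
{code}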

Marko Mäkelä added a comment:

When it comes to the parameters, I think that in addition to the innodb_buffer_pool_size_max that is part of MDEV-29445 and the "minimum" that is part of MDEV-34863, we would need something to enable the "autoextend" or to specify a maximum size for it. I think that we would also want the InnoDB buf_pool.LRU or buf_pool.free lists to be part of the "autoextend" decision: if buf_LRU_get_free_block() never needs to wait, then maybe we should not increase the current innodb_buffer_pool_size.

danblack, do you have any idea whether the memory pressure events could be monitored to notice when there is very little pressure and it would make sense to increase the buffer pool?
Daniel Black added a comment:

From everything I've seen, the very little/no pressure state needs to be decided internally; the kernel PSI interface provides an event only when there is a pressure stall. The point at which there are no free pages seems like a good place to check the time of the last event and, if it is past a threshold, claim some more pages from the OS (see the sketch below). I haven't looked for good system-equilibrium algorithms that would play well here.
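A minimal sketch of the gating idea Daniel describes, assuming a timestamp that a PSI listener would update on each pressure event; the names are hypothetical, not existing server identifiers.

{code:cpp}
// Sketch: extend only if the last pressure event is old enough.
#include <chrono>

using steady = std::chrono::steady_clock;

// Updated by the (hypothetical) PSI listener on every pressure event.
static steady::time_point last_pressure_event;

// Called from the hypothetical no-free-pages path: claim more pages
// from the OS only if the system has been quiet for long enough.
static bool should_claim_more_pages(steady::duration quiet_threshold)
{
  return steady::now() - last_pressure_event > quiet_threshold;
}
{code}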
Marko Mäkelä added a comment:

adamluciano, I have been thinking of the following parameters related to this:

• innodb_buffer_pool_size reflects the current buffer pool size. MDEV-29445 is reimplementing buffer pool size changes, fixing the crash MDEV-35485 and preventing hangs when the buffer pool size is shrunk.
• innodb_buffer_pool_size_max would be a read-only startup parameter introduced in MDEV-29445 that specifies the maximum innodb_buffer_pool_size. This is necessary because the current implementation in MDEV-29445 would allocate a single contiguous virtual address range for the buffer pool, instead of allocating multiple virtual address ranges.
• innodb_buffer_pool_size_auto_min (MDEV-34863 would introduce this on Linux; here we would extend it to all platforms) sets the minimum buffer pool size for automatic reduction (default: innodb_buffer_pool_size_max, which disables the logic).
• innodb_buffer_pool_size_auto_max (my proposal for this task) would set the maximum for the automation (default: 0, which disables the logic).
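To make the relationships between these four parameters concrete, here is an illustrative sketch of the invariants they would need to satisfy; the struct and the check are assumptions for exposition, not actual InnoDB code.

{code:cpp}
// Sketch: how the four proposed parameters would relate.
#include <cstdint>

struct buf_pool_size_params {
  uint64_t size;      // innodb_buffer_pool_size (current, dynamic)
  uint64_t size_max;  // innodb_buffer_pool_size_max (read-only, startup)
  uint64_t auto_min;  // innodb_buffer_pool_size_auto_min
  uint64_t auto_max;  // innodb_buffer_pool_size_auto_max

  bool valid() const
  {
    return size <= size_max &&
           auto_min <= size_max &&     // auto_min == size_max: shrinking off
           (auto_max == 0 ||           // auto_max == 0: extending off
            (auto_min <= auto_max && auto_max <= size_max));
  }
};
{code}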

The existence of a read-only parameter innodb_buffer_pool_size_max may conflict with what you had in mind for the suggested buffer_pool_auto_scale_factor. Because we make frequent use of heap allocation (this could be reduced in MDEV-14602 and elsewhere), the total memory usage of the server is rather unpredictable. Also, if multiple containers were running on the same system, allowing unlimited memory usage in each container would generate unpredictable conflicts between dynamically configured containers. I think that by forcing innodb_buffer_pool_size_max to be specified at startup, we would proactively prevent some trouble. I realize that Linux supports memory hot-plugging, so the available amount of RAM could change while the server is running. If we claimed support for that, it would have to be tested regularly by us, not in the field.

The dedicated buf_flush_page_cleaner() thread, which can by design observe when it would make sense to extend the buffer pool, is normally invoked once per second, or whenever another thread has run out of buffer pool space or of space in the circular ib_logfile0. https://smalldatum.blogspot.com/2022/10/reasons-for-writeback-with-update-in.html applies here; additionally, if we are lucky, some blocks can be evicted without writing them back.

I can think of the following triggering events for shrinking or extending:

• Shrink the buffer pool on a memory pressure event from Linux (a last resort to prevent an out-of-memory kill by the Linux kernel).
• Shrink after some specified time, if there have been few requests for buffer pool pages, while preserving some of the most recently accessed pages.
• Extend if buf_flush_page_cleaner() notices that we are frequently trying to evict least-recently-used pages.
• Extend more aggressively if buf_flush_LRU() is called frequently (the evicted pages need to be written out first).

Most of these can be based on discrete events rather than time; only the 'preemptive' or 'voluntary' shrinking would seem to require a parameter for the polling frequency, or a timeout after which the server is considered idle enough. We would also need a parameter or two for controlling when to extend the buffer pool; one possible event-driven reading is sketched below. I am looking forward to specific suggestions that are compatible with the existing logic.
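As one hedged reading of the extend triggers above, a once-per-second page cleaner pass could sample counters like the following. buf_flush_page_cleaner() and buf_flush_LRU() are real InnoDB functions, but the counters, thresholds, and wiring here are invented for illustration.

{code:cpp}
// Sketch: event-driven extend decision sampled once per second.
#include <atomic>
#include <cstdint>

static std::atomic<uint64_t> lru_evict_attempts{0}; // bumped on LRU-tail eviction
static std::atomic<uint64_t> lru_flush_calls{0};    // bumped by buf_flush_LRU() calls

enum class resize_hint { none, extend, extend_aggressively };

// Hypothetically called once per second from the page cleaner;
// the thresholds are made-up placeholders.
static resize_hint sample_and_decide()
{
  const uint64_t evictions = lru_evict_attempts.exchange(0);
  const uint64_t flushes   = lru_flush_calls.exchange(0);
  if (flushes > 100)    // evicted pages needed write-back first
    return resize_hint::extend_aggressively;
  if (evictions > 1000) // frequent eviction of least-recently-used pages
    return resize_hint::extend;
  return resize_hint::none;
}
{code}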

Do we need a parameter to control by how much to shrink or extend at a time? I think that we could do without one. In MDEV-34863 I would shrink halfway between the current innodb_buffer_pool_size and the specified innodb_buffer_pool_size_auto_min. We could use similar logic for extending. The automatic size changes would follow a curve with a negative exponent, a bit like ½, ¼, ⅛, …, (½)ⁿ. To give an example, if the minimum and maximum limits were 256MiB and 2048MiB and the current innodb_buffer_pool_size were 1024MiB, the first step would shrink to ½(256+1024)MiB (-384MiB) or extend to ½(2048+1024)MiB (+512MiB). The closer we get to the limit, the smaller the adjustment, until we reach the minimum adjustment size (8MiB with MDEV-29445). Would this be adequate?
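A sketch of the halving step described here, using the 8MiB minimum adjustment from MDEV-29445; the function name and units (bytes) are illustrative.

{code:cpp}
// Sketch: next buffer pool size, halfway toward the relevant limit.
#include <cstdint>

static constexpr uint64_t MIN_STEP = 8ULL << 20;   // 8MiB granularity

// Move from `current` toward `limit` (auto_min when shrinking,
// auto_max when extending): halfway, rounded down to an 8MiB
// multiple, never overshooting the limit. A real implementation
// would also need to guarantee forward progress near the limit.
static uint64_t next_size(uint64_t current, uint64_t limit)
{
  uint64_t target = (current + limit) / 2;
  target &= ~(MIN_STEP - 1);                       // round to 8MiB multiple
  if (current > limit)                             // shrinking
    return target < limit ? limit : target;
  return target > limit ? limit : target;          // extending
}

// With limits 256MiB/2048MiB and current 1024MiB, matching the example:
//   next_size(1024MiB, 256MiB)  == 640MiB  (shrink by 384MiB)
//   next_size(1024MiB, 2048MiB) == 1536MiB (extend by 512MiB)
{code}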

Adam Luciano added a comment:

Thanks for your detailed input; I really appreciate the depth of your analysis. I agree with your points, and I think your proposed parameters (innodb_buffer_pool_size_auto_min and innodb_buffer_pool_size_auto_max) fit perfectly with the story objectives. Your suggestion to use discrete events like memory pressure or page changes makes sense; they ultimately enable a more responsive system.

I also like your stepwise adjustment strategy with an exponential curve. It reminds me of the hill-climbing strategy outlined in MENT-2237, where incremental steps guide the system toward an optimal state.

The question I have is: do we need both a time-based event and a triggering event, specifically for shrinking? If we are to support a use case such as running the server in a serverless cloud setup, using time as a mechanism to shrink the buffer pool could be valuable from a cost perspective for our customers, since it relates to sensing usage of the system; it could even be used to turn the instance off when there are no active connections. This could also be overly complex, but I wanted to pose the question: what do you think along that angle?

People

Marko Mäkelä
Adam Luciano