Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-31953

madvise(frame, srv_page_size, MADV_FREE) is causing a performance regression

Details

    Description

      In MDEV-25341 (MariaDB Server 10.11.0), some calls to madvise() with MADV_FREE were added.

      If the call in buf_LRU_block_free_non_file_page() is committed out, the throughput in an I/O bound HammerDB benchmark would improve by 6% to 18%, depending on the CPU microarchitecture.

      It could be best to revert this code for now. A better solution might be doable as part of a larger refactoring (MDEV-25340, MDEV-29445). Instead of invoking madvise(MADV_FREE) on individual buffer pages, we might invoke it on larger chunks at a time. There could also be an explicit SQL statement for defragmenting or freeing the buffer pool.

      Attachments

        Issue Links

          Activity

            wlad, are you proposing to essentially revert the new 10.11 feature MDEV-25341 entirely without any replacement?

            I believe that SET GLOBAL already requires a privileged user. The new behaviour of setting the parameter to the old value can be documented. The old behaviour is not that lightweight either: it would always spawn buf_resize_task and wait for it to complete.

            marko Marko Mäkelä added a comment - wlad , are you proposing to essentially revert the new 10.11 feature MDEV-25341 entirely without any replacement? I believe that SET GLOBAL already requires a privileged user. The new behaviour of setting the parameter to the old value can be documented. The old behaviour is not that lightweight either: it would always spawn buf_resize_task and wait for it to complete.

            marko, revert without replacement, yep, that's the proposal. If we determine that a feature was determined to be not good enough, performance-wise or otherwise, why not? It does not have visible interface. The server returns to being as impolite as any other program. Maybe Linux kernel would learn to shrink process working set, instead of killing processes one day.

            wlad Vladislav Vaintroub added a comment - marko , revert without replacement, yep, that's the proposal. If we determine that a feature was determined to be not good enough, performance-wise or otherwise, why not? It does not have visible interface. The server returns to being as impolite as any other program. Maybe Linux kernel would learn to shrink process working set, instead of killing processes one day.

            danblack suggested a better solution: Make MDEV-24670 an InnoDB bug fix, so that the buf_pool_t::garbage_collect() that I added would be automatically invoked when some memory pressure is reported by the operating system. This could avoid an OOM kill without actually changing any virtual address mappings for the buffer pool.

            Let us do that, and leave the logic around SET GLOBAL innodb_buffer_pool_size alone. This will miss the scheduled quarterly releases, but will hopefully be finished in the next quarter.

            marko Marko Mäkelä added a comment - danblack suggested a better solution: Make MDEV-24670 an InnoDB bug fix, so that the buf_pool_t::garbage_collect() that I added would be automatically invoked when some memory pressure is reported by the operating system. This could avoid an OOM kill without actually changing any virtual address mappings for the buffer pool. Let us do that, and leave the logic around SET GLOBAL innodb_buffer_pool_size alone. This will miss the scheduled quarterly releases, but will hopefully be finished in the next quarter.
            danblack Daniel Black added a comment -

            Needs advice on improving reliability of the test case, but otherwise complete.

            (Hope I didn't miss anything).

            danblack Daniel Black added a comment - Needs advice on improving reliability of the test case, but otherwise complete. (Hope I didn't miss anything).

            OK to push after addressing my review comment, which includes a more stable version of the test innodb.mem_pressure.

            marko Marko Mäkelä added a comment - OK to push after addressing my review comment, which includes a more stable version of the test innodb.mem_pressure .

            People

              danblack Daniel Black
              marko Marko Mäkelä
              Votes:
              1 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.