  MariaDB Server / MDEV-15016

multiple page cleaner threads use a lot of CPU on idle server

Details

    • Affects Version: 10.1.31

    Description

      The issue is described in https://lists.launchpad.net/maria-discuss/msg04975.html as follows:
      "
      On a 4-core (4 physical, 8 logical cores), 3.5 GHz box, MariaDB 10.3.4
      typically consumes 35-40% of the available CPU even when completely idle,
      according to Task Manager. Please see the image.

      I cannot tell if previous 10.3 versions also did, as I don't have 10.3
      running constantly in the background; I just observed this now. This does
      not happen with MariaDB 10.1, MariaDB 10.2, MySQL 5.6, MySQL 5.7 and MySQL
      8.0 (all those I have starting with Windows on this box), where the
      CPU load is negligible unless the server is "doing real work". As a result,
      the CPU clock frequency is always at the highest possible 3.5 GHz while
      MariaDB 10.3.4 is running (this CPU model is a 4th-generation Intel with
      the option to scale the CPU frequency down to around 40-50% of max when
      there is no need for more).

      "

      Examination of the attached stack traces, taken from the process while it
      is in this "idle" state, shows three CPUs being occupied by what, most of
      the time, appears to be a busy loop acquiring an InnoDB mutex.
      my.ini uses all defaults, with the exception of
      innodb_buffer_pool_size=4091M

      Setting innodb_page_cleaners = 1 fixed the issue, according to the reporter.
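
      For reference, a minimal my.ini sketch combining the reported configuration
      with the workaround (the [mysqld] section name and layout are assumed; the
      two values come from the report and the sentence above):

      [mysqld]
      # the only non-default setting in the reporter's configuration
      innodb_buffer_pool_size=4091M
      # workaround that, according to the reporter, removed the idle CPU load
      innodb_page_cleaners=1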

      Relevant call stack information:

       
      . 15  Id: 2ce8.2d38 Suspend: 0 Teb: 0000005f`55863000 Unfrozen
            Start: mysqld!buf_flush_page_cleaner_worker (00007ff7`6aec9370)
            Priority: 0  Priority class: 32  Affinity: ff
      Child-SP          RetAddr           Call Site
      0000005f`56cffd40 00007ff7`6aec945d ntdll!RtlLeaveCriticalSection+0x29
      0000005f`56cffd70 00007ffa`6d831fe4 mysqld!buf_flush_page_cleaner_worker+0xed [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 3574]
      0000005f`56cffdb0 00007ffa`6ff0efb1 kernel32!BaseThreadInitThunk+0x14
      0000005f`56cffde0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
       
      . 16  Id: 2ce8.2d3c Suspend: 0 Teb: 0000005f`55865000 Unfrozen
            Start: mysqld!buf_flush_page_cleaner_worker (00007ff7`6aec9370)
            Priority: 0  Priority class: 32  Affinity: ff
      Child-SP          RetAddr           Call Site
      0000005f`56dff6c0 00007ff7`6aec8503 mysqld!PolicyMutex<TTASEventMutex<GenericPolicy> >::enter+0x26 [d:\winx64-packages\build\src\storage\innobase\include\ib0mutex.h @ 622]
      0000005f`56dff740 00007ff7`6aec944b mysqld!pc_flush_slot+0x43 [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 2841]
      0000005f`56dff7b0 00007ffa`6d831fe4 mysqld!buf_flush_page_cleaner_worker+0xdb [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 3574]
      0000005f`56dff7f0 00007ffa`6ff0efb1 kernel32!BaseThreadInitThunk+0x14
      0000005f`56dff820 00000000`00000000 ntdll!RtlUserThreadStart+0x21
       
      . 17  Id: 2ce8.2d40 Suspend: 0 Teb: 0000005f`55867000 Unfrozen
            Start: mysqld!buf_flush_page_cleaner_worker (00007ff7`6aec9370)
            Priority: 0  Priority class: 32  Affinity: ff
      Child-SP          RetAddr           Call Site
      0000005f`56eff750 00007ff7`6aec8503 mysqld!PolicyMutex<TTASEventMutex<GenericPolicy> >::enter+0x57 [d:\winx64-packages\build\src\storage\innobase\include\ib0mutex.h @ 635]
      0000005f`56eff7d0 00007ff7`6aec944b mysqld!pc_flush_slot+0x43 [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 2841]
      0000005f`56eff840 00007ffa`6d831fe4 mysqld!buf_flush_page_cleaner_worker+0xdb [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 3574]
      0000005f`56eff880 00007ffa`6ff0efb1 kernel32!BaseThreadInitThunk+0x14
      0000005f`56eff8b0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
      

      . 18  Id: c10.1c30 Suspend: 0 Teb: 00000012`c5100000 Unfrozen
            Start: mysqld!buf_flush_page_cleaner_worker (00007ff7`6aec9370)
            Priority: 0  Priority class: 32  Affinity: ff
      Child-SP          RetAddr           Call Site
      00000012`c63ff970 00007ffa`6d831fe4 mysqld!buf_flush_page_cleaner_worker+0xe8 [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 3574]
      00000012`c63ff9b0 00007ffa`6ff0efb1 kernel32!BaseThreadInitThunk+0x14
      00000012`c63ff9e0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
       
      . 19  Id: c10.a04 Suspend: 0 Teb: 00000012`c5102000 Unfrozen
            Start: mysqld!buf_flush_page_cleaner_worker (00007ff7`6aec9370)
            Priority: 0  Priority class: 32  Affinity: ff
      Child-SP          RetAddr           Call Site
      00000012`c64ff6d0 00007ff7`6ae0ef0f ntdll!RtlEnterCriticalSection+0x13
      (Inline Function) --------`-------- mysqld!OSMutex::enter+0xa [d:\winx64-packages\build\src\storage\innobase\include\sync0types.h @ 468]
      00000012`c64ff700 00007ff7`6aec945d mysqld!os_event::wait_low+0x1f [d:\winx64-packages\build\src\storage\innobase\os\os0event.cc @ 330]
      00000012`c64ff730 00007ffa`6d831fe4 mysqld!buf_flush_page_cleaner_worker+0xed [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 3574]
      00000012`c64ff770 00007ffa`6ff0efb1 kernel32!BaseThreadInitThunk+0x14
      00000012`c64ff7a0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
       
      . 20  Id: c10.161c Suspend: 0 Teb: 00000012`c5104000 Unfrozen
            Start: mysqld!buf_flush_page_cleaner_worker (00007ff7`6aec9370)
            Priority: 0  Priority class: 32  Affinity: ff
      Child-SP          RetAddr           Call Site
      00000012`c65ff9e8 00007ffa`6feec17f ntdll!NtWaitForAlertByThreadId+0x14
      00000012`c65ff9f0 00007ffa`6feebf4b ntdll!RtlpWaitOnAddressWithTimeout+0x43
      00000012`c65ffa20 00007ffa`6fec0c49 ntdll!RtlpWaitOnCriticalSection+0x15f
      00000012`c65ffad0 00007ffa`6fec0b60 ntdll!RtlpEnterCriticalSectionContended+0xd9
      00000012`c65ffb00 00007ff7`6ae0ef0f ntdll!RtlEnterCriticalSection+0x40
      (Inline Function) --------`-------- mysqld!OSMutex::enter+0xa [d:\winx64-packages\build\src\storage\innobase\include\sync0types.h @ 468]
      00000012`c65ffb30 00007ff7`6aec945d mysqld!os_event::wait_low+0x1f [d:\winx64-packages\build\src\storage\innobase\os\os0event.cc @ 330]
      00000012`c65ffb60 00007ffa`6d831fe4 mysqld!buf_flush_page_cleaner_worker+0xed [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 3574]
      00000012`c65ffba0 00007ffa`6ff0efb1 kernel32!BaseThreadInitThunk+0x14
      00000012`c65ffbd0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
      

      Attachments

        1. mysqld.DMP.bz2.001
          10.00 MB
        2. mysqld.DMP.bz2.002
          10.00 MB
        3. mysqld.DMP.bz2.003
          1.68 MB
        4. mysqld0.DMP.xz.001
          10.00 MB
        5. mysqld0.DMP.xz.002
          6.22 MB
        6. stacks1.txt
          30 kB
        7. stacks2.txt
          30 kB


          Activity

            Thank you for the explanation.

            I believe that the flow would be a little cleaner if we used condition variables instead of events.

            Can we make the call of os_event_reset() conditional on "there is no work available", similar to how pc_flush_slot() does it?

            		if (page_cleaner->n_slots_requested == 0) {
            			os_event_reset(page_cleaner->is_requested);
            		}
            

            Or even better, would the following patch fix this issue?

            diff --git a/storage/innobase/buf/buf0flu.cc b/storage/innobase/buf/buf0flu.cc
            index b05b984b3c3..f2cb4e8fdd7 100644
            --- a/storage/innobase/buf/buf0flu.cc
            +++ b/storage/innobase/buf/buf0flu.cc
            @@ -2917,6 +2917,8 @@ pc_flush_slot(void)
             		    && page_cleaner->n_slots_flushing == 0) {
             			os_event_set(page_cleaner->is_finished);
             		}
            +	} else {
            +		os_event_reset(page_cleaner->is_requested);
             	}
             
             	ulint	ret = page_cleaner->n_slots_requested;
            

            marko Marko Mäkelä added a comment
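
            As an aside, a minimal C++ sketch of the condition-variable flow suggested
            above, assuming hypothetical names (page_cleaner_sync_t, worker_loop,
            request_flush); this is an illustration of the idea, not MariaDB code. A
            worker sleeps until a slot is actually requested (or shutdown begins), so
            there is no busy-looping on the mutex while the server is idle:

            #include <condition_variable>
            #include <mutex>

            /* Hypothetical shared state between the coordinator and the workers. */
            struct page_cleaner_sync_t {
            	std::mutex              mtx;
            	std::condition_variable requested;  /* signalled when work is published */
            	unsigned                n_slots_requested = 0;
            	bool                    shutting_down = false;
            };

            static page_cleaner_sync_t sync_state;

            /* Worker: block until a slot is requested (or shutdown), then claim one. */
            void worker_loop()
            {
            	for (;;) {
            		std::unique_lock<std::mutex> lock(sync_state.mtx);
            		sync_state.requested.wait(lock, [] {
            			return sync_state.n_slots_requested > 0
            				|| sync_state.shutting_down;
            		});
            		if (sync_state.shutting_down) {
            			return;
            		}
            		--sync_state.n_slots_requested;  /* claim one slot */
            		lock.unlock();
            		/* ... flush one buffer pool instance here ... */
            	}
            }

            /* Coordinator: publish the number of slots and wake the workers. */
            void request_flush(unsigned n_slots)
            {
            	{
            		std::lock_guard<std::mutex> lock(sync_state.mtx);
            		sync_state.n_slots_requested = n_slots;
            	}
            	sync_state.requested.notify_all();
            }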

            Yes, it does. In case we increase the number of page cleaner threads, I would also avoid sending an unnecessary is_requested event; see the modified patch in
            https://github.com/MariaDB/server/commit/76efb5c361d9551cd6d62d8f56ab459893dbfc91

            jplindst Jan Lindström (Inactive) added a comment
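
            A rough illustration of that idea (not the contents of the linked commit),
            using the identifiers from the patch quoted above and a stub struct in place
            of the real page_cleaner_t: the coordinator would set is_requested only when
            at least one slot is requested, so idle workers stay blocked in the event wait.

            #include "os0event.h"  /* os_event_set(), os_event_reset() */

            struct page_cleaner_stub_t {
            	unsigned	n_slots_requested;
            	os_event_t	is_requested;
            };

            static void signal_workers_if_needed(page_cleaner_stub_t* page_cleaner)
            {
            	if (page_cleaner->n_slots_requested > 0) {
            		os_event_set(page_cleaner->is_requested);
            	} else {
            		os_event_reset(page_cleaner->is_requested);
            	}
            }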

            For the record: The intended usage of multiple page cleaner threads appears to be to ensure that each buffer pool instance will be flushed without too much delay.

            I started to wonder whether multiple InnoDB buffer pools actually help with any workloads. Yes, it probably was a good idea to split the buffer pool mutex when Inaam Rana introduced multiple buffer pools in MySQL 5.5.5, but since then, there have been multiple fixes to reduce contention on the buffer pool mutex, such as Inaam’s follow-up fix in MySQL 5.6.2 to use rw-locks instead of mutexes for the buf_pool->page_hash. In MySQL 8.0.0, Shaohua Wang implemented one more thing that MariaDB should copy: MDEV-15053 Split buf_pool_t::mutex.

            I think that we should seriously consider removing all code to support multiple buffer pools or page cleaners.
            Should multiple buffer pools be needed in the future (for example, on NUMA machines), it should be designed better from the ground up. Currently the partitioning is arbitrary; buffer pool membership is basically determined by a hash of the page number.

            The description of WL#6642: InnoDB: multiple page_cleaner threads seems to imply that it may have been a mistake to partition the buffer pool.

            Note: partitioning or splitting mutexes often seems to be a good idea. But partitioning data structures or threads might not be.

            marko Marko Mäkelä added a comment
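
            To illustrate the point about arbitrary partitioning, a simplified sketch
            (not the actual InnoDB code; the function name and fold constant are made up)
            of how a page could be mapped to a buffer pool instance purely by hashing its
            identity, so the split bears no relation to the workload or to NUMA locality:

            static unsigned buf_pool_index_for(unsigned space_id, unsigned page_no,
                                               unsigned n_instances)
            {
            	/* fold the page identity into one value, then pick an instance */
            	unsigned fold = space_id * 2654435761u + page_no;
            	return fold % n_instances;
            }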

            I filed MDEV-15058 Remove multiple InnoDB buffer pool instances.
            jplindst, please rebase your fix to the latest 10.3. I pushed some related cleanup (prevent a potential infinite loop, and a potential hang at shutdown) to 10.2 and merged up to 10.3.

            marko Marko Mäkelä added a comment
            jplindst Jan Lindström (Inactive) added a comment - Done in: https://github.com/MariaDB/server/commit/1b1e0c0a1e47e49531aaef5b7df76c08c1c9d519

            People

              Assignee: Marko Mäkelä (marko)
              Reporter: Vladislav Vaintroub (wlad)
              Votes: 0
              Watchers: 5

