[MDEV-15016] multiple page cleaner threads use a lot of CPU on idle server Created: 2018-01-20  Updated: 2018-01-25  Resolved: 2018-01-25

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.3.4
Fix Version/s: 10.2.13, 10.3.5

Type: Bug Priority: Blocker
Reporter: Vladislav Vaintroub Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: None

Attachments: File mysqld.DMP.bz2.001     File mysqld.DMP.bz2.002     File mysqld.DMP.bz2.003     File mysqld0.DMP.xz.001     File mysqld0.DMP.xz.002     Text File stacks1.txt     Text File stacks2.txt    
Issue Links:
Problem/Incident
is caused by MDEV-11025 Make number of page cleaner threads v... Closed
Relates
relates to MDEV-15053 Reduce buf_pool_t::mutex contention Closed
relates to MDEV-15058 Remove multiple InnoDB buffer pool in... Closed
relates to MDEV-9784 High CPU Usage When Absolutely Idle o... Closed
relates to MDEV-11738 Mariadb uses 100% of several of my 8 ... Closed
Sprint: 10.1.31

 Description   

Issue described in https://lists.launchpad.net/maria-discuss/msg04975.html as follows:
"
On a 4-core (4 physical, 8 logical cores), 3.5 GHz box, MariaDB 10.3.4
consumes typically 35-40% of available CPU even when completely idle
according to Task Manager. Please see image.

I cannot tell if previous 10.3 versions also did, as I don't have 10.3
running constantly in the background. I just observed this now. This does
not happen with MariaDB 10.1, MariaDB 10.2, MySQL 5.6, MySQL 5.7 and MySQL
8.0 (all those I have starting with Windows on this box), where the
CPU load is negligible unless the server is "doing real work". As a result,
when MariaDB 10.3.4 is running, the CPU clock frequency is always at the
highest possible 3.5 GHz (this CPU model is a 4th-generation Intel with the
option to scale the frequency down to around 40-50% of max when there is no
need for more).

"

Examination of the attached stack traces, taken while the process was in this "idle" state, shows 3 CPUs occupied by what for the most part appears to be a busy loop acquiring an InnoDB mutex.
my.ini uses all defaults, with the exception of
innodb_buffer_pool_size=4091M

Setting innodb_page_cleaners = 1, according to the reporter, fixed the issue.
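For reference, the reporter's setup and workaround can be summarized as a my.ini fragment (the buffer pool size is the one non-default value from the description; the innodb_page_cleaners line is the workaround mentioned above):

```ini
[mysqld]
# Only non-default setting in the reporter's my.ini
innodb_buffer_pool_size=4091M
# Workaround: a single page cleaner thread avoids the idle busy loop
innodb_page_cleaners=1
```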

Relevant call stack information:

 
. 15  Id: 2ce8.2d38 Suspend: 0 Teb: 0000005f`55863000 Unfrozen
      Start: mysqld!buf_flush_page_cleaner_worker (00007ff7`6aec9370)
      Priority: 0  Priority class: 32  Affinity: ff
Child-SP          RetAddr           Call Site
0000005f`56cffd40 00007ff7`6aec945d ntdll!RtlLeaveCriticalSection+0x29
0000005f`56cffd70 00007ffa`6d831fe4 mysqld!buf_flush_page_cleaner_worker+0xed [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 3574]
0000005f`56cffdb0 00007ffa`6ff0efb1 kernel32!BaseThreadInitThunk+0x14
0000005f`56cffde0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
 
. 16  Id: 2ce8.2d3c Suspend: 0 Teb: 0000005f`55865000 Unfrozen
      Start: mysqld!buf_flush_page_cleaner_worker (00007ff7`6aec9370)
      Priority: 0  Priority class: 32  Affinity: ff
Child-SP          RetAddr           Call Site
0000005f`56dff6c0 00007ff7`6aec8503 mysqld!PolicyMutex<TTASEventMutex<GenericPolicy> >::enter+0x26 [d:\winx64-packages\build\src\storage\innobase\include\ib0mutex.h @ 622]
0000005f`56dff740 00007ff7`6aec944b mysqld!pc_flush_slot+0x43 [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 2841]
0000005f`56dff7b0 00007ffa`6d831fe4 mysqld!buf_flush_page_cleaner_worker+0xdb [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 3574]
0000005f`56dff7f0 00007ffa`6ff0efb1 kernel32!BaseThreadInitThunk+0x14
0000005f`56dff820 00000000`00000000 ntdll!RtlUserThreadStart+0x21
 
. 17  Id: 2ce8.2d40 Suspend: 0 Teb: 0000005f`55867000 Unfrozen
      Start: mysqld!buf_flush_page_cleaner_worker (00007ff7`6aec9370)
      Priority: 0  Priority class: 32  Affinity: ff
Child-SP          RetAddr           Call Site
0000005f`56eff750 00007ff7`6aec8503 mysqld!PolicyMutex<TTASEventMutex<GenericPolicy> >::enter+0x57 [d:\winx64-packages\build\src\storage\innobase\include\ib0mutex.h @ 635]
0000005f`56eff7d0 00007ff7`6aec944b mysqld!pc_flush_slot+0x43 [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 2841]
0000005f`56eff840 00007ffa`6d831fe4 mysqld!buf_flush_page_cleaner_worker+0xdb [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 3574]
0000005f`56eff880 00007ffa`6ff0efb1 kernel32!BaseThreadInitThunk+0x14
0000005f`56eff8b0 00000000`00000000 ntdll!RtlUserThreadStart+0x21

. 18  Id: c10.1c30 Suspend: 0 Teb: 00000012`c5100000 Unfrozen
      Start: mysqld!buf_flush_page_cleaner_worker (00007ff7`6aec9370)
      Priority: 0  Priority class: 32  Affinity: ff
Child-SP          RetAddr           Call Site
00000012`c63ff970 00007ffa`6d831fe4 mysqld!buf_flush_page_cleaner_worker+0xe8 [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 3574]
00000012`c63ff9b0 00007ffa`6ff0efb1 kernel32!BaseThreadInitThunk+0x14
00000012`c63ff9e0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
 
. 19  Id: c10.a04 Suspend: 0 Teb: 00000012`c5102000 Unfrozen
      Start: mysqld!buf_flush_page_cleaner_worker (00007ff7`6aec9370)
      Priority: 0  Priority class: 32  Affinity: ff
Child-SP          RetAddr           Call Site
00000012`c64ff6d0 00007ff7`6ae0ef0f ntdll!RtlEnterCriticalSection+0x13
(Inline Function) --------`-------- mysqld!OSMutex::enter+0xa [d:\winx64-packages\build\src\storage\innobase\include\sync0types.h @ 468]
00000012`c64ff700 00007ff7`6aec945d mysqld!os_event::wait_low+0x1f [d:\winx64-packages\build\src\storage\innobase\os\os0event.cc @ 330]
00000012`c64ff730 00007ffa`6d831fe4 mysqld!buf_flush_page_cleaner_worker+0xed [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 3574]
00000012`c64ff770 00007ffa`6ff0efb1 kernel32!BaseThreadInitThunk+0x14
00000012`c64ff7a0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
 
. 20  Id: c10.161c Suspend: 0 Teb: 00000012`c5104000 Unfrozen
      Start: mysqld!buf_flush_page_cleaner_worker (00007ff7`6aec9370)
      Priority: 0  Priority class: 32  Affinity: ff
Child-SP          RetAddr           Call Site
00000012`c65ff9e8 00007ffa`6feec17f ntdll!NtWaitForAlertByThreadId+0x14
00000012`c65ff9f0 00007ffa`6feebf4b ntdll!RtlpWaitOnAddressWithTimeout+0x43
00000012`c65ffa20 00007ffa`6fec0c49 ntdll!RtlpWaitOnCriticalSection+0x15f
00000012`c65ffad0 00007ffa`6fec0b60 ntdll!RtlpEnterCriticalSectionContended+0xd9
00000012`c65ffb00 00007ff7`6ae0ef0f ntdll!RtlEnterCriticalSection+0x40
(Inline Function) --------`-------- mysqld!OSMutex::enter+0xa [d:\winx64-packages\build\src\storage\innobase\include\sync0types.h @ 468]
00000012`c65ffb30 00007ff7`6aec945d mysqld!os_event::wait_low+0x1f [d:\winx64-packages\build\src\storage\innobase\os\os0event.cc @ 330]
00000012`c65ffb60 00007ffa`6d831fe4 mysqld!buf_flush_page_cleaner_worker+0xed [d:\winx64-packages\build\src\storage\innobase\buf\buf0flu.cc @ 3574]
00000012`c65ffba0 00007ffa`6ff0efb1 kernel32!BaseThreadInitThunk+0x14
00000012`c65ffbd0 00000000`00000000 ntdll!RtlUserThreadStart+0x21



 Comments   
Comment by Vladislav Vaintroub [ 2018-01-21 ]

Reproducible with any configuration that results in innodb_page_cleaners > 1.
With default settings, innodb_buffer_pool_size > 1G triggers this issue (innodb_buffer_pool_instances also needs to be > 1, which happens automatically when innodb_buffer_pool_size > 1G).

By default, the installer uses 1/8 of RAM for the buffer pool, i.e. 1GB when RAM >= 8GB, which nowadays would be true even for most laptops.

Comment by Elena Stepanova [ 2018-01-21 ]

Two commits seem to be involved.
First,

commit b23a1096956c21df037bd851494f11509b5514dd
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date:   Sat Sep 9 11:34:12 2017 +0300
 
    MDEV-11025: Make number of page cleaner threads variable dynamic

made server hang on startup with innodb-buffer-pool-instances > 1.
Then,

commit 17529785f1a8e494e027ca658d318b842e2c8189
Author: Sergey Vojtovich <svoj@mariadb.org>
Date:   Fri Nov 24 13:26:15 2017 +0400
 
    Fix hang in buf_flush_set_page_cleaner_thread_cnt

fixed the hang, but left the server with this CPU usage problem.
Thus assigning to jplindst as the author of the first. FYI svoj.

Comment by Marko Mäkelä [ 2018-01-21 ]

elenst tested the latest 10.3 after reverting the MDEV-11025 commit, and the CPU usage seems to be gone.

I think that we should revert MDEV-11025, because the ability to adjust the number of page cleaner threads on a running server was not requested by a customer.

Comment by Jan Lindström (Inactive) [ 2018-01-22 ]

https://github.com/MariaDB/server/commit/6a8e070d62db375500ed856f9ad5034564447eee

Comment by Marko Mäkelä [ 2018-01-22 ]

I would like to see a complete high-level design that covers the communication between the threads: at startup, normal operation, shutdown, and when the requested number of threads differs from the current number of threads.

I am concerned that as a result of the proposed patch, the page cleaner threads might under some circumstances stop doing useful work.

Comment by Jan Lindström (Inactive) [ 2018-01-23 ]

Startup is the same as increasing the number of page cleaner threads: we create the threads and wait on is_started events until the requested number of threads is reached. When a page cleaner thread starts, it increments the number of running threads, sends an is_started event, and starts waiting for work on the is_requested event.

When the number of page cleaner threads is decreased, we send an is_requested event, i.e. as at shutdown, there is "work" to be done. Again we wait on the is_started event for these threads to exit. A page cleaner thread whose id is less than n_workers will reset the is_requested event, as there is no "work" to be done, and send an is_started event before decreasing the number of running threads and exiting.

After several tries, I do not know of a fix that would not use os_event_reset(); if I remove it, we end up in a busy loop either when the number of threads is decreased or when it is increased.

Based on my testing, both of the added os_event_reset() calls are needed, or bad things happen.

I do not see a problem with using os_event_reset(), as it is already used, e.g. in buf0flu.cc line 2864, when no real work is available.

Comment by Marko Mäkelä [ 2018-01-23 ]

Thank you for the explanation.

I believe that the flow would be a little cleaner if we used condition variables instead of events.

Can we make the call to os_event_reset() conditional on there being no work available, similar to how pc_flush_slot() does it?

		if (page_cleaner->n_slots_requested == 0) {
			os_event_reset(page_cleaner->is_requested);
		}

Or even better, would the following patch fix this issue?

diff --git a/storage/innobase/buf/buf0flu.cc b/storage/innobase/buf/buf0flu.cc
index b05b984b3c3..f2cb4e8fdd7 100644
--- a/storage/innobase/buf/buf0flu.cc
+++ b/storage/innobase/buf/buf0flu.cc
@@ -2917,6 +2917,8 @@ pc_flush_slot(void)
 		    && page_cleaner->n_slots_flushing == 0) {
 			os_event_set(page_cleaner->is_finished);
 		}
+	} else {
+		os_event_reset(page_cleaner->is_requested);
 	}
 
 	ulint	ret = page_cleaner->n_slots_requested;

Comment by Jan Lindström (Inactive) [ 2018-01-24 ]

Yes, it does. When increasing the number of page cleaner threads, I would also avoid sending an unnecessary is_requested event; see the modified patch in
https://github.com/MariaDB/server/commit/76efb5c361d9551cd6d62d8f56ab459893dbfc91

Comment by Marko Mäkelä [ 2018-01-24 ]

For the record: The intended usage of multiple page cleaner threads appears to be to ensure that each buffer pool instance will be flushed without too much delay.

I started to wonder whether multiple InnoDB buffer pools actually help with any workloads. Yes, it probably was a good idea to split the buffer pool mutex when Inaam Rana introduced multiple buffer pools in MySQL 5.5.5, but since then, there have been multiple fixes to reduce contention on the buffer pool mutex, such as Inaam’s follow-up fix in MySQL 5.6.2 to use rw-locks instead of mutexes for the buf_pool->page_hash. In MySQL 8.0.0, Shaohua Wang implemented one more thing that MariaDB should copy: MDEV-15053 Split buf_pool_t::mutex.

I think that we should seriously consider removing all code to support multiple buffer pools or page cleaners.
Should multiple buffer pools be needed in the future (for example, on NUMA machines), it should be designed better from the ground up. Currently the partitioning is arbitrary; buffer pool membership is basically determined by a hash of the page number.

The description of WL#6642: InnoDB: multiple page_cleaner threads seems to imply that it may have been a mistake to partition the buffer pool.

Note: partitioning or splitting mutexes often seems to be a good idea. But partitioning data structures or threads might not be.

Comment by Marko Mäkelä [ 2018-01-24 ]

I filed MDEV-15058 Remove multiple InnoDB buffer pool instances.
jplindst, please rebase your fix to the latest 10.3. I pushed some related cleanup (prevent a potential infinite loop, and a potential hang at shutdown) to 10.2 and merged up to 10.3.

Comment by Jan Lindström (Inactive) [ 2018-01-24 ]

Done in:

https://github.com/MariaDB/server/commit/1b1e0c0a1e47e49531aaef5b7df76c08c1c9d519

Generated at Thu Feb 08 08:18:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.