[MDEV-31095] Create separate tpool thread for async aio Created: 2023-04-20  Updated: 2023-10-05  Resolved: 2023-10-04

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: None
Fix Version/s: 10.6.16, 10.10.7, 10.11.6, 11.0.4, 11.1.3, 11.2.2

Type: Bug Priority: Major
Reporter: Michael Widenius Assignee: Vladislav Vaintroub
Resolution: Fixed Votes: 0
Labels: performance

Issue Links:
Relates
relates to MDEV-11378 AliSQL: [Perf] Issue#23 MERGE INNODB ... Open
relates to MDEV-16264 Implement a common work queue for Inn... Closed

 Description   

A Linux mutex does not guarantee that the mutex is granted in FIFO order; instead it allows 'fast threads' to steal the mutex, which causes other threads to starve.

This is a problem for tpool: if there is a burst of aio_reads, it will create 'nproc*2' threads to handle the requests even if one thread could do the job (for example, if the block is on a very fast device or a memory-backed device). If the file is on a hard disk, things may be even worse.

This can be seen by simply starting the server on a data directory with a large 'ib_buffer_pool' file.
In this case the startup code will, on my machine, create 72 threads to fill the
buffer pool, which is not a good idea for most systems (especially desktops), as the memory
used may not be released back to the operating system.

In addition, the current code does not honor the variables srv_n_read_io_threads or srv_n_write_io_threads.

The suggested fix is to use a separate tpool for just async io and use a limited number of threads for this. Here is some suggested code to use as a base:

--- b/storage/innobase/srv/srv0srv.cc
+++ b/storage/innobase/srv/srv0srv.cc
@@ -580,12 +580,13 @@ static void thread_pool_thread_end()
 
 void srv_thread_pool_init()
 {
+  uint max_threads= srv_n_read_io_threads + srv_n_write_io_threads;
   DBUG_ASSERT(!srv_thread_pool);
 
 #if defined (_WIN32)
-  srv_thread_pool= tpool::create_thread_pool_win();
+  srv_thread_pool= tpool::create_thread_pool_win(1, max_threads);
 #else
-  srv_thread_pool= tpool::create_thread_pool_generic();
+  srv_thread_pool= tpool::create_thread_pool_generic(1, max_threads);
 #endif
   srv_thread_pool->set_thread_callbacks(thread_pool_thread_init,
                                         thread_pool_thread_end);



 Comments   
Comment by Marko Mäkelä [ 2023-05-29 ]

wlad created a branch with 3 commits related to this.

Curiously, this change to make preloading the buffer pool use a single thread would seem to conceptually revert MDEV-26547 and possibly impact MDEV-25417.

monty, can you please test these changes, one by one? Based on recent experience from MDEV-31343, I expect that the results may vary depending on the GNU libc version or the thread scheduler in the Linux kernel.

Comment by Vladislav Vaintroub [ 2023-05-29 ]

Apparently, performance did not suffer after using a single thread; it was actually slightly improved on the 10GB preload I tried. I did perform a performance test, so the claim that it would be better is founded. Conceptually, it is not as single-threaded as it used to be: there is a thread that submits async IO, there are threads that handle the IO, and there are 1024 async IOs in flight.

Comment by Marko Mäkelä [ 2023-05-29 ]

wlad, right, there would be 2 threads involved. Related to this, perhaps you could comment on MDEV-11378. An idea is that for reading pages for applying logs or warming up the buffer pool, it might be better to allocate N contiguous page frame addresses from the buffer pool, and then submit a smaller number of normal or scatter-gather read requests for multiple pages at once.

Comment by Marko Mäkelä [ 2023-09-21 ]

In my limited test, comparing the latest 10.6 with a merge of the branch, I did not observe any noticeable performance impact. I did not test the loading of a buffer pool.

Comment by Vladislav Vaintroub [ 2023-10-04 ]

I'm assigning this to myself, as tpool is something I maintain.
Some feedback on the bug report:

  • The many threads are a result of InnoDB batch processing: rapid async IO during buffer pool load, or rapid task submission during recovery.
  • This batch processing is unique to startup and does not happen during regular database operation.
  • Therefore there is no need to maintain multiple thread pools, as proposed by Monty. It makes sense to make the thread pool less eager to create threads exactly when batch processing is running. Multiple thread pools are what we want to get rid of; we do not really want to deal with cross-threadpool deadlocks and the like.

Comment by Vladislav Vaintroub [ 2023-10-04 ]

Fixed "too many threads" by making tpool less eager to create additional threads during InnoDB batches of asynchronous work at startup. Loading a 10GB buffer pool now requires 6-7 threads overall in tpool, down from the previous ncpus*2 = 32 (on my machine).

InnoDB currently stress-tests tpool during startup and recovery, with a load that does not otherwise occur. It did not occur previously either, when recovery and buffer pool load were less multithreaded.
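The "less eager" growth policy described in the fix can be sketched as a pure decision function. The names, parameters, and the 10 ms threshold below are illustrative assumptions, not the actual tpool heuristic: the idea is simply that a new worker is created only when all existing workers are busy, the pool is below its cap, and queued work has actually stalled, so a short startup burst no longer spawns ncpus*2 threads.

```cpp
// Hedged sketch of a "less eager" thread-creation policy (NOT the
// real tpool heuristic; names and thresholds are invented).
#include <chrono>
#include <cstddef>

using std::chrono::milliseconds;

// Decide whether the pool should create another worker thread.
bool should_spawn(std::size_t queued, std::size_t busy_threads,
                  std::size_t total_threads, std::size_t max_threads,
                  milliseconds oldest_task_wait) {
  if (total_threads >= max_threads) return false;  // hard cap reached
  if (queued == 0) return false;                   // nothing is waiting
  if (busy_threads < total_threads) return false;  // an idle worker exists
  // Grow only if work has actually stalled; bursts that the existing
  // threads drain quickly never trigger thread creation.
  return oldest_task_wait >= milliseconds(10);
}
```

With such a policy, a buffer pool load that keeps its queue draining within the threshold stays on a handful of threads, matching the 6-7 threads reported above.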

Generated at Thu Feb 08 10:21:14 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.