[MDEV-22359] tpool - add variables to restrict concurrency and max threads Created: 2020-04-23 Updated: 2020-09-02 Resolved: 2020-09-02 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Server, Storage Engine - InnoDB |
| Fix Version/s: | N/A |
| Type: | Task | Priority: | Major |
| Reporter: | Vladislav Vaintroub | Assignee: | Vladislav Vaintroub |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Description |
|
monty said there are too many for him in the gdb. marko lamented need to scroll gdb when debugging. Tpool was not written with the goal to easy gdb debugging. Its main goal up to now, was to keep the "concurrency level" number of threads on CPU, where concurrency level was deduced from std::thread::hardware_concurrency But to optimize another objective, happiness of the people debugging the server, we can introduce the limits. The concurrency level, other than std::thread::hardware_concurrency might actually be needed, to account for non-default process affinity. |
| Comments |
| Comment by Vladislav Vaintroub [ 2020-04-23 ] |
|
@marko, please review 79280cf5dd22082373595aa0dc6178e2483727ca |
| Comment by Sergei Golubchik [ 2020-04-30 ] |
|
wlad, one option wold be to use small hard-coded limits if --gdb was specified. Or add debug-build-only options. Both approaches would avoid adding yet another element to the HUGE list of options that users can possibly tune. |
| Comment by Vladislav Vaintroub [ 2020-04-30 ] |
|
That's right, we have huge list of stuff users can tune. |
| Comment by Sergei Golubchik [ 2020-04-30 ] |
|
I'd say yes. One more option is one more something a user need to know about, to learn, and to remember. If something can be configured automatically, it should be configured automatically. There is no tricky hidden option that I know of. |
| Comment by Vladislav Vaintroub [ 2020-04-30 ] |
|
PLUGIN_VAR_NOSYSVAR , was not that invisible as system variable? This was the tricky I meant |
| Comment by Sergei Golubchik [ 2020-04-30 ] |
|
Ah, yes. But it's still visible as a command-line option. |
| Comment by Vladislav Vaintroub [ 2020-04-30 ] |
|
Alright, I'd like to get the opinions by monty, since it was him, who requested the feature in the first place, as system variable. |
| Comment by Marko Mäkelä [ 2020-05-12 ] |
|
Thanks, it looks OK to me; I sent a couple cosmetic comments. |
| Comment by Marko Mäkelä [ 2020-05-12 ] |
|
I agree with serg, we already have way too many parameters, and we do not have a good policy of deprecating and removing parameters in the future, as subsystems are refactored and old parameters become obsolete. If the main motivation is to reduce the number of threads when debugging, maybe instead of introducing these parameters, we could tie it to opt_debugging, so that when the server is invoked as mariadbd --gdb, fewer threads will be created initially. |
| Comment by Vladislav Vaintroub [ 2020-05-12 ] |
|
@marko, initially, 0 threads are created, apart from the Linux AIO one. The threads are created when there are tasks, and there are no threads to serve them, and the current concurrency is less that desired concurrency, and desired concurrency depends on number of cores. And then also, there is a throttling for the thread creation, although throttling is activated based on current number of threads,and again, that logic depends on the number of cores. The more cores you have, the more threads are created without throttling |
| Comment by Marko Mäkelä [ 2020-05-12 ] |
|
wlad, can you please set a breakpoint to the code where the thread pool creates a new thread, extract stack traces for each invocation, and then tabulate and summarize the reasons for creating new threads at server startup (for a trivial test that only creates and drops an InnoDB table)? Maybe we could fight the thread creation at the source. (That might end up being a separate bug or task, possibly not doable in a stable release branch.) |
| Comment by Vladislav Vaintroub [ 2020-05-12 ] |
|
@Marko, I do not know that much about gdb automation? If you can, set such a breakpoint in std::thread::detach(), and run your test cases. |
| Comment by Vladislav Vaintroub [ 2020-05-12 ] |
|
Tickers/timer tasks are known to create (possibly excessive) threads, also because they seem to run at about the same time. |
| Comment by Marko Mäkelä [ 2020-05-12 ] |
|
For a minimal test, before ha_innobase::create() was invoked, there were 5 invocations of std::thread::detach(), 4 from process_timers() in timer_handler() and 1 from trx_purge_wait_for_workers_to_complete() in purge_coordinator_callback(). The timer_handler() invocations were there even if I started the server with innodb_purge_threads=1. How can I view the initiator of the timer_handler() calls? A little later into the test, we got a large number of std::thread::detach() invocations from tpool::aio_linux::getevent_thread_routine(). wlad, could that be limited somehow? |
| Comment by Vladislav Vaintroub [ 2020-05-12 ] |
|
srv_thread_pool->set_timer is the initiator of the timer_handler calls. a large number of threads from tpool::aio_linux::getevent_thread_routine probably means that IO callbacks is something I can look at. there is a task_group specific queue, and maybe we could populate it, if all IO write threads are busy (we have limited number of parallel executed tasks in group), instead of pushing task into common queue, and waking threads. I can try to provide a patch |
| Comment by Marko Mäkelä [ 2020-05-12 ] |
|
wlad, I believe that you were referring to create_timer. The following calls to it look unavoidable:
However, the following could be seen as omissions of
|
| Comment by Vladislav Vaintroub [ 2020-05-12 ] |
|
marko, nope, I'm referring to set_time(). Creating timer does almost nothing. set_time() does something |
| Comment by Vladislav Vaintroub [ 2020-05-12 ] |
|
Following periodic timers are permanent, and are started when the server starts, if server is not readonly. If those do not deadlock each other, and are not time-critical, you can merge them, I.e have a single timer, which runs every second, executes srv_master_callback, srv_error_monitor_task, and on every 5th run, executes srv_monitor_task. Others are started when needed. I think. |
| Comment by Marko Mäkelä [ 2020-05-12 ] |
|
It seems to me that the following calls could be deferred, like I suggested in my previous comment:
|
| Comment by Vladislav Vaintroub [ 2020-05-12 ] |
|
all these timers are is not periodic. If they were periodic, the second parameter to set_time would be != 0. The first parameter to set_time is initial delay, the second is period. So, this is a use of timer, that removes "sleeps" From the comment to the btr_defragment_chunk(void*) Throttling "sleep", is implemented via rescheduling the threadpool timer, which, when fired, will resume the work again, where it is left. I believe it is the same thing for other cases as well. I think, is better not to use a thread for sleep, but reschedule work for later. This way, less threads can be used. |
| Comment by Vladislav Vaintroub [ 2020-09-02 ] |
|
It did not find broad support, apparently @serg was not excited. I'll close it for now |