[MDEV-11026] Make InnoDB number of IO write/read threads dynamic Created: 2016-10-11 Updated: 2023-03-21 Resolved: 2022-08-16 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Fix Version/s: | 10.11.0 |
| Type: | Task | Priority: | Critical |
| Reporter: | Jan Lindström (Inactive) | Assignee: | Vladislav Vaintroub |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | Preview_10.11 |
| Issue Links: |
|
| Description |
|
Currently, the numbers of InnoDB I/O read and write threads are set by static variables, so changing them requires a server shutdown and restart.
|
| Comments |
| Comment by Marko Mäkelä [ 2018-01-22 ] |
|
Instead of doing this, I would prefer a thorough analysis of how the page flushing currently works in InnoDB and how it could be improved. Do we really need so many threads being idle, or would it suffice to have one I/O handler thread that collects completed requests? For pending reads, perhaps it would be more efficient for buf_page_get_gen() to submit a read request and then wait for the completion, in that same thread? Currently we are sleeping and retrying while the actual work happens in other threads. For pending writes, perhaps the thread that goes through buf_pool->flush_list should asynchronously submit the write requests, and pause for collecting completion notifications if the queue of pending requests grows long enough. Do we really need so many mostly idle threads? And do we need these parameters at all?
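The write-side suggestion above (submit asynchronously, and pause to collect completions once enough requests are pending) can be illustrated with a minimal sketch. This is not InnoDB code: `WriteRequest`, `BoundedSubmitter` and `max_pending` are hypothetical names, and completion is only simulated by treating the oldest pending request as finished, where a real implementation would wait on the operating system's completion queue.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for one asynchronous page write; not an InnoDB type.
struct WriteRequest {
  std::size_t page_no;
};

// Queues writes without blocking until `max_pending` are in flight, then
// pauses to collect completions before accepting more.
class BoundedSubmitter {
public:
  using Callback = std::function<void(const WriteRequest&)>;

  BoundedSubmitter(std::size_t max_pending, Callback on_complete)
      : max_pending_(max_pending), on_complete_(std::move(on_complete)) {
    assert(max_pending_ > 0);
  }

  // Called by the single flushing thread for each page to be written.
  void submit(const WriteRequest& req) {
    while (pending_.size() >= max_pending_) {
      collect_one();  // queue is long enough: pause and reap a completion
    }
    pending_.push_back(req);
    if (pending_.size() > peak_pending_) peak_pending_ = pending_.size();
  }

  // Collects every outstanding completion, e.g. at the end of a flush batch.
  void drain() {
    while (!pending_.empty()) collect_one();
  }

  std::size_t peak_pending() const { return peak_pending_; }

private:
  // Simulation only: the oldest request is treated as completed.
  void collect_one() {
    WriteRequest done = pending_.front();
    pending_.pop_front();
    on_complete_(done);
  }

  std::size_t max_pending_;
  Callback on_complete_;
  std::deque<WriteRequest> pending_;
  std::size_t peak_pending_ = 0;
};
```

In this sketch the submitting thread is also the one that collects completions, so no separate pool of mostly idle I/O threads is involved; whether that is practical for InnoDB is exactly the question the comment raises.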
| Comment by Marko Mäkelä [ 2022-05-10 ] |
|
A test failure needs to be addressed:
Furthermore, I can observe many compilation failures, due to an unfortunately chosen base revision. It would be better to rebase this on something more recent, to get wider regression test coverage. Once rebased, this will have to be stress tested.
| Comment by Vladislav Vaintroub [ 2022-05-12 ] |
|
Rebased, and eliminated a warning in the test.
| Comment by Marko Mäkelä [ 2022-05-24 ] |
|
I think that this needs some stress testing.
| Comment by Matthias Leich [ 2022-05-31 ] |
|
| Comment by Vladislav Vaintroub [ 2022-05-31 ] |
|
mleich, addressed in fad5f3a6caf34d9
| Comment by Matthias Leich [ 2022-06-17 ] |
|
| Comment by Marko Mäkelä [ 2022-06-17 ] |
|
mleich, I have updated the branch with the latest 10.10.
| Comment by Matthias Leich [ 2022-06-20 ] |
|
| Comment by Marko Mäkelä [ 2022-06-21 ] |
|
mleich, thank you. I checked each rr replay trace as well as the core dump. First, let me mention failure 2, which is definitely not an InnoDB problem. Can you please file a separate bug for it, tentatively for the partitioning storage engine? The options was freed in:
A subsequent operation is attempting to read the freed memory:
I expect this bug to affect older releases. While the TRUNCATE in InnoDB was refactored in 10.6, it has been similar ever since
The remaining failures are related to the InnoDB FULLTEXT INDEX implementation, of which thiru is the subsystem owner (and should file separate bug(s) for these).
In Failure 1, the table was not dropped, but the cached metadata was freed in innobase_reload_table() during ALTER TABLE. There is a race condition between two InnoDB threads that handle FULLTEXT INDEX:
That is, during the time ASAN already noticed that slot->table is pointing to freed memory, another thread is assigning slot->table to nullptr. Failures 3 and 4 seem to be due to the same problem: the slot passed to fts_optimize_table_bk() points to freed memory. With the rr trace, it should be easy to debug this in more detail:
The memory was freed during ha_innobase::delete_table(), which was executed as part of ddl_recovery.log processing. I think that this bug should affect 10.6 already. One possible fix to this failure might be to not schedule any FTS tasks before DDL recovery has been completed.
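The possible fix mentioned above (not scheduling any FTS tasks before DDL recovery has completed) amounts to putting a gate in front of the task scheduler. The following is only an illustrative sketch of that general pattern, not the InnoDB implementation: `DeferredTaskGate`, `schedule()` and `open()` are hypothetical names, and tasks are run inline rather than handed to a thread pool.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical gate: tasks scheduled before open() are queued, not run.
class DeferredTaskGate {
public:
  // Runs the task immediately if the gate is open; otherwise queues it.
  void schedule(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      if (!open_) {
        deferred_.push_back(std::move(task));
        return;
      }
    }
    task();
  }

  // To be called once, when recovery has finished. Opens the gate and runs
  // the tasks queued so far, in the order in which they were scheduled.
  void open() {
    std::vector<std::function<void()>> ready;
    {
      std::lock_guard<std::mutex> guard(mutex_);
      assert(!open_ && "gate must be opened only once");
      open_ = true;
      ready.swap(deferred_);
    }
    for (auto& task : ready) task();  // run outside the lock
  }

  std::size_t deferred_count() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return deferred_.size();
  }

private:
  mutable std::mutex mutex_;
  bool open_ = false;
  std::vector<std::function<void()>> deferred_;
};
```

With such a gate, a task that refers to a table would not start while recovery may still be dropping that table; the queued tasks would still have to re-validate their table references when they eventually run.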
| Comment by Matthias Leich [ 2022-06-23 ] |
|
Failure 2 is now reported as https://jira.mariadb.org/browse/MDEV-28937
| Comment by Marko Mäkelä [ 2022-06-28 ] |
|
I realized that at least my earlier comment about Failure 1 shows multiple instances of fts_optimize_callback executing concurrently, which is very wrong. That should have the same underlying cause as
| Comment by Matthias Leich [ 2022-06-28 ] |
|
|