Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Fixed
- Affects versions: 10.5, 10.6, 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL), 11.3(EOL)
Description
To reduce the rate of the infamous 'fake hangs' in rr replay, I experimented with the following patch:
diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
index 58626c4667c..5fecbf15453 100644
--- a/storage/innobase/handler/ha_innodb.cc
+++ b/storage/innobase/handler/ha_innodb.cc
@@ -19525,7 +19525,7 @@ static MYSQL_SYSVAR_UINT(read_io_threads, srv_n_read_io_threads,
 static MYSQL_SYSVAR_UINT(write_io_threads, srv_n_write_io_threads,
   PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
   "Number of background write I/O threads in InnoDB.",
-  NULL, NULL, 4, 2, 64, 0);
+  NULL, NULL, 4, 1, 64, 0);

 static MYSQL_SYSVAR_ULONG(force_recovery, srv_force_recovery,
   PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
diff --git a/storage/innobase/include/os0file.h b/storage/innobase/include/os0file.h
index 5a8a8dd7fd9..673239d9468 100644
--- a/storage/innobase/include/os0file.h
+++ b/storage/innobase/include/os0file.h
@@ -260,7 +260,7 @@ struct os_file_size_t {
   os_offset_t m_alloc_size;
 };

-constexpr ulint OS_AIO_N_PENDING_IOS_PER_THREAD= 256;
+constexpr ulint OS_AIO_N_PENDING_IOS_PER_THREAD= 1;

 extern Atomic_counter<ulint> os_n_file_reads;
 extern ulint os_n_file_writes;
With this patch, the following invocation would hang on bootstrap:
./mtr --boot-rr --mysqld=--innodb-write-io-threads=1 main.1st
The first reason turns out to be missing signaling of a condition variable:
diff --git a/tpool/tpool_structs.h b/tpool/tpool_structs.h
index b49204f2d75..96c250ebd59 100644
--- a/tpool/tpool_structs.h
+++ b/tpool/tpool_structs.h
@@ -138,12 +138,13 @@ template<typename T> class cache
   {
     std::unique_lock<std::mutex> lk(m_mtx);
     assert(!is_full());
+    const bool was_empty= is_empty();
     // put element to the logical end of the array
     m_cache[--m_pos] = ele;

     /* Notify waiters when the cache becomes
     not empty, or when it becomes full */
-    if (m_pos == 1 || (m_waiters && is_full()))
+    if (was_empty || (is_full() && m_waiters))
       m_cv.notify_all();
   }
As far as I can tell, the above condition was broken in MDEV-29367.
Even with the above patch applied, the bootstrap would hang. The remaining cause is MDEV-24313 combined with innodb_doublewrite=ON. Therefore we must retain the minimum value innodb_write_io_threads=2 and discard the first hunk of our instrumentation patch.
A further problem is that many encryption-related tests would end up in an infinite loop in buf_pool_t::io_buf_t::reserve(), which was last changed in MDEV-26055. The following patch would fix that:
@@ -1513,14 +1513,17 @@ void buf_pool_t::io_buf_t::close()
   n_slots= 0;
 }

-buf_tmp_buffer_t *buf_pool_t::io_buf_t::reserve()
+buf_tmp_buffer_t *buf_pool_t::io_buf_t::reserve(bool wait_for_reads)
 {
   for (;;)
   {
     for (buf_tmp_buffer_t *s= slots, *e= slots + n_slots; s != e; s++)
       if (s->acquire())
         return s;
+    buf_dblwr.flush_buffered_writes();
     os_aio_wait_until_no_pending_writes();
+    if (!wait_for_reads)
+      continue;
     for (buf_tmp_buffer_t *s= slots, *e= slots + n_slots; s != e; s++)
       if (s->acquire())
         return s;
When this function is called from buf_page_decrypt_after_read(), we must pass wait_for_reads=false: the current thread is itself executing in a read completion callback, so the os_aio_wait_until_no_pending_reads() that follows the continue would hang. In any case, we must schedule a write of a possibly partial doublewrite batch so that os_aio_wait_until_no_pending_writes() has a chance of finishing.
With these fixes, a run with OS_AIO_N_PENDING_IOS_PER_THREAD=1 and
./mtr --big-test --force --parallel=auto --mysqld=--loose-innodb-read-io-threads=1 --mysqld=--loose-innodb-write-io-threads=2
completes fine in both 10.5 and 10.6, with the exception of the test sys_vars.innodb_read_io_threads_basic.
The motivation for setting these parameters is to give rr record fewer chances to schedule threads in an unfair fashion. Such scheduling often leads to 'fake hangs' where write_slots->m_cache.m_pos (or, less often, read_slots->m_cache.m_pos) is nonzero and some InnoDB threads wait for an extremely long time for a buffer page latch.
Issue Links
- is caused by:
  - MDEV-26055 Adaptive flushing is still not getting invoked in 10.5.11 (Closed)
  - MDEV-29367 Refactor tpool::cache (Closed)
- relates to:
  - MDEV-34641 MariaDB server getting stuck when ntp server switch time in midst of start up (Open)