Details
-
Bug
-
Status: Closed (View Workflow)
-
Minor
-
Resolution: Fixed
-
10.6.15
Description
The follow-up fix referenced in a comment on MDEV-26827 changed buf_flush_page_cleaner to pass an additional parameter pct_lwm to page_cleaner_flush_pages_recommendation, documented as innodb_max_dirty_pages_pct_lwm, or 0 to ignore it.
As far as I am able to tell from the code, there is no code path where pct_lwm can ever be anything but 0. Here is the relevant code as of 10.6.15, stripped down to relevant parts and control flow constructs:
double pct_lwm= 0.0; |
// ... |
pct_lwm= srv_max_dirty_pages_pct_lwm;
|
if (pct_lwm != 0.0) |
{
|
// ... |
if (activity_count != last_activity_count) |
{
|
// ... |
goto maybe_unemployed; |
}
|
else if (buf_pool.page_cleaner_idle() && !os_aio_pending_reads()) |
{
|
// ... |
goto idle_flush; |
}
|
else |
{
|
maybe_unemployed:
|
const bool below{dirty_pct < pct_lwm}; |
pct_lwm= 0.0;
|
if (below) |
goto possibly_unemployed; |
}
|
}
|
else if (dirty_pct < srv_max_buf_pool_modified_pct) |
possibly_unemployed:
|
if (!soft_lsn_limit && !af_needed_for_redo(oldest_lsn)) |
goto unemployed; |
These are the only assignments to pct_lwm, which is then passed to page_cleaner_flush_pages_recommendation shortly after this block.
It appears the only way to have pct_lwm to be non-zero at the end of this block is:
1. Have srv_max_dirty_pages_pct_lwm non-zero
2. Make it out of the if (pct_lwm != 0.0) block without hitting the pct_lwm= 0.0 line in the else case.
But #2 appears to be impossible: the inner if case does a goto to the body of the else case (which zeros the value), and the else if case jumps to idle_flush which bypasses the page_cleaner_flush_pages_recommendation call.
Inside page_cleaner_flush_pages_recommendation we have:
double total_ratio; |
if (pct_lwm == 0.0 || max_pct == 0.0) { |
total_ratio = 1;
|
} else { |
total_ratio = std::max(double(pct_for_lsn) / 100, |
(dirty_pct / max_pct));
|
}
|
If pct_lwm is always 0, then total_ratio is always 1 and dirty_pct and pct_for_lsn have no effect.
I had a little bit of trouble tracking all the possible control flow paths in buf_flush_page_cleaner due to all the gotos, so I tested this theory by adding DBUG_ASSERT(pct_lwm == 0.0) immediately before the previous snippet. The debug build with that assert passes:
./mtr --mariadbd=--innodb-max-dirty-pages-pct-lwm=0.001 --mariadbd=--innodb-adaptive-flushing-lwm=0.001 --suites=innodb-
|
which I would expect to trip that assert fairly quickly?
I started looking into this because I noticed that flushing on our databases (on 10.6.14) didn't seem to be speeding up as the checkpoint age approached max_modified_age_async, leading to periodic spikes of rapid flushing. I went reading through the code to try to understand what was going on and noticed this, but I'm not sure it's the only factor.
Attachments
Issue Links
- is caused by
-
MDEV-26055 Adaptive flushing is still not getting invoked in 10.5.11
- Closed