While testing innodb.recovery_memory related to MDEV-31350 and MDEV-31353, I got a surprise crash near the end of my test run of a RelWithDebInfo executable:
Based on the core dump, it looks like the comparison function may have been invoked on the first out-of-bounds element list.get() + size.
The size is equal to buf_pool.flush_list.count, and all elements in the array list.get() are nonnull. The address immediately after the last element (the second argument passed to std::sort()) contains a null pointer.
To me, this looks like a possible error in the implementation of std::sort() in this version of libstdc++. The executable had been compiled with clang-15, but the option -stdlib=libc++ was not used.
I think that this needs more investigation. There are other, similar uses of std::sort() in InnoDB.
Issue Links
relates to
MDEV-25113 Reduce effect of parallel background flush on select workload (Closed)
MDEV-31791 Crash recovery in the test innodb.recovery_memory occasionally fails (Closed)
MDEV-35225 Bogus debug assertion failures in innodb.innodb-32k-crash (Closed)
MDEV-27022 Buffer pool is being flushed during recovery (Closed)
MDEV-31350 test innodb.recovery_memory failed on '21 failed attempts to flush a page' (Closed)
MDEV-31353 InnoDB recovery hangs after reporting corruption (Closed)
MDEV-32029 Assertion failures in log_sort_flush_list upon crash recovery
Marko Mäkelä
added a comment - Another failure with the above buggy patch:
CURRENT_TEST: innodb.recovery_memory
mysqltest: At line 50: query 'SHOW CREATE TABLE t1' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
…
2023-07-27 12:34:36 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=311491401
2023-07-27 12:34:36 0 [Note] InnoDB: End of log at LSN=315636465
2023-07-27 12:34:36 0 [Note] InnoDB: To recover: 279 pages
2023-07-27 12:34:36 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=239, page number=110]
2023-07-27 12:34:36 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
Hopefully this revised wait loop does the trick and ensures that no page writes are in progress while we sort the buf_pool.flush_list:
for (;;)
{
  /* Wait for submitted page writes before taking the mutex. */
  os_aio_wait_until_no_pending_writes(false);
  mysql_mutex_lock(&buf_pool.flush_list_mutex);
  if (buf_pool.flush_list_active())
    my_cond_wait(&buf_pool.done_flush_list,
                 &buf_pool.flush_list_mutex.m_mutex);
  else if (!os_aio_pending_writes())
    break; /* exit with buf_pool.flush_list_mutex still held */
  /* Otherwise release the mutex and retry. */
  mysql_mutex_unlock(&buf_pool.flush_list_mutex);
}
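The shape of the loop above can be sketched with standard C++ primitives. This is only a minimal illustration, not the InnoDB code: FlushState, flush_active, pending_writes, wait_until_quiescent() and write_completed() are hypothetical names standing in for buf_pool.flush_list_mutex, buf_pool.flush_list_active(), os_aio_pending_writes() and the loop itself. As a simplifying assumption, both conditions are kept under a single mutex here, which is why one predicate wait can replace the retry loop.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for the buffer pool flush state.
struct FlushState {
  std::mutex mutex;                // plays the role of flush_list_mutex
  std::condition_variable done;    // plays the role of done_flush_list
  bool flush_active = false;       // a flush batch is currently running
  std::size_t pending_writes = 0;  // page writes submitted but not completed
};

// Block until no flush batch is active and no page writes are pending.
// The lock is returned still held, so that the caller can sort the list
// without any writer modifying the sort keys underneath it.
inline std::unique_lock<std::mutex> wait_until_quiescent(FlushState &s) {
  std::unique_lock<std::mutex> lock(s.mutex);
  s.done.wait(lock, [&s] { return !s.flush_active && s.pending_writes == 0; });
  return lock;
}

// Called by a writer when one page write has completed.
inline void write_completed(FlushState &s) {
  {
    std::lock_guard<std::mutex> guard(s.mutex);
    assert(s.pending_writes > 0);
    --s.pending_writes;
  }
  s.done.notify_all();  // wake up anyone waiting in wait_until_quiescent()
}
```

A caller would hold the returned lock for the whole duration of the sort and release it afterwards. The key point is the same as in the loop above: the "nothing in flight" check and the sort must happen under the same lock acquisition.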
Marko Mäkelä
added a comment - Unfortunately, I got a failure with the revised patch as well:
innodb.recovery_memory 'innodb,release' w59 [ 53 fail ]
Test ended at 2023-07-27 12:52:00
CURRENT_TEST: innodb.recovery_memory
mysqltest: At line 50: query 'SHOW CREATE TABLE t1' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
…
2023-07-27 12:51:59 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=280364554
2023-07-27 12:51:59 0 [Note] InnoDB: End of log at LSN=284055215
2023-07-27 12:51:59 0 [Note] InnoDB: To recover: 198 pages
2023-07-27 12:51:59 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=218, page number=113]
2023-07-27 12:51:59 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
2023-07-27 12:51:59 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
2023-07-27 12:51:59 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=218, page number=113]
2023-07-27 12:51:59 0 [Note] sorting 194,194
The kill+restart before this failure processed far fewer pages in the last batch:
2023-07-27 12:51:55 0 [Note] sorting 11,11
This means that the test is reproducing two independent bugs. I started one more campaign, hoping to produce an rr replay trace:
while ./mtr --rr=-h --parallel=60 innodb.recovery_memory{,,,,,,,,,}{,,,,,}; do :; done
Marko Mäkelä
added a comment - I failed to reproduce this with rr so far. I got the idea to test with simulated asynchronous I/O:
./mtr --parallel=100 --repeat=100 --mysqld=--innodb-use-native-aio=0 innodb.recovery_memory{,,,,,,,,,}{,,,,,,,,,}
This did fail as well, so a bug like MDEV-29610 should not play a role.
innodb.recovery_memory 'innodb,release' w87 [ 62 fail ]
Test ended at 2023-07-27 17:32:06
CURRENT_TEST: innodb.recovery_memory
mysqltest: At line 42: query 'CREATE TABLE t1(f1 INT NOT NULL)ENGINE=InnoDB' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
…
2023-07-27 17:32:05 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=324014657
2023-07-27 17:32:05 0 [Note] InnoDB: Multi-batch recovery needed at LSN 324529178
2023-07-27 17:32:05 0 [Note] InnoDB: End of log at LSN=325255352
2023-07-27 17:32:05 0 [Note] InnoDB: To recover: LSN 324637270/325255352; 287 pages
2023-07-27 17:32:06 0 [Note] InnoDB: To recover: LSN 325135744/325255352; 383 pages
2023-07-27 17:32:06 0 [ERROR] InnoDB: Space id and page no stored in the page, read in are [page id: space=2398, page number=729743360], should be [page id: space=1, page number=556]
Marko Mäkelä
added a comment - I pushed a fix for the SIGSEGV issue (likely due to a data race with std::sort). The remaining failures will be addressed in MDEV-31791.
Marko Mäkelä
added a comment - mleich was able to reproduce this in a branch where the fix had not been merged yet:
ssh pluto
rr replay /data/results/1690831257/TBR-2024/1/rr/latest-trace/
break std::sort<buf_page_t**, log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> >
continue
break buf_page_write_complete
ignore 2 1000
continue
info breakpoints
Thread 1 hit Breakpoint 1, std::sort<buf_page_t**, log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > (__last=0x5581033f8720, __first=0x5581033f8670, __comp=...)
at /usr/include/c++/9/bits/stl_algo.h:4899
4899 std::__sort(__first, __last, __gnu_cxx::__ops::__iter_comp_iter(__comp));
(rr) break buf_page_write_complete
Breakpoint 2 at 0x5581000b83d1: buf_page_write_complete. (2 locations)
(rr) ign 2 10000
Will ignore next 10000 crossings of breakpoint 2.
(rr) continue
Continuing.
Thread 1 received signal SIGSEGV, Segmentation fault.
0x00005581006f1b82 in std::__atomic_base<unsigned long>::load (__m=std::memory_order_seq_cst, this=0xd1) at /usr/include/c++/9/bits/atomic_base.h:413
413 load(memory_order __m = memory_order_seq_cst) const noexcept
(rr) info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y 0x00005581006f4a55 in std::sort<buf_page_t**, log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > at /usr/include/c++/9/bits/stl_algo.h:1962
breakpoint already hit 1 time
2 breakpoint keep y <MULTIPLE>
breakpoint already hit 22 times
ignore next 9978 hits
2.1 y 0x00005581000b83d1 in buf_page_write_complete(IORequest const&) at /data/Server/bb-11.2-MDEV-14795_1E/storage/innobase/buf/buf0flu.cc:337
2.2 y 0x00005581007cf270 in buf_page_write_complete(IORequest const&) at /data/Server/bb-11.2-MDEV-14795_1E/storage/innobase/buf/buf0flu.cc:321
I observed one page for which oldest_modification_ was modified from 56630890 to 1 while the std::sort() was in progress.
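This observation fits the data race theory: std::sort() requires a comparator that is a strict weak ordering over keys that do not change during the call, and a key that is rewritten mid-sort violates that. The pushed fix avoids the race by making sure that no page writes are in progress. Purely as an illustration of the hazard, and not as a description of that fix, here is a minimal sketch of a defensive alternative: snapshot every key once, then sort the snapshot. Page, its oldest_modification member, KeyedPage and sorted_by_snapshot() are hypothetical names, and the descending order is an assumption.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a buffer pool page; only the sort key is modelled.
struct Page {
  std::atomic<std::uint64_t> oldest_modification{0};
};

using KeyedPage = std::pair<std::uint64_t, Page*>;

// Return the pages ordered by descending key. Each key is read exactly once,
// before sorting, so the comparator only ever sees immutable values even if
// another thread updates Page::oldest_modification in the meantime.
inline std::vector<Page*> sorted_by_snapshot(const std::vector<Page*> &pages) {
  std::vector<KeyedPage> keyed;
  keyed.reserve(pages.size());
  for (Page *page : pages) {
    assert(page != nullptr);
    keyed.emplace_back(
        page->oldest_modification.load(std::memory_order_acquire), page);
  }

  // The comparator works on the snapshot only: a consistent ordering.
  std::sort(keyed.begin(), keyed.end(),
            [](const KeyedPage &a, const KeyedPage &b) {
              return a.first > b.first;
            });

  std::vector<Page*> result;
  result.reserve(keyed.size());
  for (const KeyedPage &entry : keyed)
    result.push_back(entry.second);
  return result;
}
```

The trade-off is that the resulting order reflects the keys at snapshot time and may already be stale when the sort returns, which is why excluding concurrent writers, as the wait loop earlier in this thread attempts to do, is the more direct remedy.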