Type:
Bug
Priority:
Critical
Resolution:
Fixed
Affects Version/s:
10.6 , 10.9(EOL) , 10.10(EOL) , 10.11 , 11.0(EOL) , 11.1(EOL) , 11.2(EOL)
This failure, which I got while testing MDEV-31350 , might be possible in 10.6 as well; I did not try it yet. I do not think that the one-line patch of MDEV-31350 plays any role here. The original error messages from the test run were:
10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
innodb.recovery_memory 'innodb,release' w49 [ 54 fail ]
Test ended at 2023-05-26 11:29:12
CURRENT_TEST: innodb.recovery_memory
mysqltest: At line 42: query 'CREATE TABLE t1(f1 INT NOT NULL)ENGINE=InnoDB' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
The server error log says:
10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
2023-05-26 11:29:11 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=293063978
2023-05-26 11:29:11 0 [Note] InnoDB: Multi-batch recovery needed at LSN 293582048
2023-05-26 11:29:11 0 [Note] InnoDB: End of log at LSN=294552683
2023-05-26 11:29:11 0 [Note] InnoDB: To recover: LSN 293785020/294552683; 347 pages
2023-05-26 11:29:11 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
2023-05-26 11:29:11 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=216, page number=4]
When I tried start recover a copy of the data directory (in the attached data.tar.xz ) using a larger buffer pool (and a single recovery batch), I got a different error instead:
10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
2023-05-26 11:40:49 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=293063978
2023-05-26 11:40:49 0 [Note] InnoDB: End of log at LSN=294552683
2023-05-26 11:40:49 0 [Note] InnoDB: To recover: 151 pages
2023-05-26 11:40:49 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=216, page number=4]
2023-05-26 11:40:49 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
2023-05-26 11:40:49 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
2023-05-26 11:40:49 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=216, page number=4]
The server would subsequently hang in the main recovery thread:
10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
#8 os_aio_wait_until_no_pending_reads (declare=<optimized out>) at /mariadb/11/storage/innobase/os/os0file.cc:3695
#9 0x00005620d562ba59 in recv_sys_t::wait_for_pool (this=this@entry=0x5620d594e500 <recv_sys>, pages=pages@entry=0) at /mariadb/11/storage/innobase/log/log0recv.cc:2047
I suppose that the real reason for the hang is the infinite loop in the thread that reported the corruption:
10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
#1 LF_BACKOFF () at /mariadb/11/include/my_cpu.h:125
#2 buf_pool_t::corrupted_evict (...) at /mariadb/11/storage/innobase/buf/buf0lru.cc:1247
#3 0x00005620d562662d in recv_recover_page (...) at /mariadb/11/storage/innobase/log/log0recv.cc:3290
#4 0x00005620d56267f4 in IORequest::fake_read_complete (this=0x7f6f4c331308, offset=94699053504392) at /mariadb/11/storage/innobase/log/log0recv.cc:3467
#5 0x00005620d5640a97 in fake_io_callback (c=0x7f6f4c331290) at /mariadb/11/storage/innobase/os/os0file.cc:3434
In the data directory, page 4 of the file data/test/t1.ibd is filled with NUL bytes, so the page should have been recoverable by applying the log records. In the ib_logfile0 there are two checkpoint blocks present: at offset 0x2000 for the latest checkpoint LSN 293063978, and at 0x1000 for an earlier checkpoint LSN 289089329. If I corrupt the latest checkpoint:
dd if=/dev/zero bs=4096 seek=2 count=1 conv=notrunc of=data/ib_logfile0
sql/mariadbd --datadir "$(pwd)/data"
the recovery will start from the earlier checkpoint LSN and fail in the same way:
10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
2023-05-26 12:07:18 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=289089329
2023-05-26 12:07:18 0 [Note] InnoDB: End of log at LSN=294552683
2023-05-26 12:07:18 0 [Note] InnoDB: To recover: 217 pages
2023-05-26 12:07:18 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=216, page number=4]
2023-05-26 12:07:18 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
2023-05-26 12:07:18 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
2023-05-26 12:07:18 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=216, page number=4]
This would seem to suggest that the problem is not a race condition around writing log checkpoints but instead something that MDEV-18976 should be able to catch. It could be worthwhile to enable that logic in a non-debug build:
diff --git a/storage/innobase/mtr/mtr0mtr.cc b/storage/innobase/mtr/mtr0mtr.cc
index fb08511126a..de74c5658f3 100644
--- a/storage/innobase/mtr/mtr0mtr.cc
+++ b/storage/innobase/mtr/mtr0mtr.cc
@@ -1020,7 +1020,7 @@ std::pair<lsn_t,mtr_t::page_flush_ahead> mtr_t::do_write()
ut_ad(!m_latch_ex || log_sys.latch.is_write_locked());
#endif
-#ifndef DBUG_OFF
+#if 1
do
{
if (m_log_mode != MTR_LOG_ALL)
What needs to be fixed in this ticket is the InnoDB hang after corruption was reported.
Transition
Time In Source Status
Execution Times
Open
In Progress
4h 51m
1
In Progress
Closed
6d 17h 12m
1
{"report":{"fcp":1172,"ttfb":265.0999994277954,"pageVisibility":"visible","entityId":122291,"key":"jira.project.issue.view-issue","isInitial":true,"threshold":1000,"elementTimings":{},"userDeviceMemory":8,"userDeviceProcessors":64,"apdex":0.5,"journeyId":"129d983c-1823-417e-acd0-1cc800d6fdfc","navigationType":0,"readyForUser":1318.5,"redirectCount":0,"resourceLoadedEnd":1604.1999998092651,"resourceLoadedStart":271,"resourceTiming":[{"duration":337.0999994277954,"initiatorType":"link","name":"https://jira.mariadb.org/s/2c21342762a6a02add1c328bed317ffd-CDN/lu2cib/820016/12ta74/0a8bac35585be7fc6c9cc5a0464cd4cf/_/download/contextbatch/css/_super/batch.css","startTime":271,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":271,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":608.0999994277954,"responseStart":0,"secureConnectionStart":0},{"duration":337,"initiatorType":"link","name":"https://jira.mariadb.org/s/7ebd35e77e471bc30ff0eba799ebc151-CDN/lu2cib/820016/12ta74/494e4c556ecbb29f90a3d3b4f09cb99c/_/download/contextbatch/css/jira.browse.project,project.issue.navigator,jira.view.issue,jira.general,jira.global,atl.general,-_super/batch.css?agile_global_admin_condition=true&jag=true&jira.create.linked.issue=true&slack-enabled=true&whisper-enabled=true","startTime":271.3999996185303,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":271.3999996185303,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":608.3999996185303,"responseStart":0,"secureConnectionStart":0},{"duration":346.3999996185303,"initiatorType":"script","name":"https://jira.mariadb.org/s/0917945aaa57108d00c5076fea35e069-CDN/lu2cib/820016/12ta74/0a8bac35585be7fc6c9cc5a0464cd4cf/_/download/contextbatch/js/_super/batch.js?locale=en","startTime":271.5,"connectEnd":271.5,"connectStart":271.5,"domainLookupEnd":271.5,"domainLookupStart":271.5,"fetchStart":271.5,"redirectEnd":0,"redirectStart":0,"requestStart":271.5,"responseEnd":617.8999996185303,"responseStart":617.8999996185303,"secureConnectionStart":271.5},{"duration":388.80000019073486,"initiatorType":"script","name":"https://jira.mariadb.org/s/2d8175ec2fa4c816e8023260bd8c1786-CDN/lu2cib/820016/12ta74/494e4c556ecbb29f90a3d3b4f09cb99c/_/download/contextbatch/js/jira.browse.project,project.issue.navigator,jira.view.issue,jira.general,jira.global,atl.general,-_super/batch.js?agile_global_admin_condition=true&jag=true&jira.create.linked.issue=true&locale=en&slack-enabled=true&whisper-enabled=true","startTime":271.69999980926514,"connectEnd":271.69999980926514,"connectStart":271.69999980926514,"domainLookupEnd":271.69999980926514,"domainLookupStart":271.69999980926514,"fetchStart":271.69999980926514,"redirectEnd":0,"redirectStart":0,"requestStart":271.69999980926514,"responseEnd":660.5,"responseStart":660.5,"secureConnectionStart":271.69999980926514},{"duration":392.69999980926514,"initiatorType":"script","name":"https://jira.mariadb.org/s/a9324d6758d385eb45c462685ad88f1d-CDN/lu2cib/820016/12ta74/c92c0caa9a024ae85b0ebdbed7fb4bd7/_/download/contextbatch/js/atl.global,-_super/batch.js?locale=en","startTime":272,"connectEnd":272,"connectStart":272,"domainLookupEnd":272,"domainLookupStart":272,"fetchStart":272,"redirectEnd":0,"redirectStart":0,"requestStart":272,"responseEnd":664.6999998092651,"responseStart":664.6999998092651,"secureConnectionStart":272},{"duration":393,"initiatorType":"script","name":"https://jira.mariadb.org/s/d41d8cd98f00b204e9800998ecf8427e-CDN/lu2cib/820016/12ta74/1.0/_/download/batch/jira.webresources:calendar-en/jira.webresources:calendar-en.js","startTime":272.19999980926514,"connectEnd":272.19999980926514,"connectStart":272.19999980926514,"domainLookupEnd":272.19999980926514,"domainLookupStart":272.19999980926514,"fetchStart":272.19999980926514,"redirectEnd":0,"redirectStart":0,"requestStart":272.19999980926514,"responseEnd":665.1999998092651,"responseStart":665.1999998092651,"secureConnectionStart":272.19999980926514},{"duration":393.19999980926514,"initiatorType":"script","name":"https://jira.mariadb.org/s/d41d8cd98f00b204e9800998ecf8427e-CDN/lu2cib/820016/12ta74/1.0/_/download/batch/jira.webresources:calendar-localisation-moment/jira.webresources:calendar-localisation-moment.js","startTime":272.30000019073486,"connectEnd":272.30000019073486,"connectStart":272.30000019073486,"domainLookupEnd":272.30000019073486,"domainLookupStart":272.30000019073486,"fetchStart":272.30000019073486,"redirectEnd":0,"redirectStart":0,"requestStart":272.30000019073486,"responseEnd":665.5,"responseStart":665.5,"secureConnectionStart":272.30000019073486},{"duration":498.69999980926514,"initiatorType":"link","name":"https://jira.mariadb.org/s/b04b06a02d1959df322d9cded3aeecc1-CDN/lu2cib/820016/12ta74/a2ff6aa845ffc9a1d22fe23d9ee791fc/_/download/contextbatch/css/jira.global.look-and-feel,-_super/batch.css","startTime":272.5,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":272.5,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":771.1999998092651,"responseStart":0,"secureConnectionStart":0},{"duration":393.5,"initiatorType":"script","name":"https://jira.mariadb.org/rest/api/1.0/shortcuts/820016/47140b6e0a9bc2e4913da06536125810/shortcuts.js?context=issuenavigation&context=issueaction","startTime":272.80000019073486,"connectEnd":272.80000019073486,"connectStart":272.80000019073486,"domainLookupEnd":272.80000019073486,"domainLookupStart":272.80000019073486,"fetchStart":272.80000019073486,"redirectEnd":0,"redirectStart":0,"requestStart":272.80000019073486,"responseEnd":666.3000001907349,"responseStart":666.3000001907349,"secureConnectionStart":272.80000019073486},{"duration":498.20000076293945,"initiatorType":"link","name":"https://jira.mariadb.org/s/3ac36323ba5e4eb0af2aa7ac7211b4bb-CDN/lu2cib/820016/12ta74/d176f0986478cc64f24226b3d20c140d/_/download/contextbatch/css/com.atlassian.jira.projects.sidebar.init,-_super,-project.issue.navigator,-jira.view.issue/batch.css?jira.create.linked.issue=true","startTime":273.0999994277954,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":273.0999994277954,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":771.3000001907349,"responseStart":0,"secureConnectionStart":0},{"duration":393.5999994277954,"initiatorType":"script","name":"https://jira.mariadb.org/s/5d5e8fe91fbc506585e83ea3b62ccc4b-CDN/lu2cib/820016/12ta74/d176f0986478cc64f24226b3d20c140d/_/download/contextbatch/js/com.atlassian.jira.projects.sidebar.init,-_super,-project.issue.navigator,-jira.view.issue/batch.js?jira.create.linked.issue=true&locale=en","startTime":273.30000019073486,"connectEnd":273.30000019073486,"connectStart":273.30000019073486,"domainLookupEnd":273.30000019073486,"domainLookupStart":273.30000019073486,"fetchStart":273.30000019073486,"redirectEnd":0,"redirectStart":0,"requestStart":273.30000019073486,"responseEnd":666.8999996185303,"responseStart":666.8999996185303,"secureConnectionStart":273.30000019073486},{"duration":595.0999994277954,"initiatorType":"script","name":"https://jira.mariadb.org/s/d41d8cd98f00b204e9800998ecf8427e-CDN/lu2cib/820016/12ta74/1.0/_/download/batch/jira.webresources:bigpipe-js/jira.webresources:bigpipe-js.js","startTime":279.30000019073486,"connectEnd":279.30000019073486,"connectStart":279.30000019073486,"domainLookupEnd":279.30000019073486,"domainLookupStart":279.30000019073486,"fetchStart":279.30000019073486,"redirectEnd":0,"redirectStart":0,"requestStart":279.30000019073486,"responseEnd":874.3999996185303,"responseStart":874.3999996185303,"secureConnectionStart":279.30000019073486},{"duration":1324.8000001907349,"initiatorType":"script","name":"https://jira.mariadb.org/s/d41d8cd98f00b204e9800998ecf8427e-CDN/lu2cib/820016/12ta74/1.0/_/download/batch/jira.webresources:bigpipe-init/jira.webresources:bigpipe-init.js","startTime":279.3999996185303,"connectEnd":279.3999996185303,"connectStart":279.3999996185303,"domainLookupEnd":279.3999996185303,"domainLookupStart":279.3999996185303,"fetchStart":279.3999996185303,"redirectEnd":0,"redirectStart":0,"requestStart":279.3999996185303,"responseEnd":1604.1999998092651,"responseStart":1604.1999998092651,"secureConnectionStart":279.3999996185303},{"duration":132.5,"initiatorType":"xmlhttprequest","name":"https://jira.mariadb.org/rest/webResources/1.0/resources","startTime":807.1999998092651,"connectEnd":807.1999998092651,"connectStart":807.1999998092651,"domainLookupEnd":807.1999998092651,"domainLookupStart":807.1999998092651,"fetchStart":807.1999998092651,"redirectEnd":0,"redirectStart":0,"requestStart":807.1999998092651,"responseEnd":939.6999998092651,"responseStart":939.6999998092651,"secureConnectionStart":807.1999998092651},{"duration":532,"initiatorType":"script","name":"https://jira.mariadb.org/s/d41d8cd98f00b204e9800998ecf8427e-CDN/lu2cib/820016/12ta74/e65b778d185daf5aee24936755b43da6/_/download/contextbatch/js/browser-metrics-plugin.contrib,-_super,-atl.general/batch.js?agile_global_admin_condition=true&jag=true&slack-enabled=true&whisper-enabled=true","startTime":1111.1999998092651,"connectEnd":1111.1999998092651,"connectStart":1111.1999998092651,"domainLookupEnd":1111.1999998092651,"domainLookupStart":1111.1999998092651,"fetchStart":1111.1999998092651,"redirectEnd":0,"redirectStart":0,"requestStart":1111.1999998092651,"responseEnd":1643.1999998092651,"responseStart":1643.1999998092651,"secureConnectionStart":1111.1999998092651},{"duration":526.7999992370605,"initiatorType":"script","name":"https://www.google-analytics.com/analytics.js","startTime":1163.8000001907349,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":1163.8000001907349,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":1690.5999994277954,"responseStart":0,"secureConnectionStart":0}],"fetchStart":0,"domainLookupStart":0,"domainLookupEnd":0,"connectStart":0,"connectEnd":0,"requestStart":59,"responseStart":265,"responseEnd":276,"domLoading":269,"domInteractive":1624,"domContentLoadedEventStart":1624,"domContentLoadedEventEnd":1681,"domComplete":2645,"loadEventStart":2645,"loadEventEnd":2645,"userAgent":"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)","marks":[{"name":"bigPipe.sidebar-id.start","time":1606.3000001907349},{"name":"bigPipe.sidebar-id.end","time":1607.0999994277954},{"name":"bigPipe.activity-panel-pipe-id.start","time":1607.3000001907349},{"name":"bigPipe.activity-panel-pipe-id.end","time":1608.3999996185303},{"name":"activityTabFullyLoaded","time":1701.3999996185303}],"measures":[],"correlationId":"2822b203c57ae6","effectiveType":"4g","downlink":9.9,"rtt":0,"serverDuration":121,"dbReadsTimeInMs":24,"dbConnsTimeInMs":36,"applicationHash":"9d11dbea5f4be3d4cc21f03a88dd11d8c8687422","experiments":[]}}