Type:
Bug
Priority:
Critical
Resolution:
Fixed
Affects Version/s:
10.6 , 10.9(EOL) , 10.10(EOL) , 10.11 , 11.0(EOL) , 11.1(EOL) , 11.2(EOL)
This failure, which I got while testing MDEV-31350 , might be possible in 10.6 as well; I did not try it yet. I do not think that the one-line patch of MDEV-31350 plays any role here. The original error messages from the test run were:
10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
innodb.recovery_memory 'innodb,release' w49 [ 54 fail ]
Test ended at 2023-05-26 11:29:12
CURRENT_TEST: innodb.recovery_memory
mysqltest: At line 42: query 'CREATE TABLE t1(f1 INT NOT NULL)ENGINE=InnoDB' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
The server error log says:
10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
2023-05-26 11:29:11 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=293063978
2023-05-26 11:29:11 0 [Note] InnoDB: Multi-batch recovery needed at LSN 293582048
2023-05-26 11:29:11 0 [Note] InnoDB: End of log at LSN=294552683
2023-05-26 11:29:11 0 [Note] InnoDB: To recover: LSN 293785020/294552683; 347 pages
2023-05-26 11:29:11 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
2023-05-26 11:29:11 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=216, page number=4]
When I tried start recover a copy of the data directory (in the attached data.tar.xz ) using a larger buffer pool (and a single recovery batch), I got a different error instead:
10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
2023-05-26 11:40:49 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=293063978
2023-05-26 11:40:49 0 [Note] InnoDB: End of log at LSN=294552683
2023-05-26 11:40:49 0 [Note] InnoDB: To recover: 151 pages
2023-05-26 11:40:49 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=216, page number=4]
2023-05-26 11:40:49 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
2023-05-26 11:40:49 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
2023-05-26 11:40:49 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=216, page number=4]
The server would subsequently hang in the main recovery thread:
10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
#8 os_aio_wait_until_no_pending_reads (declare=<optimized out>) at /mariadb/11/storage/innobase/os/os0file.cc:3695
#9 0x00005620d562ba59 in recv_sys_t::wait_for_pool (this=this@entry=0x5620d594e500 <recv_sys>, pages=pages@entry=0) at /mariadb/11/storage/innobase/log/log0recv.cc:2047
I suppose that the real reason for the hang is the infinite loop in the thread that reported the corruption:
10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
#1 LF_BACKOFF () at /mariadb/11/include/my_cpu.h:125
#2 buf_pool_t::corrupted_evict (...) at /mariadb/11/storage/innobase/buf/buf0lru.cc:1247
#3 0x00005620d562662d in recv_recover_page (...) at /mariadb/11/storage/innobase/log/log0recv.cc:3290
#4 0x00005620d56267f4 in IORequest::fake_read_complete (this=0x7f6f4c331308, offset=94699053504392) at /mariadb/11/storage/innobase/log/log0recv.cc:3467
#5 0x00005620d5640a97 in fake_io_callback (c=0x7f6f4c331290) at /mariadb/11/storage/innobase/os/os0file.cc:3434
In the data directory, page 4 of the file data/test/t1.ibd is filled with NUL bytes, so the page should have been recoverable by applying the log records. In the ib_logfile0 there are two checkpoint blocks present: at offset 0x2000 for the latest checkpoint LSN 293063978, and at 0x1000 for an earlier checkpoint LSN 289089329. If I corrupt the latest checkpoint:
dd if=/dev/zero bs=4096 seek=2 count=1 conv=notrunc of=data/ib_logfile0
sql/mariadbd --datadir "$(pwd)/data"
the recovery will start from the earlier checkpoint LSN and fail in the same way:
10.9 44c9008ba65686abf1c82c9166255a8c52d61f74
2023-05-26 12:07:18 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=289089329
2023-05-26 12:07:18 0 [Note] InnoDB: End of log at LSN=294552683
2023-05-26 12:07:18 0 [Note] InnoDB: To recover: 217 pages
2023-05-26 12:07:18 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=216, page number=4]
2023-05-26 12:07:18 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
2023-05-26 12:07:18 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
2023-05-26 12:07:18 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=216, page number=4]
This would seem to suggest that the problem is not a race condition around writing log checkpoints but instead something that MDEV-18976 should be able to catch. It could be worthwhile to enable that logic in a non-debug build:
diff --git a/storage/innobase/mtr/mtr0mtr.cc b/storage/innobase/mtr/mtr0mtr.cc
index fb08511126a..de74c5658f3 100644
--- a/storage/innobase/mtr/mtr0mtr.cc
+++ b/storage/innobase/mtr/mtr0mtr.cc
@@ -1020,7 +1020,7 @@ std::pair<lsn_t,mtr_t::page_flush_ahead> mtr_t::do_write()
ut_ad(!m_latch_ex || log_sys.latch.is_write_locked());
#endif
-#ifndef DBUG_OFF
+#if 1
do
{
if (m_log_mode != MTR_LOG_ALL)
What needs to be fixed in this ticket is the InnoDB hang after corruption was reported.
Marko Mäkelä
made changes -
2023-05-26 13:05
Fix Version/s
10.6
[ 24028
]
Affects Version/s
10.6
[ 24028
]
Labels
hang recovery
corruption hang recovery
Marko Mäkelä
made changes -
2023-05-26 14:04
Status
Open
[ 1
]
In Progress
[ 3
]
Marko Mäkelä
made changes -
2023-06-02 07:16
Fix Version/s
10.6.15
[ 29013
]
Fix Version/s
10.9.8
[ 29015
]
Fix Version/s
10.10.6
[ 29017
]
Fix Version/s
10.11.5
[ 29019
]
Fix Version/s
11.0.3
[ 28920
]
Fix Version/s
11.1.2
[ 28921
]
Fix Version/s
10.6
[ 24028
]
Fix Version/s
10.9
[ 26905
]
Fix Version/s
10.10
[ 27530
]
Fix Version/s
10.11
[ 27614
]
Fix Version/s
11.0
[ 28320
]
Fix Version/s
11.1
[ 28549
]
Fix Version/s
11.2
[ 28603
]
Resolution
Fixed
[ 1
]
Status
In Progress
[ 3
]
Closed
[ 6
]
{"report":{"fcp":1547.8999999761581,"ttfb":474.89999997615814,"pageVisibility":"visible","entityId":122291,"key":"jira.project.issue.view-issue","isInitial":true,"threshold":1000,"elementTimings":{},"userDeviceMemory":8,"userDeviceProcessors":64,"apdex":0.5,"journeyId":"b4097a91-87f5-4d89-a997-0b31c86efeba","navigationType":0,"readyForUser":1652.2000000476837,"redirectCount":0,"resourceLoadedEnd":2128.399999976158,"resourceLoadedStart":480.39999997615814,"resourceTiming":[{"duration":354,"initiatorType":"link","name":"https://jira.mariadb.org/s/2c21342762a6a02add1c328bed317ffd-CDN/lu2bu7/820016/12ta74/0a8bac35585be7fc6c9cc5a0464cd4cf/_/download/contextbatch/css/_super/batch.css","startTime":480.39999997615814,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":480.39999997615814,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":834.3999999761581,"responseStart":0,"secureConnectionStart":0},{"duration":354.2000000476837,"initiatorType":"link","name":"https://jira.mariadb.org/s/7ebd35e77e471bc30ff0eba799ebc151-CDN/lu2bu7/820016/12ta74/8679b4946efa1a0bb029a3a22206fb5d/_/download/contextbatch/css/jira.browse.project,project.issue.navigator,jira.view.issue,jira.general,jira.global,atl.general,-_super/batch.css?agile_global_admin_condition=true&jag=true&jira.create.linked.issue=true&slack-enabled=true","startTime":480.60000002384186,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":480.60000002384186,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":834.8000000715256,"responseStart":0,"secureConnectionStart":0},{"duration":410.60000002384186,"initiatorType":"script","name":"https://jira.mariadb.org/s/fbf975c0cce4b1abf04784eeae9ba1f4-CDN/lu2bu7/820016/12ta74/0a8bac35585be7fc6c9cc5a0464cd4cf/_/download/contextbatch/js/_super/batch.js?locale=en","startTime":480.89999997615814,"connectEnd":480.89999997615814,"connectStart":480.89999997615814,"domainLookupEnd":480.89999997615814,"domainLookupStart":480.89999997615814,"fetchStart":480.89999997615814,"redirectEnd":0,"redirectStart":0,"requestStart":480.89999997615814,"responseEnd":891.5,"responseStart":891.5,"secureConnectionStart":480.89999997615814},{"duration":753.1000000238419,"initiatorType":"script","name":"https://jira.mariadb.org/s/099b33461394b8015fc36c0a4b96e19f-CDN/lu2bu7/820016/12ta74/8679b4946efa1a0bb029a3a22206fb5d/_/download/contextbatch/js/jira.browse.project,project.issue.navigator,jira.view.issue,jira.general,jira.global,atl.general,-_super/batch.js?agile_global_admin_condition=true&jag=true&jira.create.linked.issue=true&locale=en&slack-enabled=true","startTime":481,"connectEnd":481,"connectStart":481,"domainLookupEnd":481,"domainLookupStart":481,"fetchStart":481,"redirectEnd":0,"redirectStart":0,"requestStart":481,"responseEnd":1234.1000000238419,"responseStart":1234.1000000238419,"secureConnectionStart":481},{"duration":756.8999999761581,"initiatorType":"script","name":"https://jira.mariadb.org/s/94c15bff32baef80f4096a08aceae8bc-CDN/lu2bu7/820016/12ta74/c92c0caa9a024ae85b0ebdbed7fb4bd7/_/download/contextbatch/js/atl.global,-_super/batch.js?locale=en","startTime":481.2000000476837,"connectEnd":481.2000000476837,"connectStart":481.2000000476837,"domainLookupEnd":481.2000000476837,"domainLookupStart":481.2000000476837,"fetchStart":481.2000000476837,"redirectEnd":0,"redirectStart":0,"requestStart":481.2000000476837,"responseEnd":1238.1000000238419,"responseStart":1238,"secureConnectionStart":481.2000000476837},{"duration":757.1000000238419,"initiatorType":"script","name":"https://jira.mariadb.org/s/d41d8cd98f00b204e9800998ecf8427e-CDN/lu2bu7/820016/12ta74/1.0/_/download/batch/jira.webresources:calendar-en/jira.webresources:calendar-en.js","startTime":481.39999997615814,"connectEnd":481.39999997615814,"connectStart":481.39999997615814,"domainLookupEnd":481.39999997615814,"domainLookupStart":481.39999997615814,"fetchStart":481.39999997615814,"redirectEnd":0,"redirectStart":0,"requestStart":481.39999997615814,"responseEnd":1238.5,"responseStart":1238.5,"secureConnectionStart":481.39999997615814},{"duration":757.2999999523163,"initiatorType":"script","name":"https://jira.mariadb.org/s/d41d8cd98f00b204e9800998ecf8427e-CDN/lu2bu7/820016/12ta74/1.0/_/download/batch/jira.webresources:calendar-localisation-moment/jira.webresources:calendar-localisation-moment.js","startTime":481.60000002384186,"connectEnd":481.60000002384186,"connectStart":481.60000002384186,"domainLookupEnd":481.60000002384186,"domainLookupStart":481.60000002384186,"fetchStart":481.60000002384186,"redirectEnd":0,"redirectStart":0,"requestStart":481.60000002384186,"responseEnd":1238.8999999761581,"responseStart":1238.8999999761581,"secureConnectionStart":481.60000002384186},{"duration":758.1000000238419,"initiatorType":"link","name":"https://jira.mariadb.org/s/b04b06a02d1959df322d9cded3aeecc1-CDN/lu2bu7/820016/12ta74/a2ff6aa845ffc9a1d22fe23d9ee791fc/_/download/contextbatch/css/jira.global.look-and-feel,-_super/batch.css","startTime":481.7000000476837,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":481.7000000476837,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":1239.8000000715256,"responseStart":0,"secureConnectionStart":0},{"duration":757.5,"initiatorType":"script","name":"https://jira.mariadb.org/rest/api/1.0/shortcuts/820016/47140b6e0a9bc2e4913da06536125810/shortcuts.js?context=issuenavigation&context=issueaction","startTime":481.89999997615814,"connectEnd":481.89999997615814,"connectStart":481.89999997615814,"domainLookupEnd":481.89999997615814,"domainLookupStart":481.89999997615814,"fetchStart":481.89999997615814,"redirectEnd":0,"redirectStart":0,"requestStart":481.89999997615814,"responseEnd":1239.3999999761581,"responseStart":1239.3999999761581,"secureConnectionStart":481.89999997615814},{"duration":758.6000000238419,"initiatorType":"link","name":"https://jira.mariadb.org/s/3ac36323ba5e4eb0af2aa7ac7211b4bb-CDN/lu2bu7/820016/12ta74/d176f0986478cc64f24226b3d20c140d/_/download/contextbatch/css/com.atlassian.jira.projects.sidebar.init,-_super,-project.issue.navigator,-jira.view.issue/batch.css?jira.create.linked.issue=true","startTime":482,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":482,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":1240.6000000238419,"responseStart":0,"secureConnectionStart":0},{"duration":758.6000000238419,"initiatorType":"script","name":"https://jira.mariadb.org/s/3339d87fa2538a859872f2df449bf8d0-CDN/lu2bu7/820016/12ta74/d176f0986478cc64f24226b3d20c140d/_/download/contextbatch/js/com.atlassian.jira.projects.sidebar.init,-_super,-project.issue.navigator,-jira.view.issue/batch.js?jira.create.linked.issue=true&locale=en","startTime":482.10000002384186,"connectEnd":482.10000002384186,"connectStart":482.10000002384186,"domainLookupEnd":482.10000002384186,"domainLookupStart":482.10000002384186,"fetchStart":482.10000002384186,"redirectEnd":0,"redirectStart":0,"requestStart":482.10000002384186,"responseEnd":1240.7000000476837,"responseStart":1240.7000000476837,"secureConnectionStart":482.10000002384186},{"duration":1074,"initiatorType":"script","name":"https://jira.mariadb.org/s/d41d8cd98f00b204e9800998ecf8427e-CDN/lu2bu7/820016/12ta74/1.0/_/download/batch/jira.webresources:bigpipe-js/jira.webresources:bigpipe-js.js","startTime":487.89999997615814,"connectEnd":487.89999997615814,"connectStart":487.89999997615814,"domainLookupEnd":487.89999997615814,"domainLookupStart":487.89999997615814,"fetchStart":487.89999997615814,"redirectEnd":0,"redirectStart":0,"requestStart":487.89999997615814,"responseEnd":1561.8999999761581,"responseStart":1561.8999999761581,"secureConnectionStart":487.89999997615814},{"duration":1640,"initiatorType":"script","name":"https://jira.mariadb.org/s/d41d8cd98f00b204e9800998ecf8427e-CDN/lu2bu7/820016/12ta74/1.0/_/download/batch/jira.webresources:bigpipe-init/jira.webresources:bigpipe-init.js","startTime":488.39999997615814,"connectEnd":488.39999997615814,"connectStart":488.39999997615814,"domainLookupEnd":488.39999997615814,"domainLookupStart":488.39999997615814,"fetchStart":488.39999997615814,"redirectEnd":0,"redirectStart":0,"requestStart":488.39999997615814,"responseEnd":2128.399999976158,"responseStart":2128.399999976158,"secureConnectionStart":488.39999997615814},{"duration":492.40000009536743,"initiatorType":"xmlhttprequest","name":"https://jira.mariadb.org/rest/webResources/1.0/resources","startTime":1039.3999999761581,"connectEnd":1039.3999999761581,"connectStart":1039.3999999761581,"domainLookupEnd":1039.3999999761581,"domainLookupStart":1039.3999999761581,"fetchStart":1039.3999999761581,"redirectEnd":0,"redirectStart":0,"requestStart":1039.3999999761581,"responseEnd":1531.8000000715256,"responseStart":1531.8000000715256,"secureConnectionStart":1039.3999999761581},{"duration":673,"initiatorType":"script","name":"https://www.google-analytics.com/analytics.js","startTime":1540.3999999761581,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":1540.3999999761581,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":2213.399999976158,"responseStart":0,"secureConnectionStart":0},{"duration":650.8999999761581,"initiatorType":"link","name":"https://jira.mariadb.org/s/d5715adaadd168a9002b108b2b039b50-CDN/lu2bu7/820016/12ta74/be4b45e9cec53099498fa61c8b7acba4/_/download/contextbatch/css/jira.project.sidebar,-_super,-project.issue.navigator,-jira.general,-jira.browse.project,-jira.view.issue,-jira.global,-atl.general,-com.atlassian.jira.projects.sidebar.init/batch.css?agile_global_admin_condition=true&jag=true&jira.create.linked.issue=true&slack-enabled=true","startTime":1562.1000000238419,"connectEnd":0,"connectStart":0,"domainLookupEnd":0,"domainLookupStart":0,"fetchStart":1562.1000000238419,"redirectEnd":0,"redirectStart":0,"requestStart":0,"responseEnd":2213,"responseStart":0,"secureConnectionStart":0},{"duration":648.2000000476837,"initiatorType":"script","name":"https://jira.mariadb.org/s/d41d8cd98f00b204e9800998ecf8427e-CDN/lu2bu7/820016/12ta74/e65b778d185daf5aee24936755b43da6/_/download/contextbatch/js/browser-metrics-plugin.contrib,-_super,-project.issue.navigator,-jira.view.issue,-atl.general/batch.js?agile_global_admin_condition=true&jag=true&jira.create.linked.issue=true&slack-enabled=true","startTime":1563,"connectEnd":1563,"connectStart":1563,"domainLookupEnd":1563,"domainLookupStart":1563,"fetchStart":1563,"redirectEnd":0,"redirectStart":0,"requestStart":1563,"responseEnd":2211.2000000476837,"responseStart":2211.2000000476837,"secureConnectionStart":1563},{"duration":654,"initiatorType":"script","name":"https://jira.mariadb.org/s/f51ef5507eea4c158f257c66c93b2a3f-CDN/lu2bu7/820016/12ta74/be4b45e9cec53099498fa61c8b7acba4/_/download/contextbatch/js/jira.project.sidebar,-_super,-project.issue.navigator,-jira.general,-jira.browse.project,-jira.view.issue,-jira.global,-atl.general,-com.atlassian.jira.projects.sidebar.init/batch.js?agile_global_admin_condition=true&jag=true&jira.create.linked.issue=true&locale=en&slack-enabled=true","startTime":1563.7000000476837,"connectEnd":1563.7000000476837,"connectStart":1563.7000000476837,"domainLookupEnd":1563.7000000476837,"domainLookupStart":1563.7000000476837,"fetchStart":1563.7000000476837,"redirectEnd":0,"redirectStart":0,"requestStart":1563.7000000476837,"responseEnd":2217.7000000476837,"responseStart":2217.7000000476837,"secureConnectionStart":1563.7000000476837}],"fetchStart":0,"domainLookupStart":0,"domainLookupEnd":0,"connectStart":0,"connectEnd":0,"requestStart":223,"responseStart":475,"responseEnd":488,"domLoading":479,"domInteractive":2163,"domContentLoadedEventStart":2164,"domContentLoadedEventEnd":2209,"domComplete":3071,"loadEventStart":3071,"loadEventEnd":3071,"userAgent":"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)","marks":[{"name":"bigPipe.sidebar-id.start","time":2141.7000000476837},{"name":"bigPipe.sidebar-id.end","time":2142.5},{"name":"bigPipe.activity-panel-pipe-id.start","time":2142.7000000476837},{"name":"bigPipe.activity-panel-pipe-id.end","time":2145.7000000476837},{"name":"activityTabFullyLoaded","time":2226.2000000476837}],"measures":[],"correlationId":"c34a8f2ae48a21","effectiveType":"4g","downlink":9.5,"rtt":0,"serverDuration":148,"dbReadsTimeInMs":26,"dbConnsTimeInMs":36,"applicationHash":"9d11dbea5f4be3d4cc21f03a88dd11d8c8687422","experiments":[]}}