Hello,
Last night right after midnight one of my servers seemed to hang. All (or at least a great many?) queries would hang forever in the Execute phase. Of course this hung up the entire Galera cluster as well.
I was awakened by an alert around 12:40am. My innodb_fatal_semaphore_wait_threshold is set to 32 seconds and the hang lasted far longer than that, so evidently it was not caught by that watchdog.
At 12:49 I was able to send MariaDB a signal which caused it to crash and dump core. So I do have a stack trace for this situation which I don't want to post publicly, but which is available upon request. I also have SHOW PROCESSLIST logs for much of the time in case that helps.
If a MariaDB expert could take a look at the situation I would appreciate it! Thanks.
- duplicates MDEV-32371: Deadlock between buf_page_get_zip() and buf_pool_t::corrupted_evict() on InnoDB ROW_FORMAT=COMPRESSED table corruption (Closed)
I see that dict_sys.freeze() invokes srw_lock::rd_lock(), which in the futex-based implementation would temporarily escalate the lock in rd_wait(). These waits are not covered by the watchdog that is implemented in dict_sys_t::lock_wait(). It is not trivial to change this to use the watchdog, because we have multiple different implementations here: the futex-based one on Linux and various BSDs (MDEV-26476), SRWLOCK on Microsoft Windows, and something that is wrapped by rw_lock_t on other platforms. An attempt to implement the watchdog for dict_sys.freeze() could reduce performance in some workloads. Besides, for this particular hang it would not help at all, because none of the threads are waiting for dict_sys.latch.
When it comes to the root cause of this corruption, I am a bit puzzled. Based on MDEV-32115 it seems that the default wsrep_sst_method=rsync should work reliably on 10.5 and later releases. Perhaps this would better be explained by MDEV-32174, which is a bug in ROLLBACK on ROW_FORMAT=COMPRESSED tables. But I do not yet see how it could make individual pages look corrupted when they are being read into the buffer pool.
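To illustrate what "covering a shared-latch wait with a watchdog" means in general, here is a minimal sketch using only the C++ standard library. It is not InnoDB code and does not use srw_lock, rd_wait() or dict_sys_t::lock_wait(); the names rd_lock_with_watchdog and reader_times_out are hypothetical, and std::shared_timed_mutex merely stands in for a read-write latch. It also says nothing about the performance cost mentioned above, and, as noted, a watchdog of this kind would not have caught this particular hang.

```cpp
#include <cassert>
#include <chrono>
#include <shared_mutex>
#include <thread>

// Hypothetical helper (not MariaDB code): wait for a shared latch in short
// slices and report failure once the total wait exceeds `threshold`.
// Returns true if the shared lock was acquired, false on timeout.
bool rd_lock_with_watchdog(std::shared_timed_mutex &latch,
                           std::chrono::milliseconds threshold,
                           std::chrono::milliseconds slice =
                               std::chrono::milliseconds(10))
{
  const auto deadline = std::chrono::steady_clock::now() + threshold;
  // try_lock_shared_for() may fail spuriously, so keep retrying until
  // either the lock is granted or the deadline has passed.
  while (!latch.try_lock_shared_for(slice)) {
    if (std::chrono::steady_clock::now() >= deadline)
      return false;  // a real watchdog would log diagnostics or abort here
  }
  return true;
}

// Hypothetical demonstration: the calling thread holds the latch exclusively
// for `hold_time` while a reader thread waits with the given threshold.
// Returns true if the reader gave up before getting the latch.
bool reader_times_out(std::chrono::milliseconds hold_time,
                      std::chrono::milliseconds threshold)
{
  std::shared_timed_mutex latch;
  bool acquired = false;

  latch.lock();  // exclusive holder, simulating a long-running writer
  std::thread reader([&] {
    acquired = rd_lock_with_watchdog(latch, threshold);
    if (acquired)
      latch.unlock_shared();
  });
  std::this_thread::sleep_for(hold_time);
  latch.unlock();
  reader.join();  // join() makes the write to `acquired` visible here
  return !acquired;
}
```

With a 300 ms exclusive hold and a 50 ms threshold the reader gives up, which is the situation a watchdog is meant to detect; with a 20 ms hold and a 2 s threshold the reader simply gets the latch.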