Type:
Task
Priority:
Major
Resolution:
Fixed
Affects Version/s:
None
Problem
In MXS-5187, a SIGABRT was sent, potentially due to a systemd watchdog timeout. The stack traces were a bit suspicious, so it is theoretically possible that something else caused it.
To be more certain of the source of a SIGABRT, MaxScale should determine whether the signal was caused by a watchdog timeout or by something else.
Solutions
Scan systemd journal for messages
This solution would answer the question with 100% certainty. The problem is that, due to the issues described in MXS-5196, the journal messages cannot be read unless MaxScale is a member of the systemd-journal group. The group membership could be added, but since MaxScale usually does not need it (maxlog=1 is the default), it is for the time being better left to the end user to choose whether to allow MaxScale to read the journal.
Track when the last watchdog notification was sent
MaxScale knows both the notification interval and the time when the last notification was sent. By logging this information in the signal handler, we would be able to tell with high likelihood whether the SIGABRT was due to a watchdog timeout, simply by comparing the time since the last notification with how often notifications should be sent. Since notifications are sent twice as often as needed, the difference in times should be very obvious.
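A minimal sketch of this approach, not MaxScale's actual implementation: all function and variable names below are hypothetical, and the notification interval is assumed to be half of the systemd watchdog timeout, as described above. The handler only uses calls that POSIX lists as async-signal-safe (`clock_gettime`, `write`, `signal`, `raise`) and formats numbers by hand instead of calling `snprintf`.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <ctime>
#include <unistd.h>

// All names here are hypothetical; this is not MaxScale's actual code.

// Monotonic time (seconds) of the last watchdog notification, -1 if none was sent.
// Assumes std::atomic<int64_t> is lock-free, so it is safe to read in a handler.
std::atomic<int64_t> g_last_notify_s{-1};
// Seconds between notifications; assumed to be half of the systemd timeout.
std::atomic<int64_t> g_notify_interval_s{0};

int64_t monotonic_seconds()
{
    timespec ts{};
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec;
}

void set_notify_interval(int64_t interval_s)
{
    assert(interval_s > 0);
    g_notify_interval_s.store(interval_s);
}

// Call this right after each watchdog notification is sent to systemd.
void record_watchdog_notification()
{
    g_last_notify_s.store(monotonic_seconds());
}

// True if at least the full timeout (2 x interval) has passed since the last notification.
bool looks_like_watchdog_timeout(int64_t now_s, int64_t last_s, int64_t interval_s)
{
    return interval_s > 0 && last_s >= 0 && now_s - last_s >= 2 * interval_s;
}

// Writes a non-negative integer as decimal digits into out (no terminator)
// and returns the number of characters written. out must hold 20 bytes.
size_t format_number(int64_t value, char* out)
{
    if (value < 0)
    {
        value = 0;
    }
    char tmp[20];
    size_t n = 0;
    do
    {
        tmp[n++] = static_cast<char>('0' + value % 10);
        value /= 10;
    }
    while (value > 0);
    for (size_t i = 0; i < n; ++i)
    {
        out[i] = tmp[n - 1 - i];
    }
    return n;
}

void write_text(const char* text, size_t len)
{
    ssize_t rc = write(STDERR_FILENO, text, len);
    (void)rc;   // Nothing useful can be done about a failed write here.
}

extern "C" void sigabrt_handler(int)
{
    const int64_t now = monotonic_seconds();
    const int64_t last = g_last_notify_s.load();
    const int64_t interval = g_notify_interval_s.load();
    char num[20];

    const char* prefix = "SIGABRT: seconds since last watchdog notification: ";
    write_text(prefix, strlen(prefix));
    write_text(num, format_number(last >= 0 ? now - last : 0, num));

    const char* verdict = looks_like_watchdog_timeout(now, last, interval) ?
        " (likely a watchdog timeout)\n" : " (probably not a watchdog timeout)\n";
    write_text(verdict, strlen(verdict));

    // Restore the default action and re-raise so that the process still aborts.
    signal(SIGABRT, SIG_DFL);
    raise(SIGABRT);
}
```

The handler would be installed with `std::signal(SIGABRT, sigabrt_handler)` or `sigaction`. With a 30 second interval, a healthy process should never show much more than 30 seconds since the last notification, whereas a watchdog timeout should show 60 seconds or more.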
markus makela
made changes -
2024-08-05 11:20
Field
Original Value
New Value
Assignee
markus makela [ markus.makela ]
markus makela
made changes -
2024-08-05 12:03
Summary
Scan systemd journal for watchdog timeout for SIGABRT
Scan systemd journal on SIGABRT for watchdog timeout
markus makela
made changes -
2024-08-09 06:35
Description
In MXS-5187 a SIGABRT was sent
In MXS-5187 a SIGABRT was sent due to a potential systemd watchdog timeout. The stacktraces were a bit suspicious so it's theoretically possible for this to be something else.
To be more certain of the source of a SIGABRT, MaxScale should somehow figure out if the SIGABRT relates to a watchdog timeout or something else.
markus makela
made changes -
2024-08-09 06:35
Summary
Scan systemd journal on SIGABRT for watchdog timeout
Detect if SIGABRT is due to a watchdog timeout
markus makela
made changes -
2024-08-09 06:39
Description
markus makela
made changes -
2024-08-09 06:39
Fix Version/s
21.06
[ 26119
]
markus makela
made changes -
2024-08-09 06:54
Status
Open
[ 1
]
In Progress
[ 3
]
markus makela
made changes -
2024-08-09 08:00
Status
In Progress
[ 3
]
In Review
[ 10002
]
markus makela
made changes -
2024-08-28 06:09
Component/s
Core
[ 11600
]
Fix Version/s
21.06.17
[ 29842
]
Fix Version/s
22.08.14
[ 29843
]
Fix Version/s
23.02.11
[ 29844
]
Fix Version/s
23.08.7
[ 29845
]
Fix Version/s
24.02.3
[ 29846
]
Fix Version/s
24.08.1
[ 29917
]
Fix Version/s
21.06
[ 26119
]
Resolution
Fixed
[ 1
]
Status
In Review
[ 10002
]
Closed
[ 6
]