We are experiencing technical difficulties with the latest MariaDB, 10.1.41-MariaDB.
This is happening on only one server, although we have more servers with the same system package versions.
The database freezes and does not accept new connections.
The error log shows a large amount of error data, e.g.:
InnoDB: Warning: a long semaphore wait:
--Thread 140300680931072 has waited at dict0dict.cc line 984 for 241.00 seconds the semaphore:
Mutex at 0x7f9e26c112e8 '&dict_sys->mutex', lock var 1
Last time reserved by thread 140300697716480 in file not yet reserved line 0, waiters flag 1
InnoDB: Warning: semaphore wait:
--Thread 140300680931072 has waited at dict0dict.cc line 984 for 241.00 seconds the semaphore:
Mutex at 0x7f9e26c112e8 '&dict_sys->mutex', lock var 1
Last time reserved by thread 140300697716480 in file not yet reserved line 0, waiters flag 1
We can provide more error log data, but not in public.
Matthias Leich
added a comment (edited) -
Results of RQG testing on bb-10.2-thiru commit 0b91f74906c8dcbcc1dac486fcc66c1e9c0c603a
- More than 1500 RQG tests were executed.
The fraction of failing tests was surprisingly low.
All asserts/crashes are already covered by open bugs in JIRA, except one:
- mysqld: sql/sql_list.h:684: void ilink::assert_linked(): Assertion `prev != 0 && next != 0' failed.
It happened during shutdown of the server.
- Per Thiru: it is unlikely that it is caused by the changes in bb-10.3-thiru.
- It occurred only once, so attempts to replay it on current 10.2 have too low a chance of success.
https://jira.mariadb.org/browse/MDEV-20843
Marko Mäkelä
added a comment - This is a welcome step in the right direction, but I think that this needs some more work.
First of all, in_queue should not be stored in a bit-field that is shared with other bit-fields that are protected by a different mutex.
I would suggest using bool, and documenting the possible state transitions carefully. We might consider using atomic memory access.
Second, in 10.1, fts_optimize_init() is not adding tables to the queue, while in 10.2 it is doing that. I’d like to see a 10.1 patch that does this. It should also avoid the unnecessary use of std::vector.
Third, fts_optimize_remove_table() should assert !table->fts->in_queue at the end.
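The first point can be illustrated with a minimal C++ sketch. In the C++11 memory model, adjacent bit-fields share one memory location, so updating two of them under different mutexes is a data race; a separate bool or an atomic flag has its own memory location. The names FtsStateSketch, other_flag_a, other_flag_b, mark_queued and mark_dequeued are hypothetical and are not taken from the InnoDB source; only in_queue comes from the comment above.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the per-table FTS state; not the real InnoDB struct.
struct FtsStateSketch {
    // Flags that are protected by some other mutex may share a bit-field.
    unsigned other_flag_a : 1;
    unsigned other_flag_b : 1;
    // in_queue lives in its own memory location, so writes to the
    // bit-fields above cannot race with writes to it.
    std::atomic<bool> in_queue;

    FtsStateSketch() : other_flag_a(0), other_flag_b(0), in_queue(false) {}
};

// State transition: not queued -> queued.
// Returns true only for the caller that actually performed the transition.
inline bool mark_queued(FtsStateSketch& s) {
    bool expected = false;
    return s.in_queue.compare_exchange_strong(expected, true);
}

// State transition: queued -> not queued.
// Returns false if the table was not in the queue.
inline bool mark_dequeued(FtsStateSketch& s) {
    bool expected = true;
    return s.in_queue.compare_exchange_strong(expected, false);
}
```

With only two documented transitions, a second mark_queued() call on an already queued table returns false instead of adding the table twice. Whether an atomic is needed at all, or a plain bool under one mutex suffices, depends on the locking in the actual patch.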
Marko Mäkelä
added a comment - At the end of fts_optimize_remove_table(), the fts_optimize_wq->mutex acquisition and release around the debug assertion should be inside ut_d(), to avoid unnecessary operations on the release build.
I saw a redundant sync_table = mem_heap_alloc(…) call whose result was immediately overwritten by sync_table = table;
In fts_optimize_new_table(), the assignment slot->running = false is redundant because of a preceding memset() call.
If fts_slots can be accessed by multiple threads, then we should extend some mutex hold time. It could be that it is only being accessed by a single thread.
Should we call fts_init_index() already in ha_innobase::open()? Otherwise, it seems that FTS-indexed columns could be updated before any fulltext search is performed (and ha_innobase::ft_init_ext() is called). Could that lead to some updates being missed by the fulltext indexes?
Finally, please check the following for differences in white-space or comments, and try to fix those:
diff -I^@@ <(git show origin/bb-10.1-thiru storage/innobase) <(git show origin/bb-10.1-thiru storage/xtradb/)
git show origin/bb-10.2-thiru|diff -I^@@ - <(git show origin/bb-10.1-thiru storage/innobase)
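The ut_d() remark can be sketched as follows. This is a simplified illustration of a debug-only wrapper, assuming a hypothetical SKETCH_DEBUG build flag; SKETCH_D, sketch_debug, QueueSketch, TableSketch and remove_table_sketch are made-up names that stand in for ut_d(), fts_optimize_wq and fts_optimize_remove_table(), and the body is not the actual patch.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical debug-only wrapper: the argument is compiled only when
// SKETCH_DEBUG is defined, otherwise it expands to nothing.
#ifdef SKETCH_DEBUG
# define SKETCH_D(EXPR) EXPR
constexpr bool sketch_debug = true;
#else
# define SKETCH_D(EXPR)
constexpr bool sketch_debug = false;
#endif

// Stand-in for a work queue with its own mutex; counts lock acquisitions.
struct QueueSketch {
    std::mutex mutex;
    int lock_count = 0;
    void lock() { mutex.lock(); ++lock_count; }
    void unlock() { mutex.unlock(); }
};

// Stand-in for a table with an in_queue flag.
struct TableSketch {
    bool in_queue = false;
};

inline void remove_table_sketch(QueueSketch& wq, TableSketch& table) {
    (void) wq;               // unused when SKETCH_DEBUG is not defined
    table.in_queue = false;  // stands in for the real removal work

    // The mutex is taken only to make the assertion safe, so the
    // acquisition and release are debug-only as well.
    SKETCH_D(wq.lock());
    SKETCH_D(assert(!table.in_queue));
    SKETCH_D(wq.unlock());
}
```

Without SKETCH_DEBUG, the three wrapped statements vanish and the function never touches the mutex, which is the saving the comment asks for.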
Marko Mäkelä
added a comment - Thanks, this looks OK. I made a suggestion to declare fts_optimize_wq without static scope, to avoid having to add trivial non-inline accessor functions.
Matthias Leich
added a comment - I tested the tree bb-10.2-thiru, commit ce813ca178e499ab2171978bf0140537cb9ca612, which contains the patches for the current MDEV.
There were no asserts/crashes that do not also occur in current 10.2, commit 28098420317bc2efe082df799c917babde879242.
So from my point of view the MDEV-20621 patch is OK.
People
Thirunarayanan Balathandayuthapani
Stevo