[MDEV-25111] Long semaphore wait (> 800 secs), server stops responding Created: 2021-03-10 Updated: 2021-04-29 Resolved: 2021-04-29 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera, Storage Engine - InnoDB |
| Affects Version/s: | 10.3.28 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Ere Maijala | Assignee: | Jan Lindström (Inactive) |
| Resolution: | Incomplete | Votes: | 1 |
| Labels: | need_feedback | ||
| Environment: |
CentOS 7.9.2009
Defaults: mysqld would have been started with the following arguments: |
||
| Attachments: |
|
| Issue Links: |
|
| Description |
|
We had been running MariaDB 10.3.27 with Galera cluster in production without any issues. Less than 26 hours after updating to 10.3.28 one of three servers stopped responding. The following errors were logged for more than 800 seconds until I restarted MariaDB:
I'll attach the full log from when the problem started. The log also contains InnoDB Monitor output. This sounds similar to |
| Comments |
| Comment by Marko Mäkelä [ 2021-03-10 ] |
|
emaijala, yes, this must be different from
The InnoDB diagnostic output is virtually useless for diagnosing hangs or deadlocks. (In fact, it will be removed in 10.6:
For anything nontrivial, we really need stack traces of all threads during the hang or performance degradation. You can use some tool like http://poormansprofiler.org/ as well. But, without some stack trace output, I am afraid that we cannot proceed anywhere.
Two notable Galera-related changes in the most recent releases that might have something to do with this were |
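For anyone following along, here is a minimal sketch of one way to capture stack traces of all threads from a hung server by attaching gdb in batch mode, which is roughly the approach the poor man's profiler takes. The helper names `build_gdb_command` and `dump_stacks` are hypothetical, and the sketch assumes gdb is installed, the caller is allowed to attach to the mysqld process (typically root), and debug symbols are available so the traces are readable. Attaching pauses the server for the duration of the dump.

```python
import subprocess


def build_gdb_command(pid):
    """Build a gdb invocation that prints a backtrace for every thread.

    Hypothetical helper; the gdb options used here are standard:
    -p attaches to a running process, -batch exits after the commands
    run, and each -ex supplies one gdb command.
    """
    return [
        "gdb",
        "-p", str(pid),
        "-batch",
        "-ex", "set pagination off",
        "-ex", "thread apply all bt",
    ]


def dump_stacks(pid, out_path):
    """Attach to the process and write all thread backtraces to out_path.

    Returns gdb's exit code. The target process is stopped while gdb
    is attached, so expect a short stall on a live server.
    """
    result = subprocess.run(
        build_gdb_command(pid),
        capture_output=True,
        text=True,
        check=False,
    )
    with open(out_path, "w") as fh:
        fh.write(result.stdout)
    return result.returncode
```

A call such as `dump_stacks(mysqld_pid, "stacks.txt")` during the hang, repeated a few times some seconds apart, should give output that can be attached to the ticket.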
| Comment by Ere Maijala [ 2021-03-11 ] |
|
@Marko Mäkelä, thanks for the information. I'm now set up to capture stack traces, in case we decide not to downgrade to 10.3.27. |
| Comment by JDT [ 2021-03-19 ] |
|
We are also experiencing this issue since upgrading to 10.3.28. We're also using Galera and CentOS 7.9.2009. I've attached a log below. I will also try to get some more debug information from our Galera instance. I'll try to dump the running queries to a file every minute and try to narrow down the query that triggers the problem if possible. |
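The per-minute query dump described above could be done along these lines. This is only a sketch: the function names and the snapshot header format are made up for illustration, and it assumes the `mysql` command-line client is on the PATH and can log in without prompting (for example through credentials in an option file).

```python
import subprocess
import time
from datetime import datetime


def processlist_command():
    """Command line that prints the full list of running queries."""
    return ["mysql", "--batch", "-e", "SHOW FULL PROCESSLIST"]


def format_snapshot(output, when):
    """Prefix one processlist dump with a timestamp header (hypothetical format)."""
    return f"=== {when.isoformat()} ===\n{output.rstrip()}\n\n"


def log_processlist(path, interval=60, iterations=None):
    """Append a processlist snapshot to path every `interval` seconds.

    Runs forever when iterations is None; otherwise stops after that
    many snapshots.
    """
    count = 0
    while iterations is None or count < iterations:
        result = subprocess.run(
            processlist_command(),
            capture_output=True,
            text=True,
            check=False,
        )
        with open(path, "a") as fh:
            fh.write(format_snapshot(result.stdout, datetime.now()))
        count += 1
        if iterations is None or count < iterations:
            time.sleep(interval)
```

Appending rather than overwriting keeps the last snapshots before a hang available, since the client itself may stop getting answers once the server stops responding.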
| Comment by Marko Mäkelä [ 2021-03-24 ] |
|
|