[MDEV-24504] [FATAL] InnoDB: Semaphore wait has lasted > 600 seconds. We intentionally crash the server because it appears to be hung.

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Duplicate
Affects Version/s: 10.5.8
Fix Version/s: 10.2.37, 10.3.28, 10.4.18, 10.5.9
Component/s: Server, Storage Engine - InnoDB
Labels:
- need_feedback
- regression
Environment:
Ubuntu 16.04
Version: '10.5.8-MariaDB-1:10.5.8+maria~xenial-log'

Description

We've upgraded several servers from the distro-provided MariaDB to 10.5.8. Since then, all upgraded servers crash every few days with:

--Thread 140006271878912 has waited at btr0cur.cc line 1480 for 611.00 seconds the semaphore:
SX-lock on RW-latch at 0x55aba8228a10 created in file dict0dict.cc line 2161
a writer (thread id 140005582296832) has reserved it in mode SX
number of readers 1, waiters flag 1, lock_word: fffffff
Last time write locked in file dict0stats.cc line 1969
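
When several of these reports pile up in an error log, it can help to pull out which thread is waiting, where, for how long, and which thread holds the latch. The following is a minimal sketch, assuming the log lines have exactly the wording quoted above; the function name `parse_semaphore_waits` and the regular expressions are illustrative and not part of MariaDB.

```python
import re

# Patterns modeled only on the report lines quoted above; other InnoDB
# versions may word these lines differently (assumption).
WAIT_RE = re.compile(
    r"--Thread (?P<waiter>\d+) has waited at (?P<file>\S+) line (?P<line>\d+) "
    r"for (?P<seconds>\d+(?:\.\d+)?) seconds"
)
HOLDER_RE = re.compile(
    r"a writer \(thread id (?P<holder>\d+)\) has reserved it in mode (?P<mode>\S+)"
)


def parse_semaphore_waits(log_text):
    """Return one dict per '--Thread ... has waited at ...' report."""
    waits = []
    current = None
    for line in log_text.splitlines():
        m = WAIT_RE.search(line)
        if m:
            # A new wait report starts here.
            current = {
                "waiter": int(m["waiter"]),
                "location": f'{m["file"]}:{m["line"]}',
                "seconds": float(m["seconds"]),
                "holder": None,
                "mode": None,
            }
            waits.append(current)
            continue
        m = HOLDER_RE.search(line)
        if m and current is not None:
            # Attach the latch holder to the most recent wait report.
            current["holder"] = int(m["holder"])
            current["mode"] = m["mode"]
    return waits


SAMPLE = """\
--Thread 140006271878912 has waited at btr0cur.cc line 1480 for 611.00 seconds the semaphore:
SX-lock on RW-latch at 0x55aba8228a10 created in file dict0dict.cc line 2161
a writer (thread id 140005582296832) has reserved it in mode SX
number of readers 1, waiters flag 1, lock_word: fffffff
Last time write locked in file dict0stats.cc line 1969
"""

if __name__ == "__main__":
    for wait in parse_semaphore_waits(SAMPLE):
        print(wait)
```

This only summarizes what the error log already says; it does not replace the stack traces of all threads that are needed to diagnose the hang itself.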

Attachments

mariadb_10.5.8_error_log.txt
2 kB
2021-01-06 04:37

Issue Links

is duplicated by

- MDEV-24188 Hang in buf_page_create() after reusing a previously freed page (Closed)
- MDEV-24275 InnoDB persistent stats analyze forces full scan forcing lock crash (Closed)

relates to

- MDEV-24378 Crashes on Semaphore wait > 600 seconds (Closed)
- MDEV-24606 InnoDB: Semaphore wait has lasted > 600 second (Closed)

Activity

Jacek Kuczynski added a comment - 2021-01-06 04:53

We have the very same problem.
Recently upgraded to 10.5.8 on Ubuntu20.
Server is on AWS. DB is used for Zabbix.
We have two MariaDB instances running on the same server. One is using ephemeral SSD storage and there is no problem with that. The other one is using EBS (gp3 [upgraded from gp2 at the same time]) and it's crashing with the same error.

Attached is the last entry in the error log. After the crash there are no more entries in the log. Both instances (the crashed one and the new one) are trying to write to the same error log file:

lsof error_mariadb_ebs.log
COMMAND      PID  USER  FD  TYPE DEVICE SIZE/OFF   NODE NAME
mariadbd 1932433 mysql  1w   REG  202,1   828870 518224 error_mariadb_ebs.log
mariadbd 1932433 mysql  2w   REG  202,1   828870 518224 error_mariadb_ebs.log
mariadbd 2220904 mysql  2w   REG  202,1   828870 518224 error_mariadb_ebs.log

After the upgrade to 10.5.8, when the DB crashed it was fully restarted. After the second crash I decided to leave it as it is, and after that initial crash there have been no more crashes of the DB (for the last 9 days).

I will also check whether this problem happens on AWS EBS gp2 storage.

Marko Mäkelä added a comment - 2021-02-26 08:38

Without stack traces of all threads during the hang, it is impossible to analyze this. This could be a duplicate of MDEV-24188 or MDEV-24275. Does the 10.5.9 release work?

Fredrik Chabot added a comment - 2021-03-19 16:09

Version 10.5.9 seems to have resolved this issue.

People

Assignee: Marko Mäkelä
Reporter: Fredrik Chabot
Votes: 3
Watchers: 6

Dates

Created: 2020-12-31 09:11
Updated: 2021-04-18 18:09
Resolved: 2021-04-18 18:08
