[MDEV-24504] [FATAL] InnoDB: Semaphore wait has lasted > 600 seconds. We intentionally crash the server because it appears to be hung. Created: 2020-12-31  Updated: 2021-04-18  Resolved: 2021-04-18

Status: Closed
Project: MariaDB Server
Component/s: Server, Storage Engine - InnoDB
Affects Version/s: 10.5.8
Fix Version/s: 10.2.37, 10.3.28, 10.4.18, 10.5.9

Type: Bug
Priority: Major
Reporter: Fredrik Chabot
Assignee: Marko Mäkelä
Resolution: Duplicate
Votes: 3
Labels: need_feedback, regression
Environment:

Ubuntu 16.04
Version: '10.5.8-MariaDB-1:10.5.8+maria~xenial-log'


Attachments: mariadb_10.5.8_error_log.txt
Issue Links:
Duplicate
is duplicated by MDEV-24188 Hang in buf_page_create() after reusi... Closed
is duplicated by MDEV-24275 InnoDB persistent stats analyze force... Closed
Relates
relates to MDEV-24378 Crashes on Semaphore wait > 600 seconds Closed
relates to MDEV-24606 InnoDB: Semaphore wait has lasted > 6... Closed

 Description   

We've upgraded several servers from the distro-provided MariaDB to 10.5.8. Since then, all upgraded servers have crashed every few days with:

--Thread 140006271878912 has waited at btr0cur.cc line 1480 for 611.00 seconds the semaphore:
SX-lock on RW-latch at 0x55aba8228a10 created in file dict0dict.cc line 2161
a writer (thread id 140005582296832) has reserved it in mode SX
number of readers 1, waiters flag 1, lock_word: fffffff
Last time write locked in file dict0stats.cc line 1969
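
When triaging several error logs, wait reports like the one above can be extracted mechanically. A minimal Python sketch, assuming the header line always has exactly the wording shown in this excerpt; the function name `parse_semaphore_waits` is made up for illustration:

```python
import re

# Pattern for the first line of an InnoDB semaphore wait report, assuming the
# exact wording shown in the excerpt above.
WAIT_RE = re.compile(
    r"--Thread (?P<thread>\d+) has waited at (?P<file>\S+) line (?P<line>\d+) "
    r"for (?P<seconds>\d+(?:\.\d+)?) seconds the semaphore:"
)


def parse_semaphore_waits(log_text):
    """Return one dict per semaphore wait header found in log_text."""
    waits = []
    for match in WAIT_RE.finditer(log_text):
        waits.append({
            "thread": int(match.group("thread")),
            "file": match.group("file"),
            "line": int(match.group("line")),
            "seconds": float(match.group("seconds")),
        })
    return waits


sample = (
    "--Thread 140006271878912 has waited at btr0cur.cc line 1480 "
    "for 611.00 seconds the semaphore:\n"
    "SX-lock on RW-latch at 0x55aba8228a10 created in file dict0dict.cc line 2161\n"
)
waits = parse_semaphore_waits(sample)
```

Grouping the results by `file` and `line` shows whether every crash waits at the same source location.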



 Comments   
Comment by Jacek Kuczynski [ 2021-01-06 ]

We have the very same problem.
We recently upgraded to 10.5.8 on Ubuntu 20.
The server is on AWS. The DB is used for Zabbix.
We have two MariaDB instances running on the same server. One uses ephemeral SSD storage and has no problems. The other uses EBS (gp3, upgraded from gp2 at the same time as MariaDB) and is crashing with the same error.

I have attached the last entry in the error log. After the crash there are no more entries in the log. Both processes (the crashed one and the new one) are trying to write to the same error log file:

    # lsof error_mariadb_ebs.log
    COMMAND      PID  USER  FD  TYPE DEVICE SIZE/OFF   NODE NAME
    mariadbd 1932433 mysql  1w   REG  202,1   828870 518224 error_mariadb_ebs.log
    mariadbd 1932433 mysql  2w   REG  202,1   828870 518224 error_mariadb_ebs.log
    mariadbd 2220904 mysql  2w   REG  202,1   828870 518224 error_mariadb_ebs.log
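
The same check can be scripted. A minimal Python sketch, assuming the nine-column `lsof` layout shown above (no extra TID column, no lock character after the FD mode); the helper name `writers_by_file` is hypothetical:

```python
from collections import defaultdict


def writers_by_file(lsof_output):
    """Map each file name to the set of PIDs holding it open for writing."""
    writers = defaultdict(set)
    for line in lsof_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and rows with too few columns.
        if len(fields) < 9 or fields[0] == "COMMAND":
            continue
        pid, fd, name = fields[1], fields[3], fields[-1]
        # An FD such as "1w" or "2w" ends in "w" when opened for writing;
        # "u" means opened for both reading and writing.
        if fd.endswith(("w", "u")):
            writers[name].add(int(pid))
    return dict(writers)


sample = """COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
mariadbd 1932433 mysql 1w REG 202,1 828870 518224 error_mariadb_ebs.log
mariadbd 1932433 mysql 2w REG 202,1 828870 518224 error_mariadb_ebs.log
mariadbd 2220904 mysql 2w REG 202,1 828870 518224 error_mariadb_ebs.log
"""
result = writers_by_file(sample)
```

More than one PID in the set for a log file means two server processes are alive at once, as in the listing above.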

After the upgrade to 10.5.8, when the DB first crashed, it was fully restarted. After the second crash I decided to leave it as it was, and since that crash there have been no more crashes of the DB (for the last 9 days).

I will also check whether this problem occurs on AWS EBS gp2 storage.

Comment by Marko Mäkelä [ 2021-02-26 ]

Without having stack traces of all threads during the hang, it is impossible to analyze this. This could be a duplicate of MDEV-24188 or MDEV-24275. Does the 10.5.9 release work?
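
One common way to capture stack traces of all threads is to attach gdb to the server process while it is still hung. A minimal Python sketch, assuming gdb and debug symbols are installed and the caller has permission to attach; the function names and the output path argument are hypothetical:

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb invocation that prints backtraces of all threads of pid."""
    return [
        "gdb", "--batch",
        "-p", str(pid),
        "-ex", "thread apply all backtrace",
    ]


def dump_all_thread_stacks(pid, out_path):
    """Run gdb against a live process and save its output to out_path."""
    with open(out_path, "w") as out:
        subprocess.run(
            gdb_backtrace_command(pid),
            stdout=out,
            stderr=subprocess.STDOUT,
            check=True,
        )
```

The dump has to be taken during the hang, before the 600-second watchdog aborts the server.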

Comment by Fredrik Chabot [ 2021-03-19 ]

Version 10.5.9 seems to have resolved this issue.

Generated at Thu Feb 08 09:30:29 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.