[MDEV-5642] Galera: A long semaphore wait during normal operation Created: 2014-02-10  Updated: 2014-10-28  Resolved: 2014-10-28

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - XtraDB
Affects Version/s: 5.5.34-galera
Fix Version/s: 5.5.40-galera

Type: Bug Priority: Major
Reporter: Ryo Tagami Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 0
Labels: galera
Environment:

OS: CentOS 6.5 (kernel 2.6.32-431.3.1.el6.x86_64)
Package: MariaDB-Galera-server-5.5.34-1.x86_64 + galera-23.2.7-1.rhel6.x86_64


Attachments: Text File mysqld.log    

 Description   

This problem happend on 3 node MariaDB Galera Server cluster as a backend for zabbix.

Everything was fine until these line appeared on the log.
Log from zabbix-server indicates that around 12:42:07, query execution has stopped
(which matches the time frame from log of mysqld).

140208 21:35:10 [Warning] WSREP: last inactive check more than PT1.5S ago (PT2.30767S), skipping check
InnoDB: Warning: a long semaphore wait:
--Thread 140414024693504 has waited at row0ins.c line 2087 for 241.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7fb2a39cbc70 '&block->lock'
a writer (thread id 140414024693504) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffffffffffff
Last time read locked in file row0sel.c line 3138
Last time write locked in file /home/buildbot/buildbot/build/mariadb-5.5.34/storage/xtradb/btr/btr0sea.c line 947
InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
InnoDB: Pending preads 0, pwrites 0
 
=====================================
140209 12:45:53 INNODB MONITOR OUTPUT
=====================================

:
:

----------------------------
END OF INNODB MONITOR OUTPUT
============================
InnoDB: ###### Diagnostic info printed to the standard error stream
InnoDB: Error: semaphore wait has lasted > 600 seconds
InnoDB: We intentionally crash the server, because it appears to be hung.

For reference, I've attached excerpt of mysqld log.
Please let me know if some other information is needed.



 Comments   
Comment by Jan Lindström (Inactive) [ 2014-03-11 ]

Hi,

Do you have some very long running transactions ? E.g. following seems to take a lot of record locks:

delete from history_uint where itemid=397 and clock<1391312480

Comment by Ryo Tagami [ 2014-03-12 ]

I'm not sure it qualifies as a long running transaction, but they actually are somewhat heavy operations.
history_uint table has about 50,000,000 rows, and the query delete about 100 records from it.
Application iterates the query with different itemid about 2000 times.
It seems that index is working though.

MariaDB [zabbix]> EXPLAIN SELECT COUNT(1) FROM history_uint WHERE itemid = 5051 AND clock < 1394023217;
+------+-------------+--------------+-------+----------------+----------------+---------+------+------+--------------------------+
| id   | select_type | table        | type  | possible_keys  | key            | key_len | ref  | rows | Extra                    |
+------+-------------+--------------+-------+----------------+----------------+---------+------+------+--------------------------+
|    1 | SIMPLE      | history_uint | range | history_uint_1 | history_uint_1 | 12      | NULL |   68 | Using where; Using index |
+------+-------------+--------------+-------+----------------+----------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

Generated at Thu Feb 08 07:05:56 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.