[MDEV-15431] lock_sys->mutex happened Created: 2018-02-27  Updated: 2019-12-12  Resolved: 2019-12-12

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.0.34-galera
Fix Version/s: N/A

Type: Bug
Priority: Major
Reporter: Hamoon Mohammadian Pour
Assignee: Jan Lindström (Inactive)
Resolution: Won't Fix
Votes: 1
Labels: deadlock, galera, xtradb
Environment: CentOS 7

Attachments: Text File error_log.log    
Issue Links:
Relates
relates to MDEV-8869 Potential lock_sys->mutex deadlock Closed

 Description   

We have two nodes and one arbitrator server.
Since today, one of our nodes, which is used only for reads (writes are routed to the other node), hangs after a few hours and can no longer serve clients.
When I try to connect to this node, I get a "Too many connections" error.
After that, I have to kill the mysql process and restart the node so that it rejoins the cluster.
The problem occurs only on this node.
This is strange to me, because I thought this problem was solved in MariaDB 10.0.23.



 Comments   
Comment by Elena Stepanova [ 2018-02-27 ]

From the attached error log:

InnoDB: Warning: semaphore wait:
--Thread 140500975691520 has waited at lock0lock.cc line 3828 for 302.00 seconds the semaphore:
Mutex at 0x7fca58bb8068 '&lock_sys->mutex', lock var 1
waiters flag 1
InnoDB: Warning: semaphore wait:
--Thread 140500967298816 has waited at lock0lock.cc line 3828 for 302.00 seconds the semaphore:
Mutex at 0x7fca58bb8068 '&lock_sys->mutex', lock var 1
waiters flag 1
InnoDB: Warning: semaphore wait:
--Thread 140501042833152 has waited at lock0lock.cc line 3828 for 302.00 seconds the semaphore:
Mutex at 0x7fca58bb8068 '&lock_sys->mutex', lock var 1
waiters flag 1
InnoDB: Warning: semaphore wait:
--Thread 140501026047744 has waited at lock0lock.cc line 3828 for 302.00 seconds the semaphore:
Mutex at 0x7fca58bb8068 '&lock_sys->mutex', lock var 1
waiters flag 1
InnoDB: Warning: semaphore wait:
--Thread 140501034440448 has waited at lock0lock.cc line 3828 for 302.00 seconds the semaphore:
Mutex at 0x7fca58bb8068 '&lock_sys->mutex', lock var 1
waiters flag 1
InnoDB: Warning: semaphore wait:
--Thread 140500992476928 has waited at lock0lock.cc line 3828 for 302.00 seconds the semaphore:
Mutex at 0x7fca58bb8068 '&lock_sys->mutex', lock var 1
waiters flag 1
InnoDB: Warning: semaphore wait:
--Thread 140500942120704 has waited at lock0lock.cc line 3828 for 302.00 seconds the semaphore:
Mutex at 0x7fca58bb8068 '&lock_sys->mutex', lock var 1
waiters flag 1
InnoDB: Warning: semaphore wait:
--Thread 140501068011264 has waited at lock0lock.cc line 3828 for 302.00 seconds the semaphore:
Mutex at 0x7fca58bb8068 '&lock_sys->mutex', lock var 1
waiters flag 1
InnoDB: Warning: semaphore wait:
--Thread 140499893602048 has waited at btr0cur.cc line 1107 for 301.00 seconds the semaphore:
S-lock on RW-latch at 0x7fe057b6a498 '&new_index->lock'
a writer (thread id 140501068011264) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: fffffffffff00000
Last time read locked in file btr0cur.cc line 592
Last time write locked in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.0.34/storage/xtradb/ibuf/ibuf0ibuf.cc line $
InnoDB: Warning: Writer thread is waiting this semaphore:
--Thread 140501068011264 has waited at lock0lock.cc line 3828 for 302.00 seconds the semaphore:
Mutex at 0x7fca58bb8068 '&lock_sys->mutex', lock var 1
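
The excerpt above can be summarized mechanically when the full log is long. The following is a minimal sketch, assuming the semaphore-wait messages always follow the line layout seen in this excerpt; the function names (`parse_semaphore_waits`, `waiters_by_semaphore`) and the regular expressions are illustrative and not part of MariaDB. It groups waiting threads by semaphore name and records the writer thread reported for an RW-latch, which makes it easier to see that the thread holding `&new_index->lock` is itself waiting on `&lock_sys->mutex`.

```python
import re
from collections import defaultdict

# Assumed layout of a wait entry, taken from the excerpt above.
WAIT_RE = re.compile(
    r"--Thread (\d+) has waited at (\S+) line (\d+) for ([\d.]+) seconds"
)
# The semaphore name is the quoted token on the line after the wait entry.
NAME_RE = re.compile(r"'([^']+)'")
# The holder that InnoDB reports for an RW-latch.
WRITER_RE = re.compile(r"a writer \(thread id (\d+)\) has reserved it")


def parse_semaphore_waits(log_text):
    """Return one dict per '--Thread ... has waited' entry in log_text."""
    waits = []
    current = None
    for line in log_text.splitlines():
        m = WAIT_RE.search(line)
        if m:
            current = {
                "thread": int(m.group(1)),
                "location": f"{m.group(2)}:{m.group(3)}",
                "seconds": float(m.group(4)),
                "semaphore": None,
                "held_by": None,
            }
            waits.append(current)
            continue
        if current is None:
            continue
        if current["semaphore"] is None:
            n = NAME_RE.search(line)
            if n:
                current["semaphore"] = n.group(1)
                continue
        w = WRITER_RE.search(line)
        if w:
            current["held_by"] = int(w.group(1))
    return waits


def waiters_by_semaphore(waits):
    """Map each semaphore name to the set of thread ids waiting on it."""
    grouped = defaultdict(set)
    for w in waits:
        grouped[w["semaphore"]].add(w["thread"])
    return dict(grouped)


# Two entries copied from the excerpt above.
SAMPLE = """\
InnoDB: Warning: semaphore wait:
--Thread 140501068011264 has waited at lock0lock.cc line 3828 for 302.00 seconds the semaphore:
Mutex at 0x7fca58bb8068 '&lock_sys->mutex', lock var 1
waiters flag 1
InnoDB: Warning: semaphore wait:
--Thread 140499893602048 has waited at btr0cur.cc line 1107 for 301.00 seconds the semaphore:
S-lock on RW-latch at 0x7fe057b6a498 '&new_index->lock'
a writer (thread id 140501068011264) has reserved it in mode  exclusive
"""

waits = parse_semaphore_waits(SAMPLE)
for name, threads in waiters_by_semaphore(waits).items():
    print(name, sorted(threads))
```

The log does not say which thread holds `&lock_sys->mutex`, so the sketch can only list its waiters; identifying the holder would still need the `SHOW ENGINE INNODB STATUS` output or a stack trace.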

Comment by Jan Lindström (Inactive) [ 2018-03-01 ]

Can you provide the full, unedited error log and the output of SHOW ENGINE INNODB STATUS when these messages appear? Do you have the general log enabled? That would also help identify the issue. Are there steps to reproduce the issue, or can you explain in what kind of situation it happens?

Comment by Hamoon Mohammadian Pour [ 2018-03-06 ]

Unfortunately, I had to stop this node because it affected the whole system.
But let me explain the problem in detail.
A few days ago we had to turn off the server to make some changes.
This change took several hours.
After that, we turned on the server and started MariaDB.
MariaDB started normally and tried to use IST to get the missing transactions.
About an hour after MariaDB synced to the cluster, this problem happened on this node.
When I saw the problem, I tried to connect to this node but got "Too many connections".
My attempts to connect to the other server were also fruitless, because I got "Too many connections" again.
I had to kill the MariaDB process and start MariaDB again.
A few minutes after MariaDB synced to the cluster again, the problem happened again.
Interestingly, the problem never happened before we turned off the server.
The Internet connection was also disrupted during these days, but I don't think that is related to this problem.

Comment by Hamoon Mohammadian Pour [ 2018-03-10 ]

Two days ago I started this node again.
It used the SST method and received a full snapshot of our data.
Since then, the problem has not happened again!

Comment by Jan Lindström (Inactive) [ 2019-12-12 ]

Support for 10.0-galera has ended.
