[MDEV-23166] mysql.err log file is spammed with message "line 593 for 241.00 seconds the semaphore" Created: 2020-07-14  Updated: 2022-01-27  Resolved: 2022-01-24

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.1.41
Fix Version/s: 10.5.14, 10.6.6, 10.7.2, 10.8.1

Type: Bug Priority: Major
Reporter: sushma k Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: None
Environment:

mysqld, Version: 10.1.41-MariaDB-0+deb9u1 (Debian 9.9)



 Description   

We have a linux debian VM on which mysqld, Version: 10.1.41-MariaDB-0+deb9u1 is running.

Problem faced:
----------------
Since June 17th the mysql.err log file has been growing extremely fast, and it is
filling up our /var/local/ file system, which in turn is affecting our other services.

We tried emptying the file, but it fills up again very quickly.
In the mysql.err file for June 17th I see 5 or 6 occurrences of 'Long semaphore wait', but after that roughly 79 GB of the file is filled with the error message
"--Thread 140508016711424 has waited at btr0cur.cc line 593 for 241.00 seconds the semaphore"

We need your assistance in understanding whether this issue is related to a deadlock
or is caused by high load. I have looked into mysql-slow.log but could not infer much.

Below are some details of the log pattern I have seen:
2020-06-17 20:03:20 140509551914368 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.1.41-MariaDB-0+deb9u1' socket: '/var/run/mysqld/mysqld.sock' port: 3306 Debian 9.9
2020-06-17 20:03:41 7fcaea137700 InnoDB: Error: Table "mysql"."innodb_table_stats" not found.
2020-06-17 23:52:59 140509488289536 [Warning] InnoDB: Difficult to find free blocks in the buffer pool (21 search iterations)! 0 failed attempts to flush a page!
2020-06-17 23:52:59 140509488289536 [Note] InnoDB: Consider increasing the buffer pool size.
2020-06-17 23:52:59 140509488289536 [Note] InnoDB: Pending flushes (fsync) log: 1 buffer pool: 2 OS file reads: 373203 OS file writes: 1909646 OS fsyncs: 497485
InnoDB: Warning: a long semaphore wait:
--Thread 140508016711424 has waited at btr0cur.cc line 593 for 241.00 seconds the semaphore: =========== > 4 min
S-lock on RW-latch at 0x7fcaeac1dd88 '&new_index->lock'
a writer (thread id 140509276571392) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 593
Last time write locked in file ibuf0ibuf.cc line 417
Holder thread 0 file not yet reserved line 0
InnoDB: Warning: a long semaphore wait:
--Thread 140509487675136 has waited at btr0cur.cc line 593 for 241.00 seconds the semaphore:
S-lock on RW-latch at 0x7fcaeac1dd88 '&new_index->lock'
a writer (thread id 140509276571392) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 593
Last time write locked in file ibuf0ibuf.cc line 417
Holder thread 0 file not yet reserved line 0
....
....
// Later on the below pattern gets repeated for 79GB of data.
InnoDB: Warning: Writer thread is waiting this semaphore:
--Thread 140508140697344 has waited at btr0cur.cc line 593 for 241.00 seconds the semaphore:
S-lock on RW-latch at 0x7fcaeac1dd88 '&new_index->lock'
a writer (thread id 140509276571392) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 593
Last time write locked in file ibuf0ibuf.cc line 417
Holder thread 0 file not yet reserved line 0
InnoDB: Warning: Writer thread is waiting this semaphore:
--Thread 140509276571392 has waited at fut0fut.ic line 51 for 241.00 seconds the semaphore:
X-lock on RW-latch at 0x7fcaa23e6b80 '&block->lock'
a writer (thread id 140508140697344) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file row0sel.cc line 3250
Last time write locked in file buf0buf.cc line 4278
Holder thread 0 file not yet reserved line 0

The mysql-slow and mysql-err log files are huge, so I was not able to upload them.
Please let me know if there is an alternative way to provide them.



 Comments   
Comment by Marko Mäkelä [ 2020-07-14 ]

sushma1, I agree that the InnoDB watchdog output is rather useless. It rarely provides enough information to find the actual cause of the hang.

One message suggests that the buffer pool might be configured too small, or that something is hogging the buffer pool.

I think that it would be very useful to attach a debugger to the server at the time of such a hang, to produce a stack trace of all active threads (thread apply all backtrace in GDB). That would provide much better clues to the cause of the hang. It could be even more useful to collect multiple such traces. http://poormansprofiler.org/ could be useful for this.
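The poor man's profiler approach mentioned above could be sketched roughly as follows (this is only an illustration, not part of the ticket's official advice: it assumes gdb is installed, a single mysqld process is running, and that pidof is available on the system):

```shell
#!/bin/sh
# Poor man's profiler sketch: capture stack traces of all mysqld threads
# several times in a row, so the traces can be compared for stuck threads.
pid=$(pidof mysqld)   # assumes exactly one mysqld process
nsamples=3            # how many snapshots to take
sleeptime=5           # seconds between snapshots

for i in $(seq 1 "$nsamples"); do
  # -batch makes gdb exit after running the commands; the server is
  # briefly paused while the backtraces are collected.
  gdb -batch -p "$pid" -ex "thread apply all backtrace" \
    > "mysqld-stacks-$i.txt" 2>/dev/null
  sleep "$sleeptime"
done
```

Threads that show the same backtrace in every snapshot are the most likely candidates for the hang.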

Comment by sushma k [ 2020-07-15 ]

Can this happen under heavy load, or due to slow I/O disks?

Comment by sushma k [ 2020-07-15 ]

Adding to my query above: would setting innodb_adaptive_hash_index=0 and restarting be helpful?
If so, how can I check whether innodb_adaptive_hash_index is currently enabled or disabled?
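For reference, the current setting can be inspected, and the variable is dynamic, so it can also be changed without a restart (a sketch using standard statements; behaviour on a build as old as 10.1 should be verified):

```sql
-- Check the current value (ON or OFF)
SHOW GLOBAL VARIABLES LIKE 'innodb_adaptive_hash_index';

-- Disable at runtime (does not survive a restart)
SET GLOBAL innodb_adaptive_hash_index = OFF;

-- To make it permanent, add under [mysqld] in the server config file:
--   innodb_adaptive_hash_index = 0
```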

Comment by sushma k [ 2020-07-16 ]

Hi Marko,

Could I please get a reply to my questions?

Will attaching a debugger be practical in this case? As soon as the issue is encountered, the server falls over (7+ GB per minute of failure logging), and we don't have any known way to reproduce it. Any debug trace in this case is likely to be huge, too.

Please let me know if i should go on with suggesting to collect the stack trace or is there any other data collection plan.

Comment by sushma k [ 2020-07-16 ]

Adding another question, regarding your earlier point that "One message suggests that the buffer pool might be configured too small, or that something is hogging the buffer pool":

Is the 'innodb_buffer_pool_size=1G' parameter in the my.cnf file the one that needs to be tuned (followed by a service restart), and up to what value can we set it?
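For illustration only (the 4G figure is an example, not a recommendation from this ticket), such a change in my.cnf might look like this; a common rule of thumb on a dedicated database server is 50–80% of RAM, but the right value depends on the workload and on what else runs on the VM:

```ini
# my.cnf fragment (example values, adjust to the machine's available RAM)
[mysqld]
# A larger buffer pool reduces "Difficult to find free blocks" warnings,
# but must leave enough memory for the OS and other services on the host.
innodb_buffer_pool_size = 4G
```

A restart is required for this setting to take effect on MariaDB 10.1.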

Comment by Marko Mäkelä [ 2021-11-11 ]

Sorry, I had missed the updates in this ticket. But, I see that the stack trace output that I requested was not provided.

The 10.1 series reached its end of life in October 2020. How parameters should be configured depends on the workload. Our support engineers could help with that.

The LRU eviction was improved in MariaDB 10.5, in MDEV-23399. I would recommend upgrading to MariaDB 10.5.12 or 10.5.13.

Comment by Marko Mäkelä [ 2022-01-24 ]

I think that MDEV-26784 in 10.5.14 finally fixed this. Before 10.5, recovery can run out of memory due to various other reasons.

Generated at Thu Feb 08 09:20:21 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.