[MDEV-17407] Semaphore wait has lasted > 600 seconds. We intentionally crash the server because it appears to be hung. Created: 2018-10-09 Updated: 2018-12-07 Resolved: 2018-12-07 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.2.14 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Nigel Gomm | Assignee: | Unassigned |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | need_feedback | ||
| Environment: |
windows server 2012 |
||
| Attachments: |
|
| Description |
|
i seem to get this with ANY slow query or even DDL instruction on a large table. It's compounded by the server then insisting on doing a crash recovery which seems to take hours. today a customer's server was running out of disk space so i (foolishly) tried to purge some now redundant audit stuff with a "delete where like". And now my customer looks like being without a server for an entire business day. Is there a setting to suppress the initial timeout and 'intentional crash'? |
| Comments |
| Comment by Marko Mäkelä [ 2018-10-09 ] |
|
nigelgomm, you are being hit by the InnoDB watchdog. It is catching a long wait for an InnoDB internal RW-lock or mutex. These should normally not be held for more than a few milliseconds. The option that you asked for is called innodb_fatal_semaphore_wait_threshold. Its default value is 600 seconds. A hung server would typically not be killed after 600 seconds sharp, but more like after 900 seconds (15 minutes). Locks acquired by user transactions can be held for arbitrarily long times without InnoDB’s watchdog getting upset. I would like to see more data so that we can reproduce and diagnose the issue. Can you upload the server error log? |
| Comment by Nigel Gomm [ 2018-10-09 ] |
|
attached .err file |
| Comment by Elena Stepanova [ 2018-10-09 ] |
|
Also your ini file(s) please. |
| Comment by Nigel Gomm [ 2018-10-09 ] |
|
is it hitting innodb_fatal_semaphore_wait_threshold that causes the crashrecovery on restart? i.e. if i set it to maximum and just abort a query that takes too long by closing the connection..... will mariaDB insist on the crashrecovery or will it restart immediately. It's the crash recovery on restart is the biggest problem. A query or DDL command - no matter how poorly constructed - should not take down a production server for 24 hours. or do i need to choose something other than innodb? |
| Comment by Vladislav Vaintroub [ 2018-10-09 ] |
|
If you abort the query, MariaDB does not restart. If you can abort the query, that is. A example is e.g in And in the unlikely event that bug occurs still, in newer version, it would be nice , if you could add set core-file=1 in your my.ini [mysqld] section, so that if deadlock / crash occurs again, you could upload the mysqld.dmp from the data directory, and attach it to the bug report. This would help to analyze what happens better. |
| Comment by Nigel Gomm [ 2018-10-10 ] |
|
ok... will do and thank you. |
| Comment by Elena Stepanova [ 2018-11-07 ] |
|
nigelgomm, has the upgrade helped? |