[MDEV-6277] Query causes a deadlock Created: 2014-05-28 Updated: 2014-07-08 Due: 2014-07-07 Resolved: 2014-07-08 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | None |
| Affects Version/s: | 5.5.36 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Critical |
| Reporter: | Slawomir Pryczek | Assignee: | Elena Stepanova |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Linux FC11 |
||
| Description |
|
Hi Guys, im having a query that takes like 2-5 seconds to execute, but sometimes is causing a deadlock... Like 1 thread is "Sending data" indefinitely (>3 hours) and everything else is waiting in "Waiting for table metadata lock", this query would take only a couple of seconds, it's issued eg. 5000 times in short period of time (with very short wait between the queries) From spec i'm seeing this: Question is why all threads are waiting for "DDL" lock while no DDL operations are running (does lock table for write requires metadata lock?) (A) Here is the offending query, after killing it everything gets processed. (B) There is another query like this, this was issued after the first one... General states: Everything is waiting for metadata, one replication, one send thread. The questions are ... For reproducing, i can only send SHOW PROCESSLIST dump, because this happens like once every month, so no idea how more info about the issue can be extracted... Thanks as always... |
| Comments |
| Comment by Slawomir Pryczek [ 2014-05-29 ] |
|
I was able to fix it by issuing When it was at 400 - just one thread from 32 was at 100% CPU usage, with vmstat giving no read / writes ... is that possible that some OS limit prevented files from being open and mysql isn't reporting it in error log / dmesg? Or it may be some problem with table cache (as changing it caused the problem to perish)? Is there possibility to add some timeout for opening files? |
| Comment by Slawomir Pryczek [ 2014-05-29 ] |
|
Actually it's worse, after increasing number of files it seems that there is a deadlock: 77.93% mysqld [.] MDL_lock::visit_subgraph(MDL_ticket*, MDL_wait_for_graph_visitor*) Nothing gets processed... any advices on that? |
| Comment by Elena Stepanova [ 2014-06-01 ] |
|
Hi Slawomir, Regarding the last comment: For the previous occasions – does the error log say anything? Thanks. |
| Comment by Slawomir Pryczek [ 2014-06-03 ] |
|
Hi Elena... thanks for the response I'm using aria engine tables, there is no deadlock per-se, like innodb deadlock... just all threads are waiting in "opening tables" state, then they all, go into "waiting for metadata lock" state then after like 1h everything gets back to normal if we're lucky if not it'll stay like that for more time. What i realized, i killed all ariadb queries, one by one... and some processes just lingered in "Killed" state like for half hour. New processes went straight to table lock state, like opening tables was somehow locked globally I realized innodb queries seem to be processed at that time, so just like aria is locked up... The query takes about 1-2 seconds, rather small result set - 2k rows always (we process in 2k batches) The error is very hard to get... like the server is working ok for 1-2 days and then it locks suddenly for a couple of hours. Nothing in error log / dmesg. We simplified that create table and it helped a little, so this error doesn't appear that often. Is that possible something with aria engine is broken, like i don't know it's unable to process many threads waiting for locks? 77.93% mysqld [.] MDL_lock::visit_subgraph(MDL_ticket*, MDL_wait_for_graph_visitor*) Thanks... |
| Comment by Slawomir Pryczek [ 2014-06-03 ] |
|
Is that possible that innodb took some shared resources (like some mutex or file handles) and that made aria unable to process requests? |
| Comment by Slawomir Pryczek [ 2014-06-03 ] |
|
There are some warnings in error log, but these seem unrelated [Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. INSERT... ON DUPLICATE KEY UPDATE on a table with more than one UNIQUE KEY is unsafe Statement |
| Comment by Elena Stepanova [ 2014-06-07 ] |
|
Hi Slawomir,
Please do, you can attach it to this report, or upload to the private section of our ftp (ftp.askmonty.org/private), whichever you prefer. Do you happen to also have SHOW STATUS and/or SHOW GLOBAL STATUS in your monitoring? If you do, please attach it is well. A generalized answer to your questions of the "is it possible" kind is that what you observe shouldn't be happening, but everything is possible if there is a bug, so it would be irresponsible to say a definite "no" without deeper investigation.
Have you kept the new value? Did the problem ever re-appear with this new value?
Which number of files did you increase?
You can try to decrease the value of lock_wait_timeout (by default it's 1 year). Set it to lets say 1 hour, or 10 min, or 1 min, or whichever you think is reasonable with your data. It will mean that whatever is waiting for any table lock will hit the timeout and fail with the timeout error, thus the deadlock, unless it's a true permanent deadlock, should resolve itself. If having some queries fail with the timeout suits your workflow, you can use it as a workaround. |
| Comment by Elena Stepanova [ 2014-07-08 ] |
|
Closing as incomplete for now, if you have additional information, please comment to re-open. |