[MDEV-12257] Slow queries with a query time of 601 seconds Created: 2017-03-14  Updated: 2017-05-15  Resolved: 2017-05-15

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.1.21
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Markus Nägele Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: need_feedback
Environment:

CentOS release 6.8 (Final) amd64 on PowerEdge R510


Attachments: Text File DefinitionOfTable_jobhead.txt     Text File DefinitionOfTable_state.txt     Text File GlobalVariables.txt    

 Description   

Since we upgraded from version 10.0.16 to 10.1.21 at the end of January we're finding a certain statement several times a day in the slow query log.

The entry always looks like:

# Time: 170314 15:13:07
# User@Host: owstate[owstate] @  [172.16.185.21]
# Thread_id: 9417524  Schema: optovision  QC_hit: No
# Query_time: 601.125328  Lock_time: 0.000162  Rows_sent: 0  Rows_examined: 166088381
# Rows_affected: 0
SET timestamp=1489500787;
SELECT a.col_OWCOMMISSION, (
SELECT CONCAT(DATE_FORMAT(TIMESTAMP,"%d.%m.%Y %H:%i:%s"),'♣', STATUS)
FROM state
WHERE a.job=jobnr AND TIMESTAMP >= '2017-03-14 15:02:50' AND STATUS IN ('VB','D1','FR','GR','FP','ES','PAD','OR','QS','FA','AF','CT','YK','EK','FE','LS','VA')
ORDER BY TIMESTAMP DESC
LIMIT 1) AS X, JOB,a.col_OWEXTORDERNO,'',a.col_OWDELIVERYNOTE
FROM jobhead a
WHERE a.colACCN='240118' AND a.col_OWCOMMISSION IS NOT NULL
HAVING X IS NOT NULL;

The create statements of the involved tables state and jobhead are attached.

The table state is quite large, it has about 180 million rows (size on disk about 75G).
The table jobhead has about 180000 rows (about 350M on disk).

This statement is executed hundreds of thousands times a day and only differs in the wanted values for colACCN and timestamp ('240118' and '2017-03-14 15:02:50' in the example above).

While running MariaDB 10.0.16 we found it rarely in the slow query log (maybe once a week) with much lower query times (about 10 - 30 seconds). Since we switched to version 10.1.21 we find it in the slow query log 5 - 15 times a day and always with a query time of 601.xxxxxx seconds. Usually the statement needs some milliseconds to finish.

We noticed the variable innodb_fatal_semaphore_wait_threshold with a default value of 600 seconds an wonder if it is related with the issue.

Do you have any ideas what might cause this? How could we do a deeper analysis of this issue?

I attached the values of all variables as well. If any further information is helpful, please let me know.



 Comments   
Comment by Elena Stepanova [ 2017-04-17 ]

It is unlikely that innodb_fatal_semaphore_wait_threshold plays a role here, because that's when InnoDB is supposed to die, along with the server. Still, everything is possible, what does the error log say? If there is a long semaphore wait, there should be InnoDB status output there. Or, maybe there are other errors or warnings around the time when the problem occurs?

Generated at Thu Feb 08 07:56:20 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.