[MDEV-16046] All query hangs in processlist Created: 2018-04-27  Updated: 2019-02-17  Resolved: 2019-02-17

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.0.27
Fix Version/s: N/A

Type: Bug
Priority: Critical
Reporter: louis hust
Assignee: Unassigned
Resolution: Incomplete
Votes: 0
Labels: need_feedback
Environment: CentOS 6.8 x86_64

Attachments: File SGRDB-mysql01.err, JPEG File iostat.jpeg, File my3309.cnf, JPEG File processlist0.jpeg, JPEG File processlist1.jpeg, JPEG File processlist2.jpeg, JPEG File processlist3.jpeg, Text File show_engine_innodb.txt, Text File show_status.txt, HTML File stack

 Description   

Suddenly, all queries on MariaDB hang and cannot execute normally.

The attachments contain the stack trace, processlist, error log, and InnoDB status.

From SHOW ENGINE INNODB STATUS we got the following information:

Dictionary memory allocated 11142855
Buffer pool size        1441791
Buffer pool size, bytes 23622303744
Free buffers            0
Database pages          1302517
Old database pages      480825
Modified db pages       81780
Percent of dirty pages(LRU & free pages): 6.279
Max dirty pages percent: 75.000
Pending reads 102
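
The figures above can be cross-checked against each other. The following is a minimal sketch, assuming the default 16 KiB innodb_page_size and assuming that the dirty-page percentage is computed as modified pages divided by the sum of database (LRU) pages and free buffers, as the label "LRU & free pages" suggests. The function names are hypothetical and only serve this illustration.

```python
# Figures copied from the SHOW ENGINE INNODB STATUS excerpt above.
PAGE_SIZE = 16384  # assumption: default innodb_page_size (16 KiB)

buffer_pool_pages = 1441791
buffer_pool_bytes = 23622303744
free_buffers = 0
database_pages = 1302517
modified_pages = 81780


def pool_bytes(pages, page_size=PAGE_SIZE):
    """Buffer pool size in bytes for a given number of pages."""
    return pages * page_size


def dirty_percent(modified, lru_pages, free_pages):
    """Dirty pages as a percentage of LRU plus free pages (assumed formula)."""
    return 100.0 * modified / (lru_pages + free_pages)


if __name__ == "__main__":
    print("pool bytes:", pool_bytes(buffer_pool_pages))
    print("dirty %:", round(dirty_percent(modified_pages, database_pages, free_buffers), 3))
```

Under these assumptions the reported byte size and the reported 6.279 dirty-page percentage both agree with the page counts. The sketch only checks that the numbers are internally consistent; it does not say whether a free-buffer count of 0 is related to the hang.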

Free buffers is 0; is that normal?

Also, according to the stack trace, many threads hang in the select() function.

Is this a bug?



 Comments   
Comment by Elena Stepanova [ 2018-05-31 ]

svoj, marko, Elkin, thoughts?

This is broad: it could be locking, it could be InnoDB, and it could even be related to semi-sync somehow, I guess.

Comment by Sergey Vojtovich [ 2018-05-31 ]

FWICS, clients connect from a different host. According to the "stack" attachment, most threads are stuck in poll/select. Could this be due to a network problem?

Comment by Andrei Elkin [ 2018-05-31 ]

Who could say how handler::ha_rnd_next() ends up calling select():

#0  0x00007f6dd44a6623 in select () from /lib64/libc.so.6
#1  0x00000000009b9f0f in ?? ()
#2  0x0000000000a83404 in ?? ()
#3  0x0000000000a749cd in ?? ()
#4  0x0000000000a8670c in ?? ()
#5  0x0000000000a87163 in ?? ()
#6  0x0000000000a73903 in ?? ()
#7  0x0000000000a616e5 in ?? ()
#8  0x0000000000a07c21 in ?? ()
#9  0x0000000000a0a8e0 in ?? ()
#10 0x000000000096d4cb in ?? ()
#11 0x00000000006ea044 in handler::ha_rnd_next(unsigned char*) ()
#12 0x00000000007d7f3c in rr_sequential(READ_RECORD*) ()
#13 0x00000000007f4914 in mysql_delete(THD*, TABLE_LIST*, Item*, SQL_I_List<st_order>*, unsigned long long, unsigned long long, select_result*) ()
#14 0x0000000000598eb2 in mysql_execute_command(THD*) ()
#15 0x000000000059c921 in mysql_parse(THD*, char*, unsigned int, Parser_state*) ()

Comment by Andrei Elkin [ 2018-05-31 ]

As to the semisync involvement, messages like

180427 12:48:21 [Warning] Timeout waiting for reply of binlog (file: SGRDB-mysql01-bin.010453, pos: 520), semi-sync up to

appear in the log sporadically all the time, but there was always progress. I would downvote it as the main suspect.

Comment by louis hust [ 2018-06-01 ]

At first I guessed that select() under handler::ha_rnd_next was thread-pool behavior, but I checked the configuration file and the thread pool is not enabled. Maybe async I/O?

Comment by Elena Stepanova [ 2019-01-19 ]

louis.hust,

Are you still experiencing the problem?
It appears that you're building the binaries from source. Can you resolve the stack trace?

Comment by louis hust [ 2019-01-20 ]

Hi @elenst,

I am no longer experiencing the problem.

The stack file is in the attachments. How do I resolve it?

Comment by Elena Stepanova [ 2019-01-20 ]

You would need the same exact binary, but if you're not having the problem any longer, you've probably upgraded since then.
Here is another question – your error log shows that you have CONNECT engine installed, could this stack trace actually belong to it and not to plain InnoDB?

Comment by louis hust [ 2019-01-20 ]

What does "the same exact binary" mean?
We have never used the CONNECT engine.

Comment by Elena Stepanova [ 2019-01-20 ]

The same exact mysqld binary which you used at the time when the stack trace was produced. Are you still running it? If you are, have you done something to make the problem disappear? Or was it a one-time occurrence?
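
On the question of how to resolve the trace: the unresolved `?? ()` frames could, in principle, be symbolized by feeding their addresses to binutils `addr2line` together with that same mysqld binary. The following is a minimal sketch, assuming `addr2line` is installed, the binary still carries symbols (or separate debug info is available), and the binary is loaded at a fixed address so the raw frame addresses can be used directly. The helper names and the `/path/to/mysqld` placeholder are hypothetical.

```python
import re
import shlex

# Matches gdb backtrace frames whose symbol is unresolved, e.g.
#   #1  0x00000000009b9f0f in ?? ()
FRAME_RE = re.compile(r"^#(\d+)\s+(0x[0-9a-fA-F]+) in \?\? \(\)")


def unresolved_addresses(trace_text):
    """Return the addresses of all '?? ()' frames, in trace order."""
    addrs = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            addrs.append(match.group(2))
    return addrs


def addr2line_command(binary_path, addresses):
    """Build an addr2line command line: -f prints function names,
    -C demangles C++ symbols, -e names the executable."""
    return "addr2line -f -C -e " + shlex.quote(binary_path) + " " + " ".join(addresses)


# A few frames taken from the trace quoted earlier in this thread.
SAMPLE = """\
#0  0x00007f6dd44a6623 in select () from /lib64/libc.so.6
#1  0x00000000009b9f0f in ?? ()
#2  0x0000000000a83404 in ?? ()
#11 0x00000000006ea044 in handler::ha_rnd_next(unsigned char*) ()
"""

if __name__ == "__main__":
    # /path/to/mysqld is a placeholder for the exact binary that produced the trace.
    print(addr2line_command("/path/to/mysqld", unresolved_addresses(SAMPLE)))
```

The printed command still has to be run on a machine that has the exact binary; with a different build the addresses would map to unrelated code.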
