[MDEV-20671] mysql slow connect on localhost Created: 2019-09-26  Updated: 2020-10-02

Status: Open
Project: MariaDB Server
Component/s: Server
Affects Version/s: 10.1.41
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Fayland Lam Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: performance
Environment:

ubuntu 18.04



 Description   

background: it's 92G/20 CPU server with ubuntu 18.04

mysql Ver 15.1 Distrib 10.1.41-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2

issue:

root@z:# time mysql -uroot -pXXX -e "select now()"
+---------------------+
| now()               |
+---------------------+
| 2019-09-25 14:10:24 |
+---------------------+
 
real    0m22.267s
user    0m0.000s
sys     0m0.006s

it takes 20 seconds to connect mysql.

IMPORTANT: it's not always like that, it happens when we have more visits. we use php-fpm and not PERSISTENT=true for PDO since that's localhost connection, it uses sockets (so that's not TCP or network issue).

usually it takes 0.0X seconds.

iostat:

Linux 4.15.0-50-generic (fdb) 09/25/2019 x86_64 (20 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
48.22 0.00 17.26 0.79 3.08 30.65

Device tps kB_read/s kB_wrtn/s kB_read kB_wrtn
loop0 0.00 0.00 0.00 8 0
sdb 0.09 0.27 2.20 24628 204340
sda 735.43 1740.89 5578.76 161421385 517281936
it seems alright.

mysql conf:

user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp
lc-messages-dir = /usr/share/mysql
skip-external-locking
skip-host-cache
skip-name-resolve

bind-address = 0.0.0.0

key_buffer_size = 128M
max_allowed_packet = 32M
thread_stack = 192K
thread_cache_size = 32
wait_timeout = 60

myisam_recover_options = BACKUP
max_connections = 30000
max_connect_errors = 3000
#table_cache = 64
#thread_concurrency = 10
tmp_table_size = 512M
max_heap_table_size = 512M
table_open_cache = 4096

performance_schema = on

innodb_buffer_pool_size=48G
innodb_log_file_size=6144M
innodb_buffer_pool_instances=20
innodb_purge_threads=8
innodb_flush_log_at_trx_commit=2
we have skip-name-resolve so it's not DNS issue

mysql status sorry it's too large so I made it in gist: https://gist.github.com/fayland/97050f70cb258a41f1fa1bd3046df4d2

it would be great that someone can point us the direction to check. if you need any more info, we'll be glad to provide.

Thanks



 Comments   
Comment by Fayland Lam [ 2019-09-26 ]

innodb_thread_concurrency is 0

it happens suddenly and after a while it's all good.
real 0m0.006s

we checked full processlist, no locks. and show status, connections are not full as well.

how to debug that?

Thanks

Comment by Fayland Lam [ 2019-09-26 ]

it's not auth issue

ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)

real 0m8.228s
user 0m0.007s
sys 0m0.000s

if I do not type the password, it's still slow.

Comment by Fayland Lam [ 2019-09-26 ]

flag -A does not help as well. so that's not rehashing as well

# time mysql -uroot -pX -A -e "select now()"
+---------------------+
| now()               |
+---------------------+
| 2019-09-26 00:56:16 |
+---------------------+
 
real    0m19.847s
user    0m0.007s
sys     0m0.000s

Comment by Fayland Lam [ 2019-09-27 ]

I get the server removed. please close it. thanks

Comment by Elena Stepanova [ 2019-10-06 ]

fayland,

Did you get the server removed because you had given up on dealing with it, or because you had found an environmental problem? If it's the latter, the information may be beneficial for the community.

Comment by Fayland Lam [ 2019-10-07 ]

Hello. we changed the server (ubuntu 16.04 and mariadb 10.0.38) and it still happens.
let me know if you need more info.

Thanks

Comment by Fayland Lam [ 2019-10-20 ]

how to debug which part caused the slowness? I can duplicate the issue every day:

# time tcpdump -i any -s 0 -l -vvv -w - dst port 3306  | mysql -A -e "select now()"
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
now()
2019-10-20 12:16:15
716 packets captured
195583 packets received by filter
186039 packets dropped by kernel
 
real    0m8.604s
user    0m0.007s
sys 0m0.042s
# time tcpdump -i any -s 0 -l -vvv -w - dst port 3306  | mysql -A -e "select now()"
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
now()
2019-10-20 12:17:00
887 packets captured
32642 packets received by filter
24068 packets dropped by kernel
 
real    0m2.233s
user    0m0.008s
sys 0m0.008s
# time tcpdump -i any -s 0 -l -vvv -w - dst port 3306  | mysql -A -e "select now()"
now()
2019-10-20 12:17:05
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
470 packets captured
1101 packets received by filter
0 packets dropped by kernel
 
real    0m0.116s
user    0m0.006s
sys 0m0.026s

Comment by Daniel Black [ 2020-10-02 ]

Configuration suggestion

threadcache

| Threads_cached                                               | 32                                     |
| Threads_connected                                            | 2899                                   |
| Threads_created                                              | 8374381                                |

Increase your thread cache significantly - given 2900 are connection maybe 3000 is a good number. This will reduce the connection time by having a thread created in advance.

On tracing:

strace the server at the time of connection.

strace server

strace -fe trace=network -o /tmp/maradb-server.strace -p $(pidof mysqld)

Had bit on the server strace is going to be filtering the right connection thread. Look for the connection.

Tracing the client may see time delays at certain points in the protocol:

strace client

strace -o /tmp/mariadb-client.strace} -s 999 -- mysql -A -e "select now()"

Generated at Thu Feb 08 09:01:17 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.