Details
Bug
Status: Closed
Major
Resolution: Fixed
6.4.2, 6.4.4, 22.08.3
None
Description
A client whose query runs longer than the wait_timeout value on the backend server is disconnected by MaxScale while the query is still running on the backend server.
Observations:
1) The client connects to the master and the slave.
2) Once the master's session reached wait_timeout (120s), the client connection was disconnected from both the master and the slave; however, the session on the slave was not disconnected and kept running.
In the above, user=allen is the client user connecting through MaxScale.
MaxScale threw the following error:
2022-12-12 14:20:29 info : (10) [readwritesplit] (ReadWriteSplitService); Master 'server1' failed: #HY000: Lost connection to backend server: network error (server1: 104, Connection reset by peer)
2022-12-12 14:20:29 error : (10) [readwritesplit] (ReadWriteSplitService); Lost connection to the master server, closing session. Lost connection to master server while connection was idle. Connection has been idle for 120 seconds. Error caused by: #HY000: Lost connection to backend server: network error (server1: 104, Connection reset by peer). Last close reason: <none>. Last error:
2022-12-12 14:20:29 info : (10) Stopped ReadWriteSplitService client session [10]
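When triaging several such sessions, it can help to pull the key fields out of the error line: the session id, the reported idle time, and the backend errno. The following is a minimal sketch that assumes the message layout shown in the log above. The function name and regular expressions are hypothetical helpers written for this report, not part of MaxScale.

```python
import re

# Hypothetical helper, not part of MaxScale. The patterns assume the message
# layout of the readwritesplit error line quoted in this report.
IDLE_RE = re.compile(r"Connection has been idle for (\d+) seconds")
BACKEND_RE = re.compile(r"\((\w+): (\d+), ([^)]+)\)")
SESSION_RE = re.compile(r"^\S+ \S+\s+\w+\s*: \((\d+)\)")


def parse_lost_connection(line):
    """Extract session id, idle time, and backend error details from one log line."""
    idle = IDLE_RE.search(line)
    backend = BACKEND_RE.search(line)
    session = SESSION_RE.search(line)
    return {
        "session_id": int(session.group(1)) if session else None,
        "idle_seconds": int(idle.group(1)) if idle else None,
        "server": backend.group(1) if backend else None,
        "errno": int(backend.group(2)) if backend else None,
        "reason": backend.group(3) if backend else None,
    }


# The error line from the log above, split across string literals for readability.
sample = (
    "2022-12-12 14:20:29 error : (10) [readwritesplit] (ReadWriteSplitService); "
    "Lost connection to the master server, closing session. "
    "Lost connection to master server while connection was idle. "
    "Connection has been idle for 120 seconds. "
    "Error caused by: #HY000: Lost connection to backend server: network error "
    "(server1: 104, Connection reset by peer). Last close reason: <none>. Last error:"
)

info = parse_lost_connection(sample)
print(info)
```

The extracted idle time (120 seconds) is the value to compare against the session's effective wait_timeout on the master.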
At the same time, the MariaDB master's SQL error log reported the following error. The slave's log did not, because that session was still alive on the slave server.
2022-12-11 21:20:29 allen[allen] @ [192.168.254.29] ERROR 1159: Got timeout reading communication packets : (null)
The following are the timeout-related settings from the MariaDB server:
innodb_flush_log_at_timeout=1
innodb_lock_wait_timeout=180
innodb_rollback_on_timeout=OFF
interactive_timeout=28800
lock_wait_timeout=10800
net_read_timeout=30
net_write_timeout=60
rpl_semi_sync_master_timeout=10000
rpl_semi_sync_slave_kill_conn_timeout=5
slave_net_timeout=10
thread_pool_idle_timeout=60
wait_timeout=3600
idle_readonly_transaction_timeout=0
idle_transaction_timeout=0
idle_write_transaction_timeout=0
delayed_insert_timeout=300
deadlock_timeout_long=50000000
deadlock_timeout_short=10000
connect_timeout=10
Interestingly, the user confirmed that this does not happen in 2.5.20, and I could confirm the same.