[MXS-47] Session freeze when small tail packet Created: 2015-03-17  Updated: 2015-03-27  Resolved: 2015-03-19

Status: Closed
Project: MariaDB MaxScale
Component/s: mariadbbackend
Affects Version/s: 1.0.4
Fix Version/s: 1.1.0

Type: Bug Priority: Major
Reporter: Yves Trudeau Assignee: markus makela
Resolution: Fixed Votes: 0
Labels: None
Environment:

Ubuntu precise 64bits



 Description   

In our environment, we experienced random freezes and by drilling down, we found one specific query whose resultset size was very close to the MTU and was sometimes sent in 2 chunks. After some tests, adding or removing bytes to columns title, it appears that when the second chunk (the tail), is less than 5 bytes, it freezes while if it is more, it is ok. Here's a debug log of a freeze:

2015-03-17 14:54:09 139839053664384 [poll_waitevents] Read in dcb 0x1dfa420 fd 22
2015-03-17 14:54:09 139839053664384 [gw_read_backend_event] Read dcb 0x1dfa420 fd 22 protocol state 7, MYSQL_IDLE.
2015-03-17 14:54:09 139839053664384 [dcb_read] Read 1448 bytes from dcb 0x1dfa420 in state DCB_STATE_POLLING fd 22.
2015-03-17 14:54:09 139839053664384 [protocol_get_srv_command] Read command UNKNOWN MYSQL PACKET TYPE for fd 22.
2015-03-17 14:54:09 139839053664384 [dcb_write] Wrote 1448 Bytes to dcb 0x1dec390 in state DCB_STATE_POLLING fd 21
2015-03-17 14:54:09 139839053664384 [poll_waitevents] epoll_wait found 1 fds
2015-03-17 14:54:09 139839053664384 [poll_waitevents] event 5 dcb 0x1dfa420 role DCB_ROLE_REQUEST_HANDLER
2015-03-17 14:54:09 139839053664384 [gw_write_backend_event] wrote to dcb 0x1dfa420 fd 22, return 1
2015-03-17 14:54:09 139839053664384 [poll_waitevents] Read in dcb 0x1dfa420 fd 22
2015-03-17 14:54:09 139839053664384 [gw_read_backend_event] Read dcb 0x1dfa420 fd 22 protocol state 7, MYSQL_IDLE.
2015-03-17 14:54:09 139839053664384 [dcb_read] Read 4 bytes from dcb 0x1dfa420 in state DCB_STATE_POLLING fd 22.
2015-03-17 14:54:12 139839053664384 [poll_waitevents] epoll_wait found 1 fds
2015-03-17 14:54:12 139839053664384 [poll_waitevents] event 1 dcb 0x1de0690 role DCB_ROLE_SERVICE_LISTENER
2015-03-17 14:54:12 139839053664384 [poll_waitevents] Accept in fd 12
2015-03-17 14:54:12 139839053664384 [gw_MySQLAccept] Accepted fd 23.

The "Read 4 bytes" is the small tail. When it doesn't freeze, it looks like:

2015-03-17 14:54:08 139839053664384 [readconnroute:routeQuery] Routed command 3 to dcb 0x1dfa420 with return value 1.
2015-03-17 14:54:08 139839053664384 [poll_waitevents] epoll_wait found 1 fds
2015-03-17 14:54:08 139839053664384 [poll_waitevents] event 5 dcb 0x1dfa420 role DCB_ROLE_REQUEST_HANDLER
2015-03-17 14:54:08 139839053664384 [gw_write_backend_event] wrote to dcb 0x1dfa420 fd 22, return 1
2015-03-17 14:54:08 139839053664384 [poll_waitevents] Read in dcb 0x1dfa420 fd 22
2015-03-17 14:54:08 139839053664384 [gw_read_backend_event] Read dcb 0x1dfa420 fd 22 protocol state 7, MYSQL_IDLE.
2015-03-17 14:54:08 139839053664384 [dcb_read] Read 1452 bytes from dcb 0x1dfa420 in state DCB_STATE_POLLING fd 22.
2015-03-17 14:54:08 139839053664384 [protocol_get_srv_command] Read command UNKNOWN MYSQL PACKET TYPE for fd 22.
2015-03-17 14:54:08 139839053664384 [dcb_write] Wrote 1452 Bytes to dcb 0x1df9a00 in state DCB_STATE_POLLING fd 21
2015-03-17 14:54:08 139839053664384 [poll_waitevents] epoll_wait found 1 fds
2015-03-17 14:54:08 139839053664384 [poll_waitevents] event 8197 dcb 0x1df9a00 role DCB_ROLE_REQUEST_HANDLER
2015-03-17 14:54:08 139839053664384 [poll_waitevents] Read in dcb 0x1df9a00 fd 21

Looking at the code, I saw at few places conditions like:

if (nbytes_read < 5)
{

I suspect something is wrong with one of these.

--Yves



 Comments   
Comment by Massimiliano Pinto (Inactive) [ 2015-03-18 ]

Maybe the wrong behaviour here is similar to the LOAD DATA LOCAL INFILE failure with Read connection router that was fixed in 1.0.5

In the develop branch that fix is available, would you mind having a try?

Massimiliano

Comment by Yves Trudeau [ 2015-03-18 ]

Ok, I give it a try.

Regards,

Yves

Le Wed, 18 Mar 2015 10:56:00 +0200 (EET),

Comment by Yves Trudeau [ 2015-03-18 ]

No luck,

maxscale-1.0.5-unstable with the latest commit as 4cceaa84e
still presents the issue.

Le Wed, 18 Mar 2015 13:44:00 +0200 (EET),

Comment by Yves Trudeau [ 2015-03-18 ]

After some tests, I believe I know what's happening. The following tests in gw_read_backend_event is for short packets:

{{
/** This may be either short prefix of a packet, or the tail of it. */
else
{
if (nbytes_read < 5)

{ dcb->dcb_readqueue = gwbuf_append(dcb->dcb_readqueue, read_buffer); rc = 0; goto return_rc; }

}}
Maybe it is ok for short prefixes but it clearly doesn't work for short tails. If I skip the if (nbytes_read < 5), I stop having freezes.

--Yves

Comment by markus makela [ 2015-03-19 ]

We've implemented a fix for this in the mxs-47-fix branch on GitHub. If you can, please test this branch and report if it fixes your issues with sessions freezing.

Comment by Timofey Turenko [ 2015-03-19 ]

before fix this freezes, but now it does not freeze:

for (int i = 1; i < 50000; i++)

{ sprintf(str, "SELECT REPEAT('a',%d)", i); execute_query(Test->conn_rwsplit, str); execute_query(Test->conn_master, str); execute_query(Test->conn_slave, str); }

Can we close the bug?

Generated at Thu Feb 08 03:56:23 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.