[MXS-1734] Packet received out-of-order. Expected 1; got 0
Created: 2018-03-21  Updated: 2018-05-02  Resolved: 2018-05-02

Status: Closed
Project: MariaDB MaxScale
Component/s: readwritesplit
Affects Version/s: 2.2.4
Fix Version/s: N/A

Type: Bug
Priority: Critical
Reporter: markus makela
Assignee: markus makela
Resolution: Not a Bug
Votes: 0
Labels: None

Attachments: PNG File Screen Shot 2018-03-21 at 11.33.53.png    
Sprint: MXS-SPRINT-57

 Description   

The following error is logged when using the .NET connector:

Packet received out-of-order. Expected 1; got 0



 Comments   
Comment by markus makela [ 2018-03-21 ]

Possible culprit:

        CHK_SESSION(session);
        if (session->state != SESSION_STATE_DUMMY && !session_valid_for_pool(session))
        {
            // The client did not send a COM_QUIT packet
            modutil_send_mysql_err_packet(dcb, 0, 0, 1927, "08S01", "Connection killed by MaxScale");
        }
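
A rough sketch of why this code path could produce the reported error. It assumes that the second argument of modutil_send_mysql_err_packet is the packet sequence number (0 here) and that the client is waiting for the first response packet (sequence 1) to a command it has just sent. The helper names are illustrative and are not part of MaxScale or any connector:

```python
from typing import Tuple


def build_packet(seq: int, payload: bytes) -> bytes:
    """Frame a payload as a MySQL packet: 3-byte little-endian length + 1-byte sequence id."""
    return len(payload).to_bytes(3, "little") + bytes([seq & 0xFF]) + payload


def build_err_payload(errno: int, sqlstate: str, message: str) -> bytes:
    """ERR payload: 0xFF marker, 2-byte error code, '#', 5-char SQLSTATE, message."""
    return (b"\xff" + errno.to_bytes(2, "little") + b"#"
            + sqlstate.encode("ascii") + message.encode("utf-8"))


def parse_header(packet: bytes) -> Tuple[int, int]:
    """Return (payload_length, sequence_id) from the 4-byte packet header."""
    return int.from_bytes(packet[:3], "little"), packet[3]


def check_sequence(expected: int, packet: bytes) -> None:
    """Mimic a connector's check: reject a packet whose sequence id is not the expected one."""
    _, seq = parse_header(packet)
    if seq != expected:
        raise ValueError(
            f"Packet received out-of-order. Expected {expected}; got {seq}")


if __name__ == "__main__":
    # The client sent a command with sequence 0, so it expects sequence 1 next.
    # An unsolicited error packet framed with sequence 0 fails the check.
    err = build_packet(0, build_err_payload(1927, "08S01",
                                            "Connection killed by MaxScale"))
    try:
        check_sequence(1, err)
    except ValueError as exc:
        print(exc)
```

If this reading is right, any packet injected with sequence 0 while the client waits for a reply would trigger the same message, regardless of the packet's contents.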

Comment by markus makela [ 2018-04-30 ]

Another explanation would be the retry_failed_reads feature. If the connection to a slave is lost in the middle of a large result set, readwritesplit can retry the query, provided autocommit is enabled and there is no open transaction. To verify this, add retry_failed_reads=false to the readwritesplit service configuration.
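
A minimal sketch of that change in the MaxScale configuration file. The section name is a hypothetical placeholder, and the other settings of the existing service are assumed to stay as they are:

```ini
# Hypothetical service section; only the last line is the change under test.
[Read-Write-Service]
type=service
router=readwritesplit
# ... existing servers and credentials settings unchanged ...
retry_failed_reads=false
```

If the error no longer appears with this setting, the retried read is the likely cause.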

Comment by markus makela [ 2018-05-02 ]

This could be caused by MXS-1846. The packet sequence numbers would match the one returned in this case.

Comment by Wagner Bianchi (Inactive) [ 2018-05-02 ]

Markus,

We are running on Microsoft Azure. In an attempt to run MaxScale in an active/passive setup behind Azure's internal load balancer, we added an extra layer to the topology to provide automatic failover for MaxScale (we cannot use keepalived because there is no VIP). We therefore set up LB->HAProxy->MaxScale, where HAProxy was the routing point that sent queries to either MaxScale A or B, thereby implementing the hot-standby behavior we need.

When the error appeared the first time we put this in production, we rolled back to the previous setup and isolated the components, testing each scenario for 24 hours:

1. applications accessing MaxScale->Backends;
2. applications accessing Azure Internal LB->MaxScale->Backends;
3. applications accessing HAProxy->MaxScale->Backends.

The error appeared only in scenario #3, which means the problem was with HAProxy. We tried the latest version, downgraded one version, and removed the tcpack, but the error still occurred. We moved everything to Corosync/Pacemaker, with the Azure LB IP acting as the VIP.

We can close this issue as it is not related to MaxScale.

Thanks, Markus.
