[MXS-2562] Oracle's MySQL Connector/ODBC gets packets out-of-order errors with .NET Created: 2019-06-13  Updated: 2020-08-25  Resolved: 2019-07-01

Status: Closed
Project: MariaDB MaxScale
Component/s: Core, readwritesplit
Affects Version/s: 2.3
Fix Version/s: 2.3.9, 2.4.1

Type: Bug Priority: Major
Reporter: markus makela Assignee: markus makela
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocks
is blocked by MXS-2563 Failing debug assertion at rwsplitses... Closed
Relates
relates to MDEV-19893 Do not send error packets with seqno= 0 Closed
relates to MXS-2157 Packet received out-of-order. Expecte... Closed
Sprint: MXS-SPRINT-85

 Description   

A user tried to use Oracle's MySQL Connector/ODBC in a .NET environment to communicate with MaxScale, and they saw errors like the following:

Packet received out-of-order. Expected 3; got 1.: MySqlProtocolException



 Comments   
Comment by Vladislav Vaintroub [ 2019-06-13 ]

Actually, several connectors, whether .NET or not, check the sequence number coming from the server. Our Connector/C and JDBC connectors do not do this, but C/C used to in the past.

Example https://github.com/PyMySQL/PyMySQL/issues/526 . Our server, on shutdown, sometimes sent an out-of-order error packet with seqno 0 to the client, and this caused some problems in Python.
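For illustration, here is a minimal Python sketch of the kind of strict check such connectors perform. Every MySQL protocol packet starts with a 4-byte header: a 3-byte little-endian payload length followed by a 1-byte sequence number. The names `split_header`, `check_sequence` and `ProtocolError` are hypothetical and not taken from any connector; the error text only mirrors the message quoted in the description.

```python
class ProtocolError(Exception):
    """Raised by this sketch when a packet arrives with an unexpected seqno."""


def split_header(packet: bytes):
    """Return (payload_length, sequence_number) from the 4-byte packet header."""
    if len(packet) < 4:
        raise ValueError("packet shorter than the 4-byte header")
    length = int.from_bytes(packet[0:3], "little")  # 3-byte little-endian length
    seqno = packet[3]                               # 1-byte sequence number
    return length, seqno


def check_sequence(packet: bytes, expected: int) -> int:
    """Validate the packet's seqno and return the next expected value.

    Assumption: a strict client rejects any mismatch, including ERR packets.
    """
    _, seqno = split_header(packet)
    if seqno != expected:
        raise ProtocolError(
            f"Packet received out-of-order. Expected {expected}; got {seqno}."
        )
    return (seqno + 1) % 256  # the sequence number wraps around at 255
```

With a check like this, an error packet that carries seqno 0 or 1 while the client expects 3 is rejected before its payload is even examined, so the client never sees the actual error message.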

Comment by markus makela [ 2019-06-27 ]

Managed to finally reproduce this with https://github.com/mysql-net/MySqlConnector and it is indeed a "bug" in the connector. It treats errors with mismatching sequence numbers as broken packets, which prevents any and all reconnection functionality from working. Just skipping the sending of the error packet appears to solve it in the MaxScale case, but it still fails when a server sends an error on shutdown with a mismatching sequence number.

Comment by Vladislav Vaintroub [ 2019-06-27 ]

markus makela, does the server still send anything on shutdown? I thought I removed that stuff. It was well meant, but it is really better to close the socket than to break others' apps.

Comment by markus makela [ 2019-06-27 ]

It still sends it at least with latest 10.3.

Comment by markus makela [ 2019-06-27 ]

On the other hand, if the server knows the sequence number it could simply use that. The problem with this probably lies in the case where the error would be received in a place where it wouldn't be expected.

Comment by markus makela [ 2019-06-27 ]

As a compromise solution, we can add a parameter into MaxScale that allows the user to control whether to send certain error messages generated internally by MaxScale. This should allow connectors that don't work with the current behavior to work while still retaining the problem resolution benefits that the extra errors provide.

Comment by Vladislav Vaintroub [ 2019-06-27 ]

I do not see how the server could send anything on shutdown now. For unresponsive connections, it calls close_connection(thd) without errno, which skips net_send_error.

Comment by Vladislav Vaintroub [ 2019-06-27 ]

markus makela, you could probably use seqno=1 instead of 0, always, except the very first "server hello"

Comment by markus makela [ 2019-06-27 ]

I was mistaken earlier: the test I thought I did with 10.3 was in fact done with a 10.2 docker image (10.2.23-MariaDB-1:10.2.23+maria~bionic-log). Re-testing with 10.3.15 shows that the problem is still there.

I posted my findings here on this mysql-net/MySqlConnector issue. You can find the network trace there that shows the exact error. This one in particular is quite interesting: it shows that the connector throws an exception even when the resultset is complete and the "expected" sequence number should in theory be 0.

Comment by markus makela [ 2019-06-27 ]

Always returning a sequence number of 1 for any errors that MaxScale generates will solve the case where no query is in progress for the client. There still appears to be a case where the result is interleaved with an error packet with the wrong sequence number:

Unhandled Exception: MySql.Data.MySqlClient.MySqlProtocolException: Packet received out-of-order. Expected 3; got 1.
   at MySqlConnector.Protocol.Serialization.ProtocolUtility.DoReadPayloadAsync(BufferedByteReader bufferedByteReader, IByteHandler byteHandler, Func`1 getNextSequenceNumber, ArraySegmentHolder`1 previousPayloads, ProtocolErrorBehavior protocolErrorBehavior, IOBehavior ioBehavior) in C:\projects\mysqlconnector\src\MySqlConnector\Protocol\Serialization\ProtocolUtility.cs:line 462
   at MySqlConnector.Protocol.Serialization.StandardPayloadHandler.ReadPayloadAsync(ArraySegmentHolder`1 cache, ProtocolErrorBehavior protocolErrorBehavior, IOBehavior ioBehavior) in C:\projects\mysqlconnector\src\MySqlConnector\Protocol\Serialization\StandardPayloadHandler.cs:line 37

Comment by Vladislav Vaintroub [ 2019-06-27 ]

not always returning seqno 1

send_error()
{
  if (seqno == 0)
    seqno = 1
  send_error_internal()
}

something like this
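The guard above can be written out as a runnable Python sketch. This is an illustration only, not MaxScale or server code: `build_error_packet` is a hypothetical helper that assembles an ERR packet (0xff marker, 2-byte little-endian error code, '#', 5-character SQLSTATE, message) and bumps a seqno of 0 to 1 while leaving any other value untouched.

```python
def build_error_packet(seqno: int, errno: int, sqlstate: str, message: str) -> bytes:
    """Assemble a MySQL protocol ERR packet (hypothetical helper, sketch only)."""
    if len(sqlstate) != 5:
        raise ValueError("SQLSTATE must be exactly 5 characters")
    if not 0 <= seqno <= 255:
        raise ValueError("sequence number must fit in one byte")

    # The guard from the pseudocode: never send an error with seqno 0,
    # but keep whatever non-zero sequence number the session is already at.
    if seqno == 0:
        seqno = 1

    payload = (
        b"\xff"                          # ERR packet marker
        + errno.to_bytes(2, "little")    # error code
        + b"#"                           # SQLSTATE marker
        + sqlstate.encode("ascii")       # 5-byte SQLSTATE
        + message.encode("utf-8")        # human-readable message
    )
    header = len(payload).to_bytes(3, "little") + bytes([seqno])
    return header + payload


# seqno 0 is bumped to 1; a mid-conversation seqno such as 3 is preserved.
idle = build_error_packet(0, 1927, "08S01", "Connection killed")
busy = build_error_packet(3, 1927, "08S01", "Connection killed")
```

The point of the guard is that the caller still has to pass in the session's real sequence number; hard-coding 1 only covers the case where no exchange is in progress.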

Comment by Vladislav Vaintroub [ 2019-06-27 ]

I created and fixed MDEV-19893 "Do not send error packets with seqno= 0" for the server. I think it is a pretty straightforward fix. I suggest MaxScale do something similar to that fix.

Comment by markus makela [ 2019-06-27 ]

Yup, we'll definitely change that in the next maintenance releases.

Comment by markus makela [ 2019-06-27 ]

I've found another interesting case where it can happen:

T 127.0.0.1:56220 -> 127.0.0.1:4006 [AP] #53127
  07 00 00 00 03 63 6f 6d    6d 69 74                   .....commit     
#
T 127.0.0.1:4006 -> 127.0.0.1:56220 [AP] #53128
  07 00 00 01 00 00 00 02    00 00 00                   ...........     
#
T 127.0.0.1:56220 -> 127.0.0.1:4006 [AP] #53129
  37 00 00 00 11 62 6f 62    00 14 86 0e 4b 26 f4 33    7....bob....K&.3
  f4 c7 16 64 e8 fa c1 64    36 5e fc af 3a a9 74 65    ...d...d6^..:.te
  73 74 00 2e 00 6d 79 73    71 6c 5f 6e 61 74 69 76    st...mysql_nativ
  65 5f 70 61 73 73 77 6f    72 64 00                   e_password.     
#
T 127.0.0.1:4006 -> 127.0.0.1:56220 [AP] #53130
  2b 00 00 01 fe 6d 79 73    71 6c 5f 6e 61 74 69 76    +....mysql_nativ
  65 5f 70 61 73 73 77 6f    72 64 00 3c 24 41 38 37    e_password.<$A87
  34 6a 23 4d 44 3e 43 2d    6a 5a 49 5f 29 40 3e       4j#MD>C-jZI_)@> 
#
T 127.0.0.1:4006 -> 127.0.0.1:56220 [AP] #53131
  57 00 00 01 ff 87 07 23    30 38 53 30 31 43 6f 6e    W......#08S01Con
  6e 65 63 74 69 6f 6e 20    6b 69 6c 6c 65 64 20 62    nection killed b
  79 20 4d 61 78 53 63 61    6c 65 3a 20 52 6f 75 74    y MaxScale: Rout
  65 72 20 63 6f 75 6c 64    20 6e 6f 74 20 72 65 63    er could not rec
  6f 76 65 72 20 66 72 6f    6d 20 63 6f 6e 6e 65 63    over from connec
  74 69 6f 6e 20 65 72 72    6f 72 73                   tion errors     

It seems that MaxScale doesn't set the correct sequence number when an error is generated while a COM_CHANGE_USER is in progress.
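To make the trace easier to read, the following Python sketch decodes the header and ERR payload of the last packet in the capture. The bytes are copied from the dump above; `parse_header` and `parse_error` are hypothetical helper names used only for this illustration.

```python
def parse_header(packet: bytes):
    """Return (payload_length, seqno) from the 4-byte MySQL packet header."""
    return int.from_bytes(packet[:3], "little"), packet[3]


def parse_error(packet: bytes) -> dict:
    """Decode an ERR packet: 0xff, 2-byte code, '#', 5-byte SQLSTATE, message."""
    length, seqno = parse_header(packet)
    payload = packet[4:4 + length]
    if payload[0] != 0xFF:
        raise ValueError("not an ERR packet")
    return {
        "length": length,
        "seqno": seqno,
        "errno": int.from_bytes(payload[1:3], "little"),
        "sqlstate": payload[4:9].decode("ascii"),  # payload[3] is the '#' marker
        "message": payload[9:].decode("utf-8"),
    }


# Header bytes of the COM_CHANGE_USER request (#53129) and the
# authentication switch request sent in reply (#53130).
change_user = parse_header(bytes.fromhex("37000000"))   # (55, 0)
auth_switch = parse_header(bytes.fromhex("2b000001"))   # (43, 1)

# The error packet (#53131), rebuilt from the hex and ASCII columns of the dump.
err = parse_error(
    bytes.fromhex("57000001ff8707233038533031")
    + b"Connection killed by MaxScale: Router could not recover from connection errors"
)
print(err["seqno"], err["errno"], err["sqlstate"])
```

Both server packets sent in response to the COM_CHANGE_USER carry seqno 1: the authentication switch request and the error that follows it. A client that has already consumed seqno 1 and checks sequence numbers strictly will therefore reject the error packet.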

Comment by Bradley Grainger [ 2019-06-27 ]

MySqlConnector client-side bug tracked here: https://github.com/mysql-net/MySqlConnector/issues/662
