[MXS-2562] Oracle's MySQL Connector/ODBC gets packets out-of-order errors with .NET Created: 2019-06-13 Updated: 2020-08-25 Resolved: 2019-07-01 |
|
| Status: | Closed |
| Project: | MariaDB MaxScale |
| Component/s: | Core, readwritesplit |
| Affects Version/s: | 2.3 |
| Fix Version/s: | 2.3.9, 2.4.1 |
| Type: | Bug | Priority: | Major |
| Reporter: | markus makela | Assignee: | markus makela |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Sprint: | MXS-SPRINT-85 | ||||||||||||||||||||
| Description |
|
A user tried to use Oracle's MySQL Connector/ODBC in a .NET environment to communicate with MaxScale, and they saw errors like the following:
|
| Comments |
| Comment by Vladislav Vaintroub [ 2019-06-13 ] | ||||||||||||||||||||||||
|
Actually, several connectors, whether .NET or not, check the sequence number coming from the server. Our Connector/C and JDBC connectors do not do this, but C/C used to in the past. One example is PyMySQL (https://github.com/PyMySQL/PyMySQL/issues/526): our server, on shutdown, sometimes sent an out-of-order error packet with seqno 0 to the client, and this caused some problems in Python. | ||||||||||||||||||||||||
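To illustrate the kind of check described above, here is a minimal sketch of client-side sequence validation. It relies only on the MySQL packet header layout (a 3-byte little-endian payload length followed by a 1-byte sequence id); the names `read_header`, `check_sequence`, and `PacketSequenceError` are hypothetical and are not taken from any of the connectors mentioned.

```python
from typing import Tuple


class PacketSequenceError(Exception):
    """Raised when a packet's sequence id is not the one the client expects."""


def read_header(header: bytes) -> Tuple[int, int]:
    """Split a 4-byte MySQL packet header into (payload_length, sequence_id)."""
    if len(header) != 4:
        raise ValueError("a MySQL packet header is exactly 4 bytes")
    payload_length = int.from_bytes(header[:3], "little")
    return payload_length, header[3]


def check_sequence(header: bytes, expected: int) -> int:
    """Validate the sequence id and return the next expected id (wraps at 256)."""
    _, sequence_id = read_header(header)
    if sequence_id != expected:
        raise PacketSequenceError(
            f"got sequence id {sequence_id}, expected {expected}"
        )
    return (sequence_id + 1) % 256
```

A strict client that has just sent a command with sequence id 0 would call `check_sequence(header, 1)` on the first response packet, so an error packet arriving with sequence id 0 would be rejected as out of order.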
| Comment by markus makela [ 2019-06-27 ] | ||||||||||||||||||||||||
|
Finally managed to reproduce this with https://github.com/mysql-net/MySqlConnector and it is indeed a "bug" in the connector. It treats errors with mismatching sequence numbers as broken packets, which prevents any and all reconnection functionality from working. Just skipping the sending of the error packet appears to solve it in the MaxScale case, but it still fails when a server sends an error on shutdown with a mismatching sequence number. | ||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2019-06-27 ] | ||||||||||||||||||||||||
|
markus makela, does the server still send anything on shutdown? I thought I removed that stuff. It was well meant, but it is really better to close the socket than to break others' apps. | ||||||||||||||||||||||||
| Comment by markus makela [ 2019-06-27 ] | ||||||||||||||||||||||||
|
It still sends it, at least with the latest 10.3. | ||||||||||||||||||||||||
| Comment by markus makela [ 2019-06-27 ] | ||||||||||||||||||||||||
|
On the other hand, if the server knows the sequence number it could simply use that. The problem with this probably lies in the case where the error would be received in a place where it wouldn't be expected. | ||||||||||||||||||||||||
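The idea of reusing a known sequence number could be sketched roughly as follows. This is not MaxScale's implementation: `SequenceTracker` and its methods are hypothetical names, and the fallback value of 1 when no exchange is in progress is an assumption borrowed from the suggestion made elsewhere in this thread.

```python
class SequenceTracker:
    """Remembers the sequence id of the last packet seen in the current exchange."""

    def __init__(self, idle_sequence_id: int = 1):
        # Assumption: value to use when no command/response exchange is active.
        self.idle_sequence_id = idle_sequence_id
        self.last_seen = None

    def on_packet(self, header: bytes) -> None:
        """Record the sequence id (4th header byte) of a packet passing through."""
        self.last_seen = header[3]

    def on_exchange_complete(self) -> None:
        """Forget the last id once the response to the command has fully arrived."""
        self.last_seen = None

    def next_sequence_id(self) -> int:
        """Sequence id a locally generated packet would need in order to fit in."""
        if self.last_seen is None:
            return self.idle_sequence_id
        return (self.last_seen + 1) % 256
```

Matching the number only keeps the packet in sequence; it does not by itself address the concern above about an error arriving at a point where the client does not expect one.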
| Comment by markus makela [ 2019-06-27 ] | ||||||||||||||||||||||||
|
As a compromise solution, we can add a parameter into MaxScale that allows the user to control whether to send certain error messages generated internally by MaxScale. This should allow connectors that don't work with the current behavior to work while still retaining the problem resolution benefits that the extra errors provide. | ||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2019-06-27 ] | ||||||||||||||||||||||||
|
I do not see how the server could send anything on shutdown now. For unresponsive connections, it calls close_connection(thd) without errno, which skips net_send_error. | ||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2019-06-27 ] | ||||||||||||||||||||||||
|
markus makela, you could probably always use seqno=1 instead of 0, except for the very first "server hello". | ||||||||||||||||||||||||
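A rough sketch of this suggestion, assuming the standard ERR packet layout used with the 4.1 protocol (0xFF marker, 2-byte error code, `#`, 5-character SQLSTATE, message). The helper names `error_sequence_id` and `build_err_packet` are hypothetical, and the error code, SQLSTATE, and message in the usage example are illustrative values only.

```python
def error_sequence_id(handshake_sent: bool) -> int:
    """Use 0 only if the error replaces the initial "server hello"; otherwise 1."""
    return 1 if handshake_sent else 0


def build_err_packet(sequence_id: int, error_code: int,
                     sqlstate: str, message: str) -> bytes:
    """Serialize an ERR packet: 4-byte header followed by the error payload."""
    if len(sqlstate) != 5:
        raise ValueError("SQLSTATE must be exactly 5 characters")
    payload = (
        b"\xff"                              # ERR packet marker
        + error_code.to_bytes(2, "little")   # error code, little-endian
        + b"#"                               # SQLSTATE marker
        + sqlstate.encode("ascii")
        + message.encode("utf-8")
    )
    # Header: 3-byte little-endian payload length, then the 1-byte sequence id.
    header = len(payload).to_bytes(3, "little") + bytes([sequence_id])
    return header + payload


# Usage: an error generated after the handshake, while the client is idle.
packet = build_err_packet(error_sequence_id(True), 1053, "08S01",
                          "Server shutdown in progress")
```

The value 1 is what a client expects for the first packet of a response, because its own command packet carries sequence id 0.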
| Comment by markus makela [ 2019-06-27 ] | ||||||||||||||||||||||||
|
I was mistaken earlier: the test I did was not with 10.3 but with a 10.2 docker image (10.2.23-MariaDB-1:10.2.23+maria~bionic-log). Re-testing with 10.3.15 shows that the problem is still there. I posted my findings here on this mysql-net/MySqlConnector issue. You can find the network trace there that shows the exact error. This one in particular is quite interesting: it shows that the connector throws an exception even when the resultset is complete and the "expected" sequence number should in theory be 0. | ||||||||||||||||||||||||
| Comment by markus makela [ 2019-06-27 ] | ||||||||||||||||||||||||
|
Always returning a sequence number of 1 for any errors that MaxScale generates will solve the case where no query is in progress for the client. There still appears to be a case where the result is interleaved with an error packet with the wrong sequence number:
| ||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2019-06-27 ] | ||||||||||||||||||||||||
|
not always returning seqno 1
something like this | ||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2019-06-27 ] | ||||||||||||||||||||||||
|
I created and fixed | ||||||||||||||||||||||||
| Comment by markus makela [ 2019-06-27 ] | ||||||||||||||||||||||||
|
Yup, we'll definitely change that in the next maintenance releases. | ||||||||||||||||||||||||
| Comment by markus makela [ 2019-06-27 ] | ||||||||||||||||||||||||
|
I've found another interesting case where it can happen:
It seems that MaxScale doesn't set the correct sequence number when an error is generated while a COM_CHANGE_USER is in progress. | ||||||||||||||||||||||||
| Comment by Bradley Grainger [ 2019-06-27 ] | ||||||||||||||||||||||||
|
MySqlConnector client-side bug tracked here: https://github.com/mysql-net/MySqlConnector/issues/662 |