[ODBC-68] [ma-2.0.13][5.6.10]MySQL server has gone away error with larger data sets Created: 2016-11-18  Updated: 2017-08-22  Resolved: 2017-05-19

Status: Closed
Project: MariaDB Connector/ODBC
Component/s: None
Affects Version/s: 2.0.12
Fix Version/s: 3.0.1, 2.0.15

Type: Bug Priority: Major
Reporter: Rahul Dhuvad Assignee: Lawrin Novitsky
Resolution: Fixed Votes: 0
Labels: aurora, odbc
Environment:

Windows 64-bit and 32-bit, but only from a VM


Attachments: File MariaDB_MySQLServerGoneAway.7z     Text File SQL01.txt     Text File StandAloneRepro.cpp     File StandAloneRepro.out     File mariadb-connector-odbc-2.0.15-win32.msi     File mariadb-connector-odbc-2.0.15-win64.msi     PNG File screenshot-1.png    
Issue Links:
Duplicate
is duplicated by ODBC-96 Lost connection to MySQL server error... Closed

 Description   

We've a table with about 41 columns in Aurora database with varieties of data types. There are only 12 rows in the table; however row-size in few of the row is really large (500MB+). Total data-size is roughly about 1.7GB here (12 rows X 41 columns).

Same application works fine if I switch to MySQL ODBC driver (any latest version); however with MariaDB ODBC driver I see "MySQL server has gone away" error. Sometime we had also seen "Connection is lost" message in the same case.

I'm attaching table schema and ODBC traces of both failing MariaDB ODBC driver as well passing MySQL ODBC driver for your reference.



 Comments   
Comment by Lawrin Novitsky [ 2017-03-08 ]

The "server is gone" is happening at the execution of INSERT into ALLDATATYPES_AURORA_TGT, that happens after SELECT from ALLDTYPES_BOUNDARY, for which you provided the structure. But that's not a big problem.
Is your application threaded? I am asking because in the trace you gave, I have noticed something strange(as for single threaded program). There is call of SQLFreeHandle(of the STMT handle used for aforementioned SELECT), then comes call of SQLPrepare of that INSERT, and only then SQLFreeHandle returns. And not only that, actually - this is just one example. And both stmt handles(for insert and select) were allocated on the same connection.
Another strange thing is that SQLFreeHandle call is redundant - right before it SQLFreeStmt(SQL_DROP) had been already called on the same handle. And these calls are equivalent. But strange here is, that SQLFreeHandle, in such case, should have SQL_INVALID_HANDLE returned. But it returned SQL_SUCCESS. Maybe there was a bug here in 2.0.12, but I would be surprised.
So, if the application is threaded, then I guess that is the reason of the "server is gone", and not size of fields. I have reasons to think that could have been fixed since 2.0.12. Would it be possible to verify that? Which connector should I provide here Windows or Linux 64bit?

Comment by Rahul Dhuvad [ 2017-03-09 ]

Hi Lawrin,

I'm assuming for SQLFreeHandle, you are talking about line#5371-5380 (correct?):

[ODBC][39207][1479448804.699207][SQLFreeHandle.c][385]
		Entry:
			Handle Type = 3
			Input Handle = 0x7f42d001a110
[ODBC][39207][1479448804.699337][SQLPrepare.c][196]
		Entry:
			Statement = 0x7f42bc08d790
			SQL = [INSERT INTO aurorasqoop.ALLDATATYPES_AURORA_TGT(COL_BIT,COL_TINYINT_UNSIGNED,COL_TINYINT_SIGNED,COL_BOOLEAN,COL_SMALLINT_UNSIGNE...][length = 757]
[ODBC][39207][1479448804.957118][SQLFreeHandle.c][497]
		Exit:[SQL_SUCCESS]

Our application is single threaded only. Good observation however. I'm not sure reason for this interleaving traces.

You mentioned about both INSERT and SELECT is part of the same connection handle; is that an issue?

We are already on MariaDB ODBC driver/connector 2.0.13 now and continue to see the issue. In fact, we now have a customer situation using our Aurora database via MariaDB ODBC driver. They reported hang issues, however our in-house testing is seeing "Error [ FnName: Execute – [ma-2.0.13][5.6.10]Lost connection to MySQL server during query].” at different stages. Would you be interested in having ODBC traces for that – need to check the feasibility as we've over million rows and complex Informatica mappings.

Comment by Lawrin Novitsky [ 2017-03-09 ]

Thanks.
> You mentioned about both INSERT and SELECT is part of the same connection handle; is that an issue?

No, that is not an issue. I just saw it as another evidence, that application is probably threaded.

>Would you be interested in having ODBC traces for that

Yes, we are interested in all problems traces. But please, in a separate report.
p.s. Still not clear why SQLFreeHandle didn't return SQL_INVALID_HANDLE

Comment by Lawrin Novitsky [ 2017-03-10 ]

Do you use same server for mysql and mariodb connectors test? What is size of max_allowed_packet?
Could you please try to make it larger? Like 500+M.
Interesting though, that I don't remember parameter buffers to be that big for the query causing disconnect

Comment by Rahul Dhuvad [ 2017-03-10 ]

Yes, it's same Aurora database we use to connect and both MariaDB and MySQL ODBC drivers/applications are hosted on the same machine. We have not tuned any parameters. Can you suggest how to set max_allowed_packet from a ODBC application? We are using DSN-less connections, so enabling any property via DSN may not help us.

Comment by Lawrin Novitsky [ 2017-03-10 ]

No, that is server setting. You can either give --max_allowed_packet=XXXM parameter when starting server, or add max_allowed_packet=XXXM to [mysqld] section of my.cnf
If the user used for odbc tests have such privilege, you can run smth like
set @@global.max_allowed_packet=1024*1024*512
query. It can't be changed for the session. But you will have it changed for the next tests run.

Comment by Rahul Dhuvad [ 2017-03-13 ]

Thanks. I'm working with our QA to see if this setting can be changed with Aurora database that we are connecting to. I'll update you once I've details.

Comment by Rahul Dhuvad [ 2017-03-24 ]

I'm able to reproduce this with a standalone ODBC program. Issue is seen when either MySQL or Aurora database is very remote (in my case, ODBC application is in India machine, whereas database is in US region). With MySQL ODBC driver, we never hit the reported issue; however with MariaDB ODBC driver we always hit it on some Windows boxes we have. I was able to narrow down to the target table schema as:

CREATE TABLE `lostconn_tgt` (
`col_text` text,
`col_mediumblob` mediumblob,
`col_mediumtext` mediumtext,
`col_longblob` longblob,
`col_longtext` longtext
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

StandAloneRepro.cpp StandAloneRepro.out

Comment by Lawrin Novitsky [ 2017-03-29 ]

1.0 and 3.0 connectors are not vulnerable. 3.0 is kinda unexpected here.
Also I guess we can assume, that the problem is in the underlying C/C library

Comment by Lawrin Novitsky [ 2017-03-29 ]

One more thing - I added to the testcase dropping, creating and dropping of the table(and removed truncating). And connection gets lost either on dropping, or on creating table, or on insert execution.

Comment by Rahul Dhuvad [ 2017-03-29 ]

@novitsky, do you mean you are now able to reproduce the issue?

Comment by Lawrin Novitsky [ 2017-03-29 ]

Rahul Dhuvad, yes. But it hasn't made anything clear. Rather the other way around

Comment by Rahul Dhuvad [ 2017-04-23 ]

Hi Lawrin,

During our upgrade to use 2.0.13 (from before 2.0.11) MariaDB ODBC driver, our customer is hitting this issue during "TRUNCATE TABLE..." command execution. We're seeing "Lost Connection" issue. Attaching the ODBC trace for the same. We're not able to hit the issue in-house though. After moving back to 2.0.11, customer confirms this issue goes away. Can you please check if anything suspicious fixed incorrectly between 2.0.11 and 2.0.13? Customer was consistently seeing this error. SQL01.txt

Thanks,
Rahul.

Comment by Rahul Dhuvad [ 2017-04-25 ]

Did you get a chance to check on this? Customer had to rollback to 2.0.11; however as we know that version has other issues like array fetches. Any workarounds like any connection configuration etc we can try, let us know that as well.

Comment by Lawrin Novitsky [ 2017-05-02 ]

Sorry for delay with answer - i had mix of sickness, vacation and traveling.

I think this has to be a separate report/issue. Unless it's only repeatable in the VM with remote server.

I couldn't repeat the issue both with 2.0.13 and 2.0.14(http://downloads.mariadb.com/Connectors/odbc/connector-odbc-2.0.14/) . So, maybe it has something to do with this issue. Could you please verify with customer, if they hit the issue running their application in a VM? Is it on Windows? What is the server version?

Also I would appreciate if your customer tried with 2.0.14.

Comment by Lawrin Novitsky [ 2017-05-18 ]

We think we might have a solution for the problem. I have attached msi's here. This is not official release yet, but I'd appreciate if you could verify that. I actually will try, too. I forgot I could re-create it

Comment by Lawrin Novitsky [ 2017-05-18 ]

Hmm, looks like I cannot re-create the issue any more. What is strange. So I can't really verify if the fix really works.
One more thing, as I probably said, the problem wasn't really in the C/ODBC. That's why I did not try the fix yet.

Comment by Rahul Dhuvad [ 2017-05-19 ]

I tried with the attached 2.0.15 win64 msi and the latest issue reported on "2017-04-23 16:17" by me above seems have fixed! I ran our Informatica task which was consistently failing with either "Lost connection" or "MySQL server has gone away" are working fine. Below is the screenshot from our Activity log of Informatica:

I'll provide this patch to our QA to verify original issue reported to this Jira as well – in case that (with huge row-data) gets fixed as well. I'll update you soon.

Comment by Lawrin Novitsky [ 2017-05-19 ]

I am closing it by now. I want to make a release and have this issue on the release notes.
Btw, I guess we can also close "truncate" issue. I could not re-create it in "normal" conditions, so I guess it's just a duplicate of this one.

Comment by Rahul Dhuvad [ 2017-05-20 ]

We were facing few related issues with the same error (either "Lost connection" or "MySQL server has gone away"):
1. Table having few rows (12) but with huge data in its LOB columns (1.7GB dataset size).
2. Truncate table operation seeing this error at our customer site.
3. Table having millions of rows and normal data in its columns.

We found that with the latest patch you provided (2.0.15), first two issue seems resolved. However, our QA reported 3rd issue still exists; unfortunately enabling ODBC trace do not hit the issue. In summary, the provided patch certainly seems improving the quality for us; but looks like still some cases where issue being observed. I'll probably open another Jira to track that with more details on that. Thank you for the current fix!

Comment by Lawrin Novitsky [ 2017-05-21 ]

This particular bug report is about running for a VM problems and getting disconnected quite randomly. When I could recreate it, I was getting connection lost at least at 3 different places of the same short test-cate. This problem I believe is solved now. The problem with millions rows probably has some different nature are conditions. And thus it's a good idea to have a separate report for it.
I would imagine that TRUNCATE bug is a duplicate of this one. I could not repeat it in "normal" conditions. Thus I guess it is also related to running from a VM. Could you please verify that? So far I am gonna close that other bug as a duplicate.

Comment by Rahul Dhuvad [ 2017-05-22 ]

Sounds good. Truncate bug is something seen consistently with our customers; for us it's 20% chance to see issue during truncate operation; remaining 80% is doing other select/insert operation of the application. We are trying to see if your private patch can be provided to the customer to try out the fix.

Comment by Lawrin Novitsky [ 2017-05-22 ]

If nothing goes wrong, public release will follow soon

Comment by Tomas [ 2017-08-22 ]

Which commit fixes the issue?

Comment by Lawrin Novitsky [ 2017-08-22 ]

The problems & the fix were in Connector/C. Do you still need commit id for it?

Comment by Tomas [ 2017-08-22 ]

I see. All good, thanks, I was wondering, since I couldn't see anything related to that in mariadb-connector-odbc repo

Generated at Thu Feb 08 03:25:59 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.