[CONJ-715] OOM happening when querying large datasets Created: 2019-06-27 Updated: 2022-02-15 Resolved: 2019-07-23 |
|
| Status: | Closed |
| Project: | MariaDB Connector/J |
| Component/s: | performance |
| Affects Version/s: | 2.2.1 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Arun Chaitanya Miriappalli | Assignee: | Diego Dupin |
| Resolution: | Not a Bug | Votes: | 1 |
| Labels: | None | ||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Description |
|
I am using the following
I am currently using fetchsize of 1000 in JdbcTemplate. For large size tables, when using select queries, I am experiencing OOM. I could confirm that Spring is able to propagate the fetchSize to the driver. When I started debugging, I found the following observation in SelectResultSet in the driver code. When fetchSize > 0, the execution flow proceeds like
So essentially, SelectResultSet data variable keeps growing when next is called on the ResultSet. So OOM is ought to happen eventually. BTW, I am using the default TYPE_FORWARD_ONLY scroll type. Could anyone confirm if my understanding is correct? Is there a way to query large datasets without OOM? |
| Comments |
| Comment by Diego Dupin [ 2019-07-08 ] |
|
>So essentially, SelectResultSet data variable keeps growing when next is called on the ResultSet. So OOM is ought to happen eventually To explain by code : |
| Comment by Arun Chaitanya Miriappalli [ 2019-07-10 ] |
|
> If rows are huge, this can cause OOM. Then better set fetchsize to a lower value. > data maximum size will be limited to fetchsize. What I observed from our heapdump is dataSize > fetchSize. If my understanding is correct, the dataSize should never exceed fetchSize. Isn't it? |
| Comment by Arun Chaitanya Miriappalli [ 2019-07-23 ] |
|
I realized my mistake. This is happening because a new query is executed on the same connection/transaction which forced the previous resultset to be fully loaded in memory. I had to create a new transaction to run another query. Now I can iterate both the resultsets without causing OOM. It would be great if it were clearly mentioned clearly in fetchsize documentation. Otherwise, this issue can be closed. |
| Comment by Diego Dupin [ 2019-07-23 ] |
|
Alright. I didn't think of that case. I've added some documentation about this case and closing ticket |
| Comment by Cyril Joui [ 2022-02-15 ] |
|
Hello I'm getting the same issue with Kafka Connect JDBC Source. We have a large table to read with connector and we are facing OOM. // load active result if any so buffer are clean for next query Do you have any idea why calling isValid method has this impact ? STACK : |
| Comment by Diego Dupin [ 2022-02-15 ] |
|
This is expected : When streaming data, data are read row per row (or to be precise by bunch corresponding to fetch size). See the red paragraph in https://mariadb.com/kb/en/about-mariadb-connector-j/#streaming-result-sets :
The problem is if the same connection is used when streaming, to execute another command ( a Connection.isValid() in your case ), the connector will force the loading of streaming resultset, to read the new command result. The forced loading of streaming resultset might cause this OOM error. |
| Comment by Cyril Joui [ 2022-02-15 ] |
|
Thank you for your answer and documentation details. I'm also discussing with kafka jdbc connector to find the best fix. |