[CONJ-125] Optimize cached ResultSet memory footprint Created: 2014-12-03  Updated: 2016-03-31  Resolved: 2016-03-31

Status: Closed
Project: MariaDB Connector/J
Component/s: Other
Affects Version/s: 1.1.7
Fix Version/s: 1.4.0

Type: Task Priority: Minor
Reporter: Lennart Schedin Assignee: Diego Dupin
Resolution: Fixed Votes: 1
Labels: None

Attachments: Java Source File BigResultSetMemoryOptimization.java    
Issue Links:
Duplicate
is duplicated by CONJ-26 [Feature Request] Implement configura... Closed
is duplicated by CONJ-173 [Feature Request] Implement a mechani... Closed
Sprint: Sprint connector/j 1.3.0

 Description   

How to reproduce:

  1. Download the attached test case BigResultSetMemoryOptimization.java
  2. Use Oracle Java (perhaps the JDK is also required; I'm not sure)
  3. Run the test case (for example in Eclipse) as is to get a heap dump for 100K rows of ResultSet using MariaDB JDBC
  4. Edit the test case to use MySQL JDBC (see instructions in the comment in the file)
  5. Run the test case to get a heap dump using MySQL JDBC
  6. Open the two heap dumps side by side in Eclipse Memory Analyzer https://eclipse.org/mat/ (I recommend downloading the standalone version).
  7. Run the Leak Suspects Report to get values for the “Retained Heap” column
  8. Open Histogram view (the second icon).
  9. Sort on the “Retained Heap” column.
  10. Select the row that represents the main Java object holding the memory. In MariaDB it is org.mariadb.jdbc.internal.common.queryresults.CachedSelectResult; in MySQL it is com.mysql.jdbc.JDBC4ResultSet.
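For steps 3 and 5, the heap dump the attached test case produces can also be written programmatically. A minimal sketch, assuming a HotSpot JVM (the class and file names here are placeholders, not taken from BigResultSetMemoryOptimization.java):

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;

public class HeapDumpExample {
    // Writes an .hprof file that Eclipse MAT can open.
    // liveOnly=true dumps only reachable objects, which is what the
    // "Retained Heap" comparison above is based on.
    static void dumpHeap(String path, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean mx = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        // dumpHeap fails if the target file already exists.
        mx.dumpHeap(path, liveOnly);
    }
}
```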

Actual:
MariaDB is using 41 226 960 bytes distributed over about 1.5 M objects
MySQL is using 26 829 400 bytes distributed over about 0.9 M objects

In other words: MariaDB uses 53 % more memory and 67 % more objects for this specific table pattern.

Expected:
For MariaDB JDBC to be a drop-in replacement for MySQL, I would expect the memory footprint (in terms of both bytes and objects) to be about equivalent. Would it be possible to optimize this?

I have run into memory problems with this exact table structure, but with 5.7 M rows. I know about the streaming ResultSet feature, and it might be a workaround for me in this case, but so far I have not gotten it to work, perhaps due to a bug in either my code or in MariaDB JDBC. If I find a bug in the MariaDB ResultSet streaming feature, I will file a separate ticket for it.



 Comments   
Comment by Massimo Siani (Inactive) [ 2014-12-10 ]

Hi Lennart,
not sure whether you already tried, but adding

statement.setFetchSize(Integer.MIN_VALUE);

decreases the memory need by a factor of 100 (it uses the streaming ResultSet feature). It may not solve your problem, though.
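For context, a sketch of where that call fits in a typical JDBC flow. The helper below is hypothetical (not from the driver); passing Integer.MIN_VALUE to setFetchSize() is the signal both Connector/J drivers use to stream rows instead of caching the whole result set:

```java
import java.sql.SQLException;
import java.sql.Statement;

public class StreamingExample {
    // Enable row streaming on a MySQL/MariaDB Statement.
    static void enableStreaming(Statement st) throws SQLException {
        st.setFetchSize(Integer.MIN_VALUE);
    }
    // Typical usage (connection and query are placeholders):
    //   try (Statement st = connection.createStatement()) {
    //       enableStreaming(st);
    //       try (ResultSet rs = st.executeQuery("SELECT * FROM big_table")) {
    //           while (rs.next()) { /* process one row at a time */ }
    //       }
    //   }
}
```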

In any case, I'm investigating how to decrease the memory needed for CachedResultSet objects. As far as I have been able to understand, the problem is due to the large number of MySQLValueObject instances. I think it's worth spending some time to see whether we can avoid such objects.

Any comment is very welcome!

Comment by Lennart Schedin [ 2014-12-11 ]

Yes, I know about the ResultSet streaming feature (setFetchSize(Integer.MIN_VALUE)), but I did not get it to work (because of either a bug in my application code or in MariaDB JDBC). I will file a ticket if I find a specific bug in MariaDB JDBC.

For the specific problem with CachedResultSet I currently don't have an implementation proposal, only a high-level idea: cache only the binary data received for each row. Then, as the application calls resultSet.next(), each binary row is converted into the MySQLValueObject data structure, and each subsequent call to next() can overwrite the previous one. With this idea, only one row at a time would be “inflated” into Java objects; all the other rows would remain as byte arrays. An alternative is to inflate a batch of 1, 5, 10, 100, or 1000 rows at a time.
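The idea above can be sketched as a small data structure. This is purely illustrative (a toy tab-separated row format, not the MySQL wire protocol or the actual Connector/J classes): raw bytes are cached per row, and only the current row is inflated when next() is called, overwriting the previous row's objects.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class LazyRowCache {
    private final List<byte[]> rawRows = new ArrayList<>();
    private int cursor = -1;
    private String[] currentRow;  // only one inflated row at a time

    // Cache a row in its raw binary form, as received from the server.
    void addRawRow(byte[] row) {
        rawRows.add(row);
    }

    // Advance the cursor and inflate just that row, releasing the
    // previous row's value objects for garbage collection.
    boolean next() {
        if (cursor + 1 >= rawRows.size()) {
            return false;
        }
        cursor++;
        // Toy decoding: tab-separated UTF-8 columns.
        currentRow = new String(rawRows.get(cursor), StandardCharsets.UTF_8).split("\t");
        return true;
    }

    // 1-based column index, like JDBC.
    String getString(int column) {
        return currentRow[column - 1];
    }
}
```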

I suspect that this could take some time and will require a large refactoring of the code. I would estimate the risk of fixing this ticket as high (it may cause other bugs).

Comment by Massimo Siani (Inactive) [ 2014-12-11 ]

Could you refactor your code where you can't use the streaming feature, so that we can add a test case for your real scenario?
Would you be interested in that?

Comment by Diego Dupin [ 2016-03-31 ]

Test results now show a ~26 MB footprint with version 1.4.0.

Generated at Thu Feb 08 03:13:21 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.