Uploaded image for project: 'MariaDB Connector/J'
  1. MariaDB Connector/J
  2. CONJ-26

[Feature Request] Implement configurable fetch size and fetch direction for Statement/ResultSet

Details

    • Epic
    • Status: Closed (View Workflow)
    • Minor
    • Resolution: Fixed
    • 1.1.0
    • 1.4.0
    • Other
    • None
    • Implement multiple row fetching
    • Sprint connector/j 1.3.0

    Description

      As discussed earlier, currently fetch size is "one or all". It would be good to implement it fully.
      Also, setFetchDirection for ResultSet and Statement are currently just stubs.

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova created issue -
            elenst Elena Stepanova made changes -
            Field Original Value New Value
            wlad Vladislav Vaintroub made changes -
            Fix Version/s jdbc-1.1.2 [ 12801 ]
            Fix Version/s jdbc-1.1.1 [ 12500 ]

            I think fetch direction can safely be ignored. after looking again at spec. this is a hint to the driver, which has no visible effects during runtime (i.e ResultSet.next() will still move forward, and previous() will move backward). We cannot use this hint - to move backwards with ResultSet.previous(), we have to read and cache the whole result, no way around it.

            wlad Vladislav Vaintroub added a comment - I think fetch direction can safely be ignored. after looking again at spec. this is a hint to the driver, which has no visible effects during runtime (i.e ResultSet.next() will still move forward, and previous() will move backward). We cannot use this hint - to move backwards with ResultSet.previous(), we have to read and cache the whole result, no way around it.
            elenst Elena Stepanova made changes -
            Assignee Vladislav Vaintroub [ wlad ] Georg Richter [ georg ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Workflow defaullt [ 26218 ] MariaDB v2 [ 47813 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Workflow MariaDB v2 [ 47813 ] MariaDB connectors [ 54896 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Workflow MariaDB connectors [ 54896 ] MariaDB v3 [ 70154 ]
            diego dupin Diego Dupin made changes -
            Assignee Georg Richter [ georg ] diego dupin [ diego dupin ]
            diego dupin Diego Dupin made changes -
            Fix Version/s 1.2.1 [ 19602 ]
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 1.1.2 [ 12801 ]
            diego dupin Diego Dupin made changes -
            julien.fritsch Julien Fritsch made changes -
            julien.fritsch Julien Fritsch made changes -
            julien.fritsch Julien Fritsch made changes -
            Issue Type Task [ 3 ] Epic [ 5 ]
            julien.fritsch Julien Fritsch made changes -
            Epic Child CONJ-138 [ 50119 ]
            diego dupin Diego Dupin made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            diego dupin Diego Dupin made changes -
            Epic Name Implement multiple row fetching
            Sprint Sprint 1 [ 11 ]
            diego dupin Diego Dupin made changes -
            Status In Progress [ 3 ] Stalled [ 10000 ]
            diego dupin Diego Dupin made changes -
            Fix Version/s 1.4.0 [ 19606 ]
            Fix Version/s 1.3.0 [ 19602 ]
            bazzip Paolo Bazzi added a comment -

            +1 for solving this issue....
            We run into a similiar problem while reading a large set of data from a MariaDB and were forced to switch to the streamed mode, since we ran into OutOfMemory exceptions when fetching all records at once (~8m records). Other JDBC drivers (like Oracle) only fetch the records, when iterating over the result set and therefore require a lot less of memory if the records are processed in a loop and then discarded.
            It would be great to support a configurable fetch size instead of force the user to decide wether to read all or nothing.

            bazzip Paolo Bazzi added a comment - +1 for solving this issue.... We run into a similiar problem while reading a large set of data from a MariaDB and were forced to switch to the streamed mode, since we ran into OutOfMemory exceptions when fetching all records at once (~8m records). Other JDBC drivers (like Oracle) only fetch the records, when iterating over the result set and therefore require a lot less of memory if the records are processed in a loop and then discarded. It would be great to support a configurable fetch size instead of force the user to decide wether to read all or nothing.
            diego dupin Diego Dupin added a comment -

            Some good news Paolo : that's in the roadmap for next version 1.4.0.

            diego dupin Diego Dupin added a comment - Some good news Paolo : that's in the roadmap for next version 1.4.0.
            wlad Vladislav Vaintroub added a comment - - edited

            bazzip ,alas, is no principal difference between streamed mode and configurable fetch size. streaming mode requires least memory though
            Are you unhappy to be "forced" into this mode?

            wlad Vladislav Vaintroub added a comment - - edited bazzip ,alas, is no principal difference between streamed mode and configurable fetch size. streaming mode requires least memory though Are you unhappy to be "forced" into this mode?
            bazzip Paolo Bazzi added a comment -

            @Diego very nice to hear!

            @Vladislav
            Two issues with the streamed mode

            • We use shared java code which is executed on both MariaDB and Oracle databases (some kind of data replication). The fetch size is set for each statement according to business logic and expected statement result size. With this setup we run into OutOfMemoryException problems with large data sets and the MariaDB JDBC driver, since the driver tried to load all data into memory. We were forced to implement a "if oracle then use fetchSize else use Integer.MIN_VALUE" Hack to solve the problem
            • I would expect a performance gain using an adequate fetch size instead of the streaming mode, which requires a JDBC driver <-> database server network round trip for each fetched result row
            bazzip Paolo Bazzi added a comment - @Diego very nice to hear! @Vladislav Two issues with the streamed mode We use shared java code which is executed on both MariaDB and Oracle databases (some kind of data replication). The fetch size is set for each statement according to business logic and expected statement result size. With this setup we run into OutOfMemoryException problems with large data sets and the MariaDB JDBC driver, since the driver tried to load all data into memory. We were forced to implement a "if oracle then use fetchSize else use Integer.MIN_VALUE" Hack to solve the problem I would expect a performance gain using an adequate fetch size instead of the streaming mode, which requires a JDBC driver <-> database server network round trip for each fetched result row

            I agree on portability, but I doubt you will gain any performance

            the driver does exactly the same amount of network reads, and the server the same amount of writes.
            The server writes whole result set, the client reads the whole result set, Oracle may and actually does perform very differently.

            wlad Vladislav Vaintroub added a comment - I agree on portability, but I doubt you will gain any performance the driver does exactly the same amount of network reads, and the server the same amount of writes. The server writes whole result set, the client reads the whole result set, Oracle may and actually does perform very differently.
            diego dupin Diego Dupin made changes -
            Fix Version/s 1.5.0 [ 19607 ]
            Fix Version/s 1.4.0 [ 19606 ]
            diego dupin Diego Dupin made changes -
            Fix Version/s 1.4.0 [ 19606 ]
            Fix Version/s 1.5.0 [ 19607 ]
            diego dupin Diego Dupin added a comment -

            This is now implemented on version 1.4.0.

            Like Vladislav say, since all datas have to be read, performance doesn't change a lot,

            JMH results (source https://codeshare.io/OlBRO) when streaming 100,000 rows

            Bench.fetchSizeBy1000 : 50.274 ± 0.206 ms/op (read with fetch size 1000)
            Bench.fetchSizeAll : 50.593 ± 0.252 ms/op (read all data)
            Bench.fetchSizeOneByOne : 51.641 ± 0.299 ms/op (fetch one by one)

            No big difference, but avoiding to create a big buffer permit to gain a small 1%, ( and avoid loading all in memory)

            diego dupin Diego Dupin added a comment - This is now implemented on version 1.4.0. Like Vladislav say, since all datas have to be read, performance doesn't change a lot, JMH results (source https://codeshare.io/OlBRO ) when streaming 100,000 rows Bench.fetchSizeBy1000 : 50.274 ± 0.206 ms/op (read with fetch size 1000) Bench.fetchSizeAll : 50.593 ± 0.252 ms/op (read all data) Bench.fetchSizeOneByOne : 51.641 ± 0.299 ms/op (fetch one by one) No big difference, but avoiding to create a big buffer permit to gain a small 1%, ( and avoid loading all in memory)
            diego dupin Diego Dupin made changes -
            Component/s Other [ 12201 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 70154 ] MariaDB v4 [ 134672 ]

            People

              diego dupin Diego Dupin
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.