Uploaded image for project: 'MariaDB Connector/J'
  1. MariaDB Connector/J
  2. CONJ-115

Big (> 512 MiB) LOAD DATA LOCAL INFILE cannot be sent to server

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.1.7
    • Fix Version/s: 1.1.8
    • Component/s: None
    • Labels:
      None

      Description

      How to reproduce:

      1. Have a relative fast computer. I have an Intel 3770S (3,1GHz) and an SSD drive.
      2. Run the attached test case GiganticLoadDataInfileTest.java

      Psedo code:

      1. Create a table with two columns: id and name.
      2. Create a LOAD DATA INFILE file that is gigantic. I used 30 million rows and a total byte size of 637777795 bytes.
      3. Run a LOAD DATA LOCAL INFILE query to insert the data into the table

      Actual:
      After about 58 seconds about 25402152 rows are inserted into the table and you get this exception:

      java.sql.SQLNonTransientConnectionException: Could not read resultset: MySQL protocol limit reached, you cannot send more than 4GB of data
      	at org.mariadb.jdbc.internal.SQLExceptionMapper.get(SQLExceptionMapper.java:136)
      	at org.mariadb.jdbc.internal.SQLExceptionMapper.throwException(SQLExceptionMapper.java:106)
      	at org.mariadb.jdbc.MySQLStatement.executeQueryEpilog(MySQLStatement.java:264)
      	at org.mariadb.jdbc.MySQLStatement.execute(MySQLStatement.java:291)
      	at org.mariadb.jdbc.MySQLStatement.execute(MySQLStatement.java:341)
      	at org.mariadb.jdbc.GiganticLoadDataInfileTest.giganticLoadDataInfileTest(GiganticLoadDataInfileTest.java:52)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:601)
      	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
      	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
      	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
      	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
      	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
      	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:44)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:180)
      	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:41)
      	at org.junit.runners.ParentRunner$1.evaluate(ParentRunner.java:173)
      	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
      	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
      	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
      	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
      	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
      	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
      	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
      	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
      Caused by: org.mariadb.jdbc.internal.common.QueryException: Could not read resultset: MySQL protocol limit reached, you cannot send more than 4GB of data
      	at org.mariadb.jdbc.internal.mysql.MySQLProtocol.getResult(MySQLProtocol.java:930)
      	at org.mariadb.jdbc.internal.mysql.MySQLProtocol.executeQuery(MySQLProtocol.java:995)
      	at org.mariadb.jdbc.MySQLStatement.execute(MySQLStatement.java:284)
      	... 26 more
      Caused by: java.io.IOException: MySQL protocol limit reached, you cannot send more than 4GB of data
      	at org.mariadb.jdbc.internal.common.packet.PacketOutputStream.write(PacketOutputStream.java:84)
      	at org.mariadb.jdbc.internal.common.packet.PacketOutputStream.sendFile(PacketOutputStream.java:63)
      	at org.mariadb.jdbc.internal.mysql.MySQLProtocol.getResult(MySQLProtocol.java:923)
      	... 28 more

      Expected:
      The test case should pass and there should be 30 million rows in the table.

      This test case works with Mysql JDBC (I tested with version 5.1.26). It took about 67 seconds to execute.

      It should be possible to send up more data to the server (I'm not sure about the exact limit, maybe 1 GiB). This LOAD DATA INFILE was only 637777795 bytes (about 608 MiB).

      Analysis:
      I’m probably part guilty to this error. I sent a patch (via e-mail to Anders Karlsson) that caused the commit http://bazaar.launchpad.net/~maria-captains/mariadb-java-client/trunk/revision/505. I also pointed to a way to fix for LOAD DATA INFILE (http://bazaar.launchpad.net/~maria-captains/mariadb-java-client/trunk/revision/506) without testing it for the extreme cases.

      To understand this problem you must first understand a Mysql packet: http://dev.mysql.com/doc/internals/en/mysql-packet.html.
      The payload length field has the data type “type 3”: 3 byte Protocol::FixedLengthInteger
      There is a sequence id/number that has the data type “type 1”:1 byte Protocol::FixedLengthInteger.

      If more than 16 MB of data is to be sent the packet is to be split in 16MB chunks: http://dev.mysql.com/doc/internals/en/sending-more-than-16mbyte.html. The Sequence ID should be incremented for each chunk.

      The documentation is not very clear about a maximum size of a query. One interpretation is that the largest query that can be sent is 16 MiB * 256 = 4096 MiB = 4 GiB. That is, the query is split into 256 smaller chunks, each of 16 MiB in size. But since http://dev.mysql.com/doc/internals/en/sequence-id.html states “and may wrap around” I’m not sure. There also the page http://dev.mysql.com/doc/refman/5.5/en/packet-too-large.html that states “The largest possible packet that can be transmitted to or from a MySQL 5.5 server or client is 1GB.”

      From what I can see in the code (http://bazaar.launchpad.net/~maria-captains/mariadb-java-client/trunk/view/head:/src/main/java/org/mariadb/jdbc/internal/common/packet/PacketOutputStream.java) the method sendFile() creates a buffer of size 8192 bytes. Each 8 KiB chunk is causing a write and a flush, thus increasing the seqNo attribute. When seqNo is 0xffff the Exception “MySQL protocol limit reached, you cannot send more than 4GB of data” is thrown. In total 8 KiB * 0xffff = 536870912 bytes = 512 GiB has been sent.

      The commit http://bazaar.launchpad.net/~maria-captains/mariadb-java-client/trunk/revision/505 changed the max seqNo from 256 to 65536. I’m not sure if that is correct or not. However only the least significant 8 bits are sent in the packet since it is truncated at row 121:

      byteBuffer[SEQNO_OFFSET] = (byte)seqNo;

      My small tests shows that the server has no problem handling the wrapping of Sequence ID.

      Suggested Solution:
      A naive solution to this is to change the buffer in sendFile() to 16MiB. But that does not work very well if the max_allowed_packet my.cnf variable is set lower. I tried to increase max_allowed_packet to 128M but even then got problem.

      It would be possible get the max_allowed_packet from the session and/or compare with a preset value (for example 1 MiB).

      Other solutions would be to:
      Increase MAX_SEQNO from 65536 to 524288 (4 GiB/8192)
      Increase MAX_SEQNO from 65536 to 131072 (1 GiB/8192).

      From a performance view a slightly larger buffer could be good for larger network ping distances.

        Attachments

          Activity

            People

            Assignee:
            massimo.siani Massimo Siani (Inactive)
            Reporter:
            lennartschedin Lennart Schedin
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Dates

              Created:
              Updated:
              Resolved:

                Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.