Uploaded image for project: 'MariaDB Connector/J'
  1. MariaDB Connector/J
  2. CONJ-769

Aurora reader instances silently blacklisted when argument createDatabaseIfNotExist used

    XMLWordPrintable

Details

    • Bug
    • Status: Closed (View Workflow)
    • Minor
    • Resolution: Duplicate
    • 2.5.4
    • 3.0.8
    • aurora, Failover
    • None
    • Spring Framework 4
      AWS MySQL Aurora 1.17.8

    Description

      We recently replaced the MySQL Connector/J library with the MariaDB Connector/J 2.4.4 because we wanted to benefit from its load balancing and failover capabilities for AWS Aurora. As soon as we replaced it, the error log from our AWS MySQL Aurora instance was flodded with the following error:

      2020-02-19 12:40:26 25570 [Warning] Aborted connection 1279858 to db: 'unconnected' user: 'DB_USER' host: '12.13.14.15' (Got an error reading communication packets)
      

      We never got any load on the reader instance and did not see any other errors in our application logs.

      It was strange that the error logs all showed `unconnected` as the database name, which obviously does not exist. We first thought that this might be a timeout issue, but the error logs were filled periodically with hundreds of these errors, independently of the actual database load. We were aware that the MariaDB driver has a 10 second default timeout, but we have hardly any queries which would run that long.

      We wrote a small test program with the same settings as in our production system and could then see the load balancing was indeed working and we also could provoke the similar error by running queries which take longer than 10 seconds, for instance with a simple SLEEP(11). In that case the database we connected to was contained in the error, which was different from the real system where we saw the `unconnected` database. In that case we got a proper logging message from the Driver where it stated that the query was replayed on the master/writer.

      We then debugged our application and the MariaDB driver in our identically configured test environment. We soon realized that the reader instance is immediately blacklisted and thus not used at all. When the connection is obtained, an exception is silently thrown that contains the hint we needed to apply a quite simple fix. The error is thrown in the class AbstractConnectionProtocol in line 347.

      SQL State: HY000
      Vendor code: 1290
       
      java.sql.SQLException: The MySQL server is running with the --read-only option so it cannot execute this statement
       
      Caught exception: java.sql.SQLNonTransientConnectionException: Could not connect to HostAddress{host='READER_INSTANCE.rds.amazonaws.com', port=3306}. The MySQL server is running with the --read-only option so it cannot execute this statement
      

      Since all the queries should only be read-only, it was suspicious that the queries were detected as non-read-only. A breakpoint at the exception which was thrown then showed the culprit:

      CREATE DATABASE IF NOT EXISTS `our-database-name`;
      

      The problem we had was that we had the argument `createDatabaseIfNotExist=true` in our JDBC URL because for the tests we wanted to ensure that the database exists before we load the test or initial data. We used the JDBC URL with this parameter for many years without noticing the side effect. We have not been aware that the driver would send the CREATE DATABASE statement during all connection initializations. Since the CREATE DATABASE statement is obviously not read-only, the MySQL server responds with an error and the connection to the reader gets blacklisted.

      The exception is thrown but not logged, because it is handled within a conditional block that does not apply to that error case. For this reason the exception goes without notice and does not show up in any logs. The connection is still created but not properly closed, which is why the MySQL server detects an aborted connection to an undefined database.

      We think that the exception that causes a reader instance to be blacklisted should be visible in the logs because then it would be more transparent why a certain instance was removed. I linked the issue CONJ-577 because it is very likely a duplicate.

      Removing the

      createDatabaseIfNotExist=true
      

      argument immediately fixed the problem and made the reader instance available for the load balancer. It could also be mentioned in the Aurora specific documentation for the driver.
      I am happy provide a small pull request for updating the docs.

      Attachments

        Issue Links

          Activity

            People

              diego dupin Diego Dupin
              stefan.proell Stefan Proell
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.