[CONJ-419] Insert Query failed after failover and not able to get master connection Created: 2017-01-30 Updated: 2017-11-21 Resolved: 2017-11-21 |
|
| Status: | Closed |
| Project: | MariaDB Connector/J |
| Component/s: | Failover |
| Affects Version/s: | 1.5.6 |
| Fix Version/s: | 2.1.0, 1.6.3 |
| Type: | Bug | Priority: | Major |
| Reporter: | Seonmi Anderson | Assignee: | Diego Dupin |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None | ||
| Environment: |
AWS Aurora |
||
| Issue Links: |
|
||||||||
| Description |
|
Inserting records failed after failover and throws an exception as following; Any exception starting with '08xxx' or '25Sxxx' handled and reset connection. Error code '1290' is treated general SQL exception and eventually application instance should be restarted to fix it. |
| Comments |
| Comment by Diego Dupin [ 2017-01-31 ] | ||||||||||||||||||||||||
|
Hi Seonmi, | ||||||||||||||||||||||||
| Comment by Seonmi Anderson [ 2017-02-01 ] | ||||||||||||||||||||||||
|
Thanks for looking into the issue. | ||||||||||||||||||||||||
| Comment by Diego Dupin [ 2017-02-09 ] | ||||||||||||||||||||||||
|
After failover code review, the only possibility i see for now seems to be aurora master having global/session variable "read_only = ON;", but that require SUPER priviledge. That part will be included in 1.5.8, and when failover error occur, now current instance name will be part of the exception, permitting better analysis if this occur again. | ||||||||||||||||||||||||
| Comment by Paul B Johnson [ 2017-02-16 ] | ||||||||||||||||||||||||
|
I am no DB expert, but I believe I am seeing the same issue. I have a service which is using Spring Data JPA/Hibernate with mariaDB driver to access AWS Aurora DB. I am including the "aurora" keyword in the JDBC connection string. That is, the connection string is "jdbc:mariadb:aurora://<path-to-aws-aurora-cluster>" Everything works fine until I execute a manually-invoked (via AWS console) failover. After a short time, I get the error described above: WARN o.h.engine.jdbc.spi.SqlExceptionHelper - SQL Error: 1290, SQLState: HY000 It seems that the appearance of this error is triggered by some DB activity (the error does not always appear immediately after failover). But I have not yet been able to isolate the trigger. Once it starts to fail, all subsequent DB writes fail with this error. I saw this with mariadb-java-client-1.5.5. I upgraded to 1.5.8 and see the same thing. I do not see where "read only" is set on the DB, but, again, I am no expert. I can attach to cluster with HeidiSQL and read/write without error, so the DB itself is not read-only. Note that failover back to original instance does not fix the problem. Restarting my service is only way I have found so far to get things working again once they start failing. Any suggestions are welcome. Thanks, | ||||||||||||||||||||||||
| Comment by Paul B Johnson [ 2017-02-16 ] | ||||||||||||||||||||||||
|
Just to be super-clear on this: my mariadb driver is configured to talk to the AWS Aurora cluster. The failover is happening in AWS. I was under the impression that this sort of failover would be essentially transparent to the driver. So, I'm confused now. I'm wondering if this could be a bug in AWS clustering?? Do you have any thoughts? | ||||||||||||||||||||||||
| Comment by Seonmi Anderson [ 2017-02-17 ] | ||||||||||||||||||||||||
|
Diego, I tested 1.5.8 yesterday and couldn't be able to resolve the issue. As Paul stated, SQL error continues until ec2 instance is restarted. | ||||||||||||||||||||||||
| Comment by Paul B Johnson [ 2017-02-17 ] | ||||||||||||||||||||||||
|
I have spent more hours today testing. Results are essentially the same. The only new data point is that I tried specifying the DB instance endpoints (instead of the cluster endpoint) in the jdbc: string. Same behavior was observed. It seems that the issue may be triggered by updating a row in the DB after the failover. The impression is that the original (originally writable, now read-only (as set by AWS)) master instance is cached somewhere with the data read from the row and when i go to update the row it tries to write back to that original instance and fails (this is just my impression). | ||||||||||||||||||||||||
| Comment by Paul B Johnson [ 2017-03-06 ] | ||||||||||||||||||||||||
|
Still battling this issue with no progress to report. I did upgrade to tomcat connection pool with hope that this would help, but the failover behavior is the same. Does anyone else have any progress they can report? Diego: is there a preferred connection pool implementation for the mariadb driver? | ||||||||||||||||||||||||
| Comment by Seonmi Anderson [ 2017-03-06 ] | ||||||||||||||||||||||||
|
Paul, You may try dbcp2 pool instead of tomcat pool. Tomcat pool is preferred by Spring. Hikari, dbcp, and dbcp2 can be used as well. Thanks, | ||||||||||||||||||||||||
| Comment by Diego Dupin [ 2017-03-08 ] | ||||||||||||||||||||||||
|
There is not "prefered" pool for driver. Lot of people use Hikari, dbcp, and dbcp2 or tomcat pool. | ||||||||||||||||||||||||
| Comment by Paul B Johnson [ 2017-03-17 ] | ||||||||||||||||||||||||
|
Good news, I think. I believe I have figured out that, for me, this issue is triggered by calling the JPA CrudRepository "findOne()" API (http://docs.spring.io/spring-data/commons/docs/current/api/org/springframework/data/repository/CrudRepository.html?is-external=true#findOne-ID-). After replacing this call with actual queries on the primary ID, the issue seems to have gone away. Diego: does this make any sense? Is there a reason that using findOne() could result in writes going to the wrong DB instance? | ||||||||||||||||||||||||
| Comment by Diego Dupin [ 2017-03-21 ] | ||||||||||||||||||||||||
|
Spring data reading methods like findAll() and findOne(…) are using @Transactional(readOnly = true). That mean using replicates. That's typically to avoid using master server too much. pbj To be sure of the issue : your application use findOne with default @Transactional(readOnly = true) and then, triggering a failover, you have an exception. What is this exception ? (must not be "The MySQL server is running with the --read-only" since that error will be thrown for write operation). | ||||||||||||||||||||||||
| Comment by Paul B Johnson [ 2017-03-21 ] | ||||||||||||||||||||||||
|
The normal sequence is: [Maybe I should not be using findOne() in this situation, but it works in this normal (non-failover) case.] The failover sequence is: The only way I have found to recover at this point is to restart my service. Paul | ||||||||||||||||||||||||
| Comment by Diego Dupin [ 2017-04-03 ] | ||||||||||||||||||||||||
|
Hi pbj This is the normal use of spring data (findOne, update, then save). Didn't you set option "failOnReadOnly" to true, that would lead to that ? If your application use at least java 8, could you reproduced this issue with log enable (profileSql=true) and version 2.0.0-SNAPSHOT ?
with "(M)" indicating master connection and "(S)" slave 2.0.0-SNAPSHOT is available using :
| ||||||||||||||||||||||||
| Comment by Paul B Johnson [ 2017-04-05 ] | ||||||||||||||||||||||||
|
Diego, I have emailed to you the requested log file. Note that I can only reproduce this problem as part of our standard automation test suite. So at this point I suspect there is a very particular sequence of events that results in this problem. Hopefully the log file will help you out. I did not explicitly set "failOnReadOnly" (or "failOverReadOnly" for that matter). Paul | ||||||||||||||||||||||||
| Comment by Diego Dupin [ 2017-04-05 ] | ||||||||||||||||||||||||
|
Hi Paul, Thanks for the logs, i'll check them in detail tomorrow now. I hope to bring good news next time. Diego. | ||||||||||||||||||||||||
| Comment by Paul B Johnson [ 2017-04-06 ] | ||||||||||||||||||||||||
|
Note that since I have found and implemented the work-around of not calling "findOne()", our failover tests all pass. Thus the issue has become less of a priority, and more "academic" in nature. We just hope that there are not similar, but non-resolved, issues lurking. | ||||||||||||||||||||||||
| Comment by Diego Dupin [ 2017-04-10 ] | ||||||||||||||||||||||||
|
See spring-data documentation about @Transactional. Methods that will insert/update database will have to be marked as @Transactional ( = readOnly=false). find* methods, that have implicit @Transactional(readOnly = true), Spring-data will call driver Connection.setReadOnly(true) methods. When using masters/slaves configuration (like aurora) query will be launched on a slave(replica) server. Adding "@Transactional(readOnly = false)" or just "@Transactional" will solve the issue. | ||||||||||||||||||||||||
| Comment by Diego Dupin [ 2017-11-21 ] | ||||||||||||||||||||||||
|
Description correspond to |