[CONJ-860] Possible Deadlocks due to AbstractQueryProtocol.setTimeout Created: 2021-02-17  Updated: 2021-05-17  Resolved: 2021-05-17

Status: Closed
Project: MariaDB Connector/J
Component/s: aurora
Affects Version/s: 2.6.0
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: M Z Assignee: Diego Dupin
Resolution: Cannot Reproduce Votes: 0
Labels: None

Attachments: File bgthreads.dmp    

 Description   

Hi,

we are occasionally seeing (roughly once per day) threads get stuck in our Application that don't recover until we restart the app. Looking at the threads via `jstack <pid> ` we see that they seem to be blocked due to `AbstractQueryProtocol.setTimeout`.

We use an RDS Aurora Cluster with engine version 5.7.mysql_aurora.2.07.2.

Examples stack traces:

Thread 20226: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=175 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=836 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node, int) @bci=67, line=870 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) @bci=17, line=1199 (Compiled frame)
 - java.util.concurrent.locks.ReentrantLock$NonfairSync.lock() @bci=21, line=209 (Compiled frame)
 - java.util.concurrent.locks.ReentrantLock.lock() @bci=4, line=285 (Compiled frame)
 - org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.setTimeout(int) @bci=4, line=1444 (Compiled frame)
 - sun.reflect.GeneratedMethodAccessor57.invoke(java.lang.Object, java.lang.Object[]) @bci=114 (Compiled frame)
 - sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=6, line=43 (Compiled frame)
 - java.lang.reflect.Method.invoke(java.lang.Object, java.lang.Object[]) @bci=56, line=498 (Compiled frame)
 - org.mariadb.jdbc.internal.failover.impl.MastersSlavesListener.invoke(java.lang.reflect.Method, java.lang.Object[]) @bci=142, line=233 (Compiled frame)
 - org.mariadb.jdbc.internal.failover.FailoverProxy.executeInvocation(java.lang.reflect.Method, java.lang.Object[], boolean) @bci=17, line=301 (Compiled frame)
 - java.lang.String.hashCode() @bci=33, line=1471 (Compiled frame)
 - org.mariadb.jdbc.internal.failover.FailoverProxy.invoke(java.lang.Object, java.lang.reflect.Method, java.lang.Object[]) @bci=15, line=163 (Compiled frame)
 - com.sun.proxy.$Proxy55.setTimeout(int) @bci=19 (Compiled frame)
 - org.mariadb.jdbc.MariaDbConnection.setNetworkTimeout(java.util.concurrent.Executor, int) @bci=72, line=1669 (Compiled frame)
 - com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(java.sql.Connection, long) @bci=15, line=556 (Compiled frame)
 - com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(java.sql.Connection) @bci=6, line=152 (Compiled frame)
 - com.zaxxer.hikari.pool.HikariPool.getConnection(long) @bci=76, line=185 (Compiled frame)
 - com.zaxxer.hikari.pool.HikariPool.getConnection() @bci=5, line=161 (Compiled frame)
 - com.zaxxer.hikari.HikariDataSource.getConnection() @bci=50, line=100 (Compiled frame)
...

Thread 20472: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=175 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=836 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node, int) @bci=67, line=870 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) @bci=17, line=1199 (Compiled frame)
 - java.util.concurrent.locks.ReentrantLock$NonfairSync.lock() @bci=21, line=209 (Compiled frame)
 - java.util.concurrent.locks.ReentrantLock.lock() @bci=4, line=285 (Compiled frame)
 - org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.setTimeout(int) @bci=4, line=1444 (Compiled frame)
 - sun.reflect.GeneratedMethodAccessor57.invoke(java.lang.Object, java.lang.Object[]) @bci=114 (Compiled frame)
 - sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=6, line=43 (Compiled frame)
 - java.lang.reflect.Method.invoke(java.lang.Object, java.lang.Object[]) @bci=56, line=498 (Compiled frame)
 - org.mariadb.jdbc.internal.failover.impl.MastersSlavesListener.invoke(java.lang.reflect.Method, java.lang.Object[]) @bci=142, line=233 (Compiled frame)
 - org.mariadb.jdbc.internal.failover.FailoverProxy.executeInvocation(java.lang.reflect.Method, java.lang.Object[], boolean) @bci=17, line=301 (Compiled frame)
 - java.lang.String.hashCode() @bci=33, line=1471 (Compiled frame)
 - org.mariadb.jdbc.internal.failover.FailoverProxy.invoke(java.lang.Object, java.lang.reflect.Method, java.lang.Object[]) @bci=15, line=163 (Compiled frame)
 - com.sun.proxy.$Proxy55.setTimeout(int) @bci=19 (Compiled frame)
 - org.mariadb.jdbc.MariaDbConnection.setNetworkTimeout(java.util.concurrent.Executor, int) @bci=72, line=1669 (Compiled frame)
 - com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(java.sql.Connection, long) @bci=15, line=556 (Compiled frame)
 - com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(java.sql.Connection) @bci=6, line=152 (Interpreted frame)
 - com.zaxxer.hikari.pool.HikariPool.getConnection(long) @bci=76, line=185 (Interpreted frame)
 - com.zaxxer.hikari.pool.HikariPool.getConnection() @bci=5, line=161 (Interpreted frame)
 - com.zaxxer.hikari.HikariDataSource.getConnection() @bci=50, line=100 (Interpreted frame)
...

We have the following parameters configured:

ImmutableMap.<String, String>builder()
                .put("cachePrepStmts", "true")
                .put("prepStmtCacheSize", "500")
                .put("prepStmtCacheSqlLimit", "2048")
                .put("useServerPrepStmts", "true")
                .put("rewriteBatchedStatements", "true")
                .put("cacheResultSetMetadata", "true")
                .put("cacheServerConfiguration", "true")
                .put("socketTimeout", "60000")
                .put("serverTimezone", "UTC")
                .put("dumpQueriesOnException", "true")
                .put("useCompression", "true")
                .put("log", "true")
                .put("useUnicode", "true")
                .put("connectionCollation", "utf8mb4_unicode_ci")
                .put("characterSetResults", "utf8")
                .put("zeroDateTimeBehavior", "convertToNull")
                .put("enablePacketDebug", "true")
                .build();

Any help would be appreciated.



 Comments   
Comment by Diego Dupin [ 2021-02-17 ]

This lock is there to ensure multi-threading won't create any issue.
Hikari getConnection from pool indicate that this is a connection that has been released to pool.
So assuming that a connection given back to pool is still not used incorrectly, that would mean there is an issue in connector.

Checking code, all use of lock is using finally to ensure unlocking ... but there is 2 places where this unlocking in finally can have unexpected result, because this finally might be killing timeout thread before unlocking.

Do you use statement.setQueryTimeout(int seconds) ?
for mysql and mariadb version before 10.2, a thread will be controlling timeout.
That use + some non standard security manager (like not permitted to cancel a thread) might result in not removing the lock.
this would result in that kind of error.

a remark : it would make a lot more sense to use hikari idleTimeout than socketTimeout at connector level.

Comment by M Z [ 2021-02-18 ]

Thanks for your quick reply.

Do you use statement.setQueryTimeout(int seconds) ?

We do not explicitly set this as we do not work on statements whatsoever. We only use Hibernate 5.4 for queries.

So assuming that a connection given back to pool is still not used incorrectly, that would mean there is an issue in connector.

Does that mean that a connection is not closed properly from our end?

some non standard security manager

We do not set a security manager

a remark : it would make a lot more sense to use hikari idleTimeout than socketTimeout at connector level.

Thanks, I will adjust that

Some more info:
This particular application executes a lot of background tasks that load data from an external source and then upserts that data into our aurora cluster. The alleged deadlocks seem to happen when a transaction is switched from readonly to write mode. When it happened last time I saw at least 7 threads in the thread dump that seemed to be stuck on that ReentrantLock from Mariadb Connector.

One full stack trace with our applicaton's traces anonymized:

Thread 20472: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=175 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=836 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node, int) @bci=67, line=870 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) @bci=17, line=1199 (Compiled frame)
 - java.util.concurrent.locks.ReentrantLock$NonfairSync.lock() @bci=21, line=209 (Compiled frame)
 - java.util.concurrent.locks.ReentrantLock.lock() @bci=4, line=285 (Compiled frame)
 - org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.setTimeout(int) @bci=4, line=1444 (Compiled frame)
 - sun.reflect.GeneratedMethodAccessor57.invoke(java.lang.Object, java.lang.Object[]) @bci=114 (Compiled frame)
 - sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=6, line=43 (Compiled frame)
 - java.lang.reflect.Method.invoke(java.lang.Object, java.lang.Object[]) @bci=56, line=498 (Compiled frame)
 - org.mariadb.jdbc.internal.failover.impl.MastersSlavesListener.invoke(java.lang.reflect.Method, java.lang.Object[]) @bci=142, line=233 (Compiled frame)
 - org.mariadb.jdbc.internal.failover.FailoverProxy.executeInvocation(java.lang.reflect.Method, java.lang.Object[], boolean) @bci=17, line=301 (Compiled frame)
 - java.lang.String.hashCode() @bci=33, line=1471 (Compiled frame)
 - org.mariadb.jdbc.internal.failover.FailoverProxy.invoke(java.lang.Object, java.lang.reflect.Method, java.lang.Object[]) @bci=15, line=163 (Compiled frame)
 - com.sun.proxy.$Proxy55.setTimeout(int) @bci=19 (Compiled frame)
 - org.mariadb.jdbc.MariaDbConnection.setNetworkTimeout(java.util.concurrent.Executor, int) @bci=72, line=1669 (Compiled frame)
 - com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(java.sql.Connection, long) @bci=15, line=556 (Compiled frame)
 - com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(java.sql.Connection) @bci=6, line=152 (Interpreted frame)
 - com.zaxxer.hikari.pool.HikariPool.getConnection(long) @bci=76, line=185 (Interpreted frame)
 - com.zaxxer.hikari.pool.HikariPool.getConnection() @bci=5, line=161 (Interpreted frame)
 - com.zaxxer.hikari.HikariDataSource.getConnection() @bci=50, line=100 (Interpreted frame)
 - org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection() @bci=48, line=122 (Interpreted frame)
 - org.hibernate.internal.AbstractSessionImpl$NonContextualJdbcConnectionAccess.obtainConnection() @bci=13, line=386 (Interpreted frame)
 - org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded() @bci=21, line=84 (Compiled frame)
 - org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection() @bci=5, line=109 (Compiled frame)
 - org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getConnectionForTransactionManagement() @bci=1, line=227 (Compiled frame)
 - org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.begin() @bci=2, line=234 (Interpreted frame)
 - org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.begin() @bci=8, line=214 (Interpreted frame)
 - org.hibernate.engine.transaction.internal.TransactionImpl.begin() @bci=47, line=52 (Interpreted frame)
 - org.hibernate.internal.SessionImpl.beginTransaction() @bci=41, line=1525 (Interpreted frame)
 - org.hibernate.jpa.internal.TransactionImpl.begin() @bci=42, line=45 (Interpreted frame)
 - org.springframework.orm.jpa.vendor.HibernateJpaDialect.beginTransaction(javax.persistence.EntityManager, org.springframework.transaction.TransactionDefinition) @bci=138, line=189 (Interpreted frame)
 - org.springframework.orm.jpa.JpaTransactionManager.doBegin(java.lang.Object, org.springframework.transaction.TransactionDefinition) @bci=148, line=380 (Interpreted frame)
 - org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(org.springframework.transaction.TransactionDefinition) @bci=204, line=377 (Interpreted frame)
 - org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(org.springframework.transaction.PlatformTransactionManager, org.springframework.transaction.interceptor.TransactionAttribute, java.lang.String) @bci=37, line=464 (Interpreted frame)
 - org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(java.lang.reflect.Method, java.lang.Class, org.springframework.transaction.interceptor.TransactionAspectSupport$InvocationCallback) @bci=51, line=277 (Interpreted frame)
 - org.springframework.transaction.interceptor.TransactionInterceptor.invoke(org.aopalliance.intercept.MethodInvocation) @bci=40, line=96 (Compiled frame)
 - org.springframework.aop.framework.ReflectiveMethodInvocation.proceed() @bci=101, line=179 (Compiled frame)
 - org.springframework.aop.framework.JdkDynamicAopProxy.invoke(java.lang.Object, java.lang.reflect.Method, java.lang.Object[]) @bci=385, line=213 (Compiled frame)
 - com.sun.proxy.$Proxy107.backgroundJob(java.util.Map) @bci=16 (Interpreted frame)
 - com.example.backgroundjob() @bci=300, line=96 (Interpreted frame)
 - com.example.backgroundJobPrepare() @bci=14, line=53 (Interpreted frame)
 - sun.reflect.NativeMethodAccessorImpl.invoke0(java.lang.reflect.Method, java.lang.Object, java.lang.Object[]) @bci=0 (Compiled frame)
 - sun.reflect.NativeMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=100, line=62 (Compiled frame)
 - sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=6, line=43 (Compiled frame)
 - java.lang.reflect.Method.invoke(java.lang.Object, java.lang.Object[]) @bci=56, line=498 (Compiled frame)
 - org.springframework.scheduling.support.ScheduledMethodRunnable.run() @bci=19, line=65 (Interpreted frame)
 - org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run() @bci=4, line=54 (Interpreted frame)
 - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 (Interpreted frame)
 - java.util.concurrent.FutureTask.runAndReset() @bci=47, line=308 (Interpreted frame)
 - java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask) @bci=1, line=180 (Interpreted frame)
 - java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run() @bci=37, line=294 (Interpreted frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1149 (Interpreted frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=624 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=749 (Interpreted frame)

Comment by Diego Dupin [ 2021-02-19 ]

Even using setQueryTimeout with security manager that doesn't permit to interrupt thread possibility in fact doesn't exists. Wrong assumption.

If you use hibernate only, there is no chance that a connection that has been closed (given back to pool) is referenced somewhere else and use in some other thread for query. (that would clearly cause this).

Each lock is assured to be unlocked. It may take some time with auto-reconnection taking place, but hikary reset state of connection before putting connection back in queue. So connection cannot be given back even for validation (or for old version of hikaricp ?)

I would check if there was some connection issue around the time of the lock first.

Comment by M Z [ 2021-02-19 ]

We're using HikariCP 3.4.2 and Hibernate 5.1.0, sorry 5.4.0 was false information.

One particular job got stuck today at 2021-02-19T10:06:11.450+01:00 on our test environment and was still stuck as of writing this comment 2021-02-19T17:00:00+01:00. Stack:

Thread 20182: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=175 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=836 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node, int) @bci=67, line=870 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) @bci=17, line=1199 (Compiled frame)
 - java.util.concurrent.locks.ReentrantLock$NonfairSync.lock() @bci=21, line=209 (Compiled frame)
 - java.util.concurrent.locks.ReentrantLock.lock() @bci=4, line=285 (Compiled frame)
 - org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.setTimeout(int) @bci=4, line=1444 (Compiled frame)
 - sun.reflect.GeneratedMethodAccessor59.invoke(java.lang.Object, java.lang.Object[]) @bci=114 (Compiled frame)
 - sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=6, line=43 (Compiled frame)
 - java.lang.reflect.Method.invoke(java.lang.Object, java.lang.Object[]) @bci=56, line=498 (Compiled frame)
 - org.mariadb.jdbc.internal.failover.impl.MastersSlavesListener.invoke(java.lang.reflect.Method, java.lang.Object[]) @bci=142, line=233 (Compiled frame)
 - org.mariadb.jdbc.internal.failover.FailoverProxy.executeInvocation(java.lang.reflect.Method, java.lang.Object[], boolean) @bci=17, line=301 (Compiled frame)
 - org.mariadb.jdbc.internal.failover.FailoverProxy.invoke(java.lang.Object, java.lang.reflect.Method, java.lang.Object[]) @bci=1500, line=294 (Compiled frame)
 - com\.sun\.proxy\.\$Proxy55.run(int) @bci=19 (Compiled frame)
 - org.mariadb.jdbc.MariaDbConnection.setNetworkTimeout(java.util.concurrent.Executor, int) @bci=72, line=1669 (Compiled frame)
 - com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(java.sql.Connection, long) @bci=15, line=556 (Compiled frame)
 - com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(java.sql.Connection) @bci=6, line=152 (Compiled frame)
 - com.zaxxer.hikari.pool.HikariPool.getConnection(long) @bci=76, line=185 (Compiled frame)
 - com.zaxxer.hikari.pool.HikariPool.getConnection() @bci=5, line=161 (Compiled frame)
 - com.zaxxer.hikari.HikariDataSource.getConnection() @bci=50, line=100 (Compiled frame)
 - org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection() @bci=48, line=122 (Compiled frame)
 - org.hibernate.internal.AbstractSessionImpl$NonContextualJdbcConnectionAccess.obtainConnection() @bci=13, line=386 (Compiled frame)
 - org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded() @bci=21, line=84 (Compiled frame)
 - org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection() @bci=5, line=109 (Compiled frame)
 - org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getConnectionForTransactionManagement() @bci=1, line=227 (Compiled frame)
 - org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.begin() @bci=2, line=234 (Compiled frame)
 - org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.begin() @bci=8, line=214 (Compiled frame)
 - org.hibernate.engine.transaction.internal.TransactionImpl.begin() @bci=47, line=52 (Compiled frame)
 - org.hibernate.internal.SessionImpl.beginTransaction() @bci=41, line=1525 (Compiled frame)
 - org.hibernate.jpa.internal.TransactionImpl.begin() @bci=42, line=45 (Compiled frame)
 - org.springframework.orm.jpa.vendor.HibernateJpaDialect.beginTransaction(javax.persistence.EntityManager, org.springframework.transaction.TransactionDefinition) @bci=138, line=189 (Compiled frame)
 - org.springframework.orm.jpa.JpaTransactionManager.doBegin(java.lang.Object, org.springframework.transaction.TransactionDefinition) @bci=148, line=380 (Compiled frame)
 - org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(org.springframework.transaction.TransactionDefinition) @bci=204, line=377 (Compiled frame)
 - org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(org.springframework.transaction.PlatformTransactionManager, org.springframework.transaction.interceptor.TransactionAttribute, java.lang.String) @bci=37, line=464 (Compiled frame)
 - org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(java.lang.reflect.Method, java.lang.Class, org.springframework.transaction.interceptor.TransactionAspectSupport$InvocationCallback) @bci=51, line=277 (Compiled frame)
 - org.springframework.transaction.interceptor.TransactionInterceptor.invoke(org.aopalliance.intercept.MethodInvocation) @bci=40, line=96 (Compiled frame)
 - org.springframework.aop.framework.ReflectiveMethodInvocation.proceed() @bci=101, line=179 (Compiled frame)
 - org.springframework.aop.framework.JdkDynamicAopProxy.invoke(java.lang.Object, java.lang.reflect.Method, java.lang.Object[]) @bci=385, line=213 (Compiled frame)
 - com\.sun\.proxy\.\$Proxy130.run(java.util.Map, java.util.Set) @bci=20 (Compiled frame)
 - com.example.run(Compiled frame)
 - com.example.run(Interpreted frame)
 - com.example.run() @bci=46, line=116 (Interpreted frame)
 - com.example.run(Interpreted frame)
 - org.springframework.cglib.proxy.MethodProxy.invoke(java.lang.Object, java.lang.Object[]) @bci=19, line=204 (Interpreted frame)
 - org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint() @bci=19, line=737 (Interpreted frame)
 - org.springframework.aop.framework.ReflectiveMethodInvocation.proceed() @bci=19, line=157 (Compiled frame)
 - org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed() @bci=9, line=84 (Interpreted frame)
 - com.example.run(Interpreted frame)
 - sun.reflect.NativeMethodAccessorImpl.invoke0(java.lang.reflect.Method, java.lang.Object, java.lang.Object[]) @bci=0 (Compiled frame)
 - sun.reflect.NativeMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=100, line=62 (Compiled frame)
 - sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=6, line=43 (Compiled frame)
 - java.lang.reflect.Method.invoke(java.lang.Object, java.lang.Object[]) @bci=56, line=498 (Compiled frame)
 - org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(java.lang.Object[]) @bci=36, line=627 (Interpreted frame)
 - org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(org.aspectj.lang.JoinPoint, org.aspectj.weaver.tools.JoinPointMatch, java.lang.Object, java.lang.Throwable) @bci=10, line=616 (Interpreted frame)
 - org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(org.aopalliance.intercept.MethodInvocation) @bci=58, line=70 (Interpreted frame)
 - org.springframework.aop.framework.ReflectiveMethodInvocation.proceed() @bci=85, line=168 (Compiled frame)
 - org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(org.aopalliance.intercept.MethodInvocation) @bci=18, line=92 (Interpreted frame)
 - org.springframework.aop.framework.ReflectiveMethodInvocation.proceed() @bci=101, line=179 (Compiled frame)
 - org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(java.lang.Object, java.lang.reflect.Method, java.lang.Object[], org.springframework.cglib.proxy.MethodProxy) @bci=120, line=672 (Interpreted frame)
 - com.example.run() @bci=31 (Interpreted frame)
 - sun.reflect.NativeMethodAccessorImpl.invoke0(java.lang.reflect.Method, java.lang.Object, java.lang.Object[]) @bci=0 (Compiled frame)
 - sun.reflect.NativeMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=100, line=62 (Compiled frame)
 - sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=6, line=43 (Compiled frame)
 - java.lang.reflect.Method.invoke(java.lang.Object, java.lang.Object[]) @bci=56, line=498 (Compiled frame)
 - org.springframework.scheduling.support.ScheduledMethodRunnable.run() @bci=19, line=65 (Interpreted frame)
 - org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run() @bci=4, line=54 (Interpreted frame)
 - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 (Compiled frame)
 - java.util.concurrent.FutureTask.runAndReset() @bci=47, line=308 (Compiled frame)
 - java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask) @bci=1, line=180 (Compiled frame)
 - java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run() @bci=37, line=294 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1149 (Interpreted frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=624 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=749 (Interpreted frame)

I've also attached the entire jstack dump (with packages/classes/methods from us anonymized due to company policy): https://jira.mariadb.org/secure/attachment/56290/bgthreads.dmp

Comment by M Z [ 2021-05-03 ]

Hi,

just to provide an update here:
I've suspected the Reader/Writer "load balancing" to be the culprit so I switched from "aurora" jdbc mode to "mariadb" and the deadlocks haven't re-occurred since. Not sure if this is really the cause or I've been lucky since (~1 week).

Comment by Diego Dupin [ 2021-05-17 ]

no news since, so i presume you are still lucky ?
MariaDB server doesn't use that old behavior for setTimeout, but use a dedicated command `SET STATEMENT max_statement_time=xxx FOR <query>` permitting exact timeout server side, not leaving server continuing handling query, even if client doesn't.

Since we have no Aurora server to try to reproduced that.
Either we can have some network strace (wireshark) either i'll close this issue

Comment by Diego Dupin [ 2021-05-17 ]

Since we have no Aurora server to try to reproduced that, i'll close this issue.
Feel free to create a new ticket link to this one if reproduced

Comment by M Z [ 2021-05-17 ]

Hi,

yes, this has not occurred once since we switched away from the aurora mode.
To be clear, we changed the jdbc url from: jdbc:mariadb:aurora// to jdbc:mariadb://.

Feel free to close as I am not allowed to provide wireshark traces due to company policy .

Generated at Thu Feb 08 03:18:50 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.