Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Not a Bug
Affects versions: 2.7.2, 2.7.8
Fix versions: None
Description
We are seeing customer thread dumps showing hundreds of threads blocked in MariaDbStatement.executeInternal at line 340 (in v2.7.2), which is a lock.lock() call. Although I have no repro, I suspect the lock is not being unlocked somewhere.
EDIT: The following paragraph describes the 2.7.2 code, which appears to have been patched since; the code I mention is no longer in 2.7.8. Apologies, I was unaware. However, we are still seeing customers with hundreds of threads waiting on the locks in 2.7.8.
A code inspection shows that there are finally blocks around the code to do the unlock, but some lines in those blocks can throw a RuntimeException before the unlock occurs, in which case the unlock() call is never reached. For example, executeInternal's finally block calls executeEpilogue before it unlocks. Unfortunately, a subroutine, stopTimeoutTask(), calls Future#get(), which can throw CancellationException. That is a RuntimeException and would not be caught, preventing the unlock. Additionally, stopTimeoutTask() calls Thread.currentThread().interrupt(), which can also throw SecurityException, another RuntimeException.
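The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not connector code: `LeakedLockDemo`, `epilogue`, and `lockStillHeldAfterFailure` are hypothetical names, and a cancelled `CompletableFuture` stands in for the timeout task. It only illustrates that a `CancellationException` escaping a cleanup step placed before `unlock()` in the same finally block leaves the lock held.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantLock;

class LeakedLockDemo {
  // Hypothetical stand-in for the cleanup step: waits on a future and
  // handles only the checked exceptions that Future#get() declares.
  static void epilogue(Future<?> timerTask) {
    try {
      timerTask.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } catch (ExecutionException e) {
      // ignored
    }
  }

  // Mirrors the reported shape: cleanup runs before unlock() in one finally block.
  static boolean lockStillHeldAfterFailure() {
    ReentrantLock lock = new ReentrantLock();
    Future<?> cancelled = new CompletableFuture<Void>();
    cancelled.cancel(true); // get() on a cancelled future throws CancellationException
    try {
      lock.lock();
      try {
        // statement work would go here
      } finally {
        epilogue(cancelled); // throws, so the next line is skipped
        lock.unlock();
      }
    } catch (CancellationException e) {
      // swallowed only so the demo can inspect the lock state
    }
    return lock.isLocked();
  }

  public static void main(String[] args) {
    System.out.println("lock still held: " + lockStillHeldAfterFailure());
  }
}
```

Any other thread calling `lock.lock()` on the same lock afterwards would block indefinitely, which matches the symptom in the thread dumps. Whether the connector can actually reach this state is the open question in this ticket.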
ASK: Please ensure that unlock() is called in all the methods that do locking. Either ensure that the methods called before it in the finally block cannot throw any exception, or unlock first if possible. (EDIT: perhaps this means wrapping everything before the unlock in a try/catch, since the underlying methods do not need to declare the RuntimeExceptions they throw.)
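One way to satisfy this ask without catching and hiding exceptions is to nest the cleanup in its own try/finally, so that unlock() runs even if the cleanup throws. This is a sketch under assumptions: `SafeUnlockDemo`, `runLocked`, and the `Runnable` parameters are hypothetical and do not reflect the connector's actual method signatures.

```java
import java.util.concurrent.locks.ReentrantLock;

class SafeUnlockDemo {
  // Runs body under the lock; the epilogue may throw, but unlock() still runs
  // because it sits in the innermost finally block.
  static void runLocked(ReentrantLock lock, Runnable body, Runnable epilogue) {
    lock.lock();
    try {
      body.run();
    } finally {
      try {
        epilogue.run();
      } finally {
        lock.unlock();
      }
    }
  }

  // Exercises runLocked with an epilogue that always throws a RuntimeException.
  static boolean lockHeldAfterFailingEpilogue() {
    ReentrantLock lock = new ReentrantLock();
    Runnable failingEpilogue = () -> {
      throw new IllegalStateException("cleanup failed");
    };
    try {
      runLocked(lock, () -> {}, failingEpilogue);
    } catch (IllegalStateException e) {
      // the exception still propagates to the caller; only the lock is protected
    }
    return lock.isLocked();
  }

  public static void main(String[] args) {
    System.out.println("lock still held: " + lockHeldAfterFailingEpilogue());
  }
}
```

The nested form keeps the exception visible to the caller, whereas a broad try/catch around the cleanup would release the lock but could hide the underlying failure.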
Affected versions: 2.7.2 and 2.7.8 for sure, but also probably the versions in between and possibly earlier versions as well.
Server: MariaDB server v10.6.11
Repro: None
Just curious, are you using MySQL server or MariaDB < 10.2, where the connector implements the timeout using another thread?
I'm trying to identify what may be causing this.
This Future.get call is surrounded by a try/catch:
timerTaskFuture.get();
}
}
}
}
So I wonder how the exception could go uncaught. The catch could be made even broader to make sure there is no problem, but it should normally catch all possible exceptions thrown there.
I do not follow what you mean by that.
Since you have hundreds of threads waiting on the lock, there's probably something wrong, but right now I don't see how it can happen.
By the way, could you tell me which Java implementation you are using? (There was a threading issue with the IBM JDK a long time ago.)