[CONJ-1011] NullPointerException when cancelling a query from another thread Created: 2022-09-19  Updated: 2022-09-30  Resolved: 2022-09-30

Status: Closed
Project: MariaDB Connector/J
Component/s: Other
Affects Version/s: 2.7.5
Fix Version/s: 2.7.7

Type: Bug Priority: Major
Reporter: Roland Praml Assignee: Diego Dupin
Resolution: Fixed Votes: 0
Labels: None
Environment:

Java 11 / Linux / JavaMelody (but should not be related)



 Description   

When cancelling a query from another thread, you might get an NPE in rare cases:

java.lang.NullPointerException
	at org.mariadb.jdbc.MariaDbStatement.skipMoreResults(MariaDbStatement.java:1187)
	at org.mariadb.jdbc.MariaDbStatement.cancel(MariaDbStatement.java:981)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at net.bull.javamelody.JdbcWrapper$StatementInvocationHandler.invoke(JdbcWrapper.java:163)
	at net.bull.javamelody.JdbcWrapper$DelegatingInvocationHandler.invoke(JdbcWrapper.java:306)
	at com.sun.proxy.$Proxy345.cancel(Unknown Source)
	at io.ebean.util.JdbcClose.cancel(JdbcClose.java:76)

As you can see, 'protocol' is null at https://github.com/mariadb-corporation/mariadb-connector-j/blob/2.7.5/src/main/java/org/mariadb/jdbc/MariaDbStatement.java#L1187

I investigated a bit and have the following explanation for the bug:

Thread #1 starts a query

Thread #2 tries to cancel the query.
It invokes L980 protocol.cancelCurrentQuery() successfully

Thread #1 now closes the query and sets protocol = null

When Thread #2 then executes L981, which calls protocol.skip() at L1187, it throws an NPE.

Access to 'protocol' is not guarded by any lock, so the field can be modified by a concurrent thread between the two calls.
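
To illustrate the race described above, here is a minimal sketch of one common way to avoid it: read the shared field into a local variable once and use only that local copy. The class names CancelSketch and FakeProtocol are hypothetical stand-ins, not the connector's real classes, and this is not necessarily how the fix was implemented in 2.7.7.

```java
/** Hypothetical stand-in for a statement class; NOT the real MariaDbStatement. */
class CancelSketch {

    /** Hypothetical stand-in for the protocol object; it only counts calls. */
    static class FakeProtocol {
        int cancelCalls = 0;
        int skipCalls = 0;

        void cancelCurrentQuery() { cancelCalls++; }

        void skip() { skipCalls++; }
    }

    // volatile so that a write from the closing thread is visible to the cancelling thread
    private volatile FakeProtocol protocol;

    CancelSketch(FakeProtocol protocol) {
        this.protocol = protocol;
    }

    /** Called by the executing thread; mirrors the "sets protocol = null" step. */
    void close() {
        protocol = null;
    }

    /**
     * Called by the cancelling thread.
     * Returns true if the cancel was issued, false if the statement was already closed.
     */
    boolean cancel() {
        // Read the field exactly once. A racy version would dereference the
        // field twice (cancelCurrentQuery, then skip) and could see null on
        // the second read if close() ran in between.
        FakeProtocol current = protocol;
        if (current == null) {
            return false; // already closed, nothing to cancel
        }
        current.cancelCurrentQuery();
        current.skip(); // same non-null reference, even if close() ran meanwhile
        return true;
    }

    public static void main(String[] args) {
        FakeProtocol p = new FakeProtocol();
        CancelSketch statement = new CancelSketch(p);
        System.out.println(statement.cancel()); // statement still open
        statement.close();
        System.out.println(statement.cancel()); // statement already closed
    }
}
```

The local copy only prevents the NPE. The second call may still run against a protocol that the other thread has already closed, so a real fix would also need to tolerate that case or serialize close() and cancel() with a lock.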

Unfortunately, I cannot provide a test. The bug is very hard to reproduce and happens only in production...

Side note: I think the bug may not happen with 3.x, because the code is totally different (but we have not updated to 3.x yet).



 Comments   
Comment by Diego Dupin [ 2022-09-30 ]

Thanks for the detailed explanation.
This will be corrected in the next version.
