A race condition may occur between a transaction commit and a KILL statement that attempts to abort that transaction.
If you look carefully into the above, you can conclude that thd->free_connection() can be called concurrently with KILL/thd->awake(). Which is the bug. And it is partially fixed in THD::~THD(), that is destructor waits for KILL completion:
He is quoting this code of THD::~THD() in 10.5:
And he seems to suggest that the empty critical section should be moved to THD::free_connection(). Note: in 10.2 and 10.3, that code is slightly different:
Nevertheless, it seems that we might want to do something like
It might turn out that the else branch is not needed. The empty lock/unlock pair would of course be added to THD::free_connection().
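To illustrate why an empty lock/unlock pair is enough to wait for a concurrent KILL, here is a minimal, self-contained sketch. It is not MariaDB code: `Connection`, `Registry`, `kill_connection()`, `simulate()` and the member names are hypothetical stand-ins, with `lock_kill` playing the role of LOCK_thd_kill. It assumes that the killer looks up its target under a list lock and acquires the kill mutex before releasing that list lock.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-in for THD; none of these names are MariaDB code.
struct Connection {
  std::mutex lock_kill;         // plays the role of LOCK_thd_kill
  bool resources_valid = true;  // stands for state released by free_connection()
  bool killed = false;
};

// Hypothetical stand-in for the server's list of connections.
struct Registry {
  std::mutex list_lock;
  Connection* conn = nullptr;
};

// Models KILL: returns false if it touched an already freed connection.
bool kill_connection(Registry& reg) {
  std::unique_lock<std::mutex> list(reg.list_lock);
  Connection* c = reg.conn;
  if (!c) return true;  // target already unlinked, nothing to do
  // Assumption: the kill mutex is taken before the list lock is released.
  std::lock_guard<std::mutex> guard(c->lock_kill);
  list.unlock();
  bool ok = c->resources_valid;  // awake() would dereference state here
  c->killed = true;
  return ok;
}

// Models the connection freeing its resources.
void free_connection(Registry& reg, Connection& c) {
  {
    std::lock_guard<std::mutex> list(reg.list_lock);
    reg.conn = nullptr;  // no new KILL can find the connection
  }
  // Empty critical section: blocks until a KILL that already holds
  // lock_kill has finished with the connection.
  c.lock_kill.lock();
  c.lock_kill.unlock();
  c.resources_valid = false;  // only now is it safe to release state
}

// Runs the race repeatedly and counts how often KILL saw freed state.
int simulate(int rounds) {
  int violations = 0;
  for (int i = 0; i < rounds; ++i) {
    Connection c;
    Registry reg;
    reg.conn = &c;
    bool ok = true;
    std::thread killer([&] { ok = kill_connection(reg); });
    free_connection(reg, c);
    killer.join();
    if (!ok) ++violations;
  }
  return violations;
}
```

In this model, removing the empty lock/unlock pair from `free_connection()` would let `kill_connection()` read `resources_valid` while it is being cleared, which corresponds to the race described above.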
It might also turn out that all the Galera-specific changes need to be done in THD::free_connection(). (In that case, we would likely want to assign wsrep_rgi= NULL.)
As part of this fix, the trx_t::free() instrumentation that was modified in MDEV-22782 should be tightened: trx_t::mysql_thd and trx_t::state must be poisoned, because innobase_kill_connection() should no longer be invoked on a freed transaction of a freed connection. This should of course be validated with an RQG run similar to the one that reproduced MDEV-17092.