Details
Type: Task
Status: Closed
Priority: Critical
Resolution: Won't Fix
Description
Currently, when we need to initiate a transaction rollback from a thread which does not own the transaction, i.e. from Deadlock::report(), lock_wait_wsrep_kill(), or innobase_kill_query(), we "cancel" and release the waiting lock (lock_cancel_waiting_and_release()) and signal the thread which owns the transaction with a condition variable.
What does it mean to cancel and release a waiting lock?
First, it deletes the waiting lock from the corresponding lock hash cell and rebuilds the waiting graph, i.e. for each waiting lock in the cell it looks for a conflicting lock among the cell elements and, if such a lock is found, makes the current element wait for the found lock (lock_rec_dequeue_from_page()).
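The dequeue-and-rebuild step can be sketched with a toy model. This is not InnoDB's code: the `Lock` struct, the `Mode` enum, and `dequeue_and_rebuild()` are hypothetical names, the hash cell is reduced to a plain vector, and a waiter is assumed to be blocked only by conflicting locks that precede it in the queue.

```cpp
#include <cstddef>
#include <vector>

// Toy model only: hypothetical names, not InnoDB's lock_t or hash cell layout.
enum class Mode { Shared, Exclusive };

struct Lock {
    int  trx_id;        // owning transaction
    Mode mode;
    bool waiting;       // true while the lock request is not granted
    int  wait_for_trx;  // transaction we wait for, -1 if none
};

// Two locks conflict if they belong to different transactions
// and at least one of them is exclusive.
bool conflicts(const Lock& a, const Lock& b) {
    if (a.trx_id == b.trx_id) return false;
    return a.mode == Mode::Exclusive || b.mode == Mode::Exclusive;
}

// Remove the lock at index `victim`, then re-evaluate every remaining waiter:
// point it at the first conflicting lock ahead of it, or grant it if none exists.
void dequeue_and_rebuild(std::vector<Lock>& cell, std::size_t victim) {
    cell.erase(cell.begin() + static_cast<std::ptrdiff_t>(victim));
    for (std::size_t i = 0; i < cell.size(); ++i) {
        if (!cell[i].waiting) continue;
        int blocker = -1;
        for (std::size_t j = 0; j < i; ++j) {
            if (conflicts(cell[i], cell[j])) {
                blocker = cell[j].trx_id;
                break;
            }
        }
        cell[i].wait_for_trx = blocker;
        if (blocker < 0) cell[i].waiting = false;  // nothing blocks it: grant
    }
}
```

In this model, removing a waiting exclusive request that sits between a granted shared lock and a waiting shared request lets the latter be granted immediately, which is the effect the immediate release relies on.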
Then it sets trx->error_state if necessary and signals the waiting thread with a condition variable (see lock_wait_end()). It is supposed that either the waiting thread is already blocked in lock_wait() or it will invoke lock_wait(), which will return the corresponding error code, which lets the caller initiate rollback. Rollback, in turn, releases all locks the transaction holds.
We could simplify the above. We could just assign the corresponding value to trx->error_state and signal the transaction thread with the condition variable, letting the transaction thread do everything necessary to cancel and release its locks.
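A minimal sketch of that proposed flow, assuming a much simplified transaction object. `Trx`, `Err`, `request_rollback()`, `wait_for_lock()`, and `rollback()` are hypothetical stand-ins for illustration, not the InnoDB functions named above. The foreign thread only records the error and signals; the owning thread notices the error when it waits (or is already waiting) and releases its own locks during rollback.

```cpp
#include <condition_variable>
#include <mutex>
#include <vector>

// Hypothetical stand-ins for InnoDB's error codes and trx_t; illustration only.
enum class Err { Success, Deadlock, Interrupted };

struct Trx {
    std::mutex              m;
    std::condition_variable cv;
    Err                     error_state  = Err::Success;
    bool                    lock_granted = false;
    std::vector<int>        locks;  // ids of granted and waiting locks
};

// Called from a thread that does NOT own the transaction:
// only record the reason and wake the owner; no lock is touched here.
void request_rollback(Trx& trx, Err reason) {
    {
        std::lock_guard<std::mutex> guard(trx.m);
        trx.error_state = reason;
    }
    trx.cv.notify_one();
}

// Called by the owning thread. Returns as soon as the lock is granted or an
// error was set, including the case where the error was set before the call.
Err wait_for_lock(Trx& trx) {
    std::unique_lock<std::mutex> guard(trx.m);
    trx.cv.wait(guard, [&trx] {
        return trx.lock_granted || trx.error_state != Err::Success;
    });
    return trx.error_state;
}

// Owner-side rollback: the transaction itself cancels and releases its locks.
void rollback(Trx& trx) {
    std::lock_guard<std::mutex> guard(trx.m);
    trx.locks.clear();
}
```

Because the wait uses a predicate, it does not matter whether the signal arrives before or after the owner starts waiting. Note that in this sketch the waiting lock stays in `locks` until `rollback()` runs.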
Issue Links
- relates to
  - MDEV-29323 Galera ha_abort_transaction is not honored if there are no InnoDB lock conflicts (Open)
  - MDEV-29622 Wrong assertions in lock_cancel_waiting_and_release() for deadlock resolving caller (Closed)
-
After spending a lot of time debugging test failures, I came to understand that MDEV-29860 is not a good idea, mostly because waiting locks can also be waited-for ones. If we delay releasing a lock until its owning transaction rolls back or commits, then the locks which are waiting for that lock must wait for that rollback or commit too. This can impact performance. Besides, there can be cases when some transaction tries to grant a lock to a transaction which is expected to be rolled back due to interruption or deadlock, because its lock is still "waiting" until rollback cancels the wait. The waiting queue would not reflect the real waiting status of the transactions, as some of them could be in the process of delayed lock release if MDEV-29860 were implemented. This would also require additional checks when we rebuild the waiting queue. The benefits are doubtful, as simplifying the code in one place complicates it in other places. I now think I was wrong when I proposed that simplification.
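The queue problem can be illustrated with a toy model. The types and the `doomed` flag below are hypothetical, not InnoDB structures. A waiting lock whose owner is already marked for rollback still sits in the queue and blocks the requests behind it, unless every grant check learns to skip such locks.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy types, not InnoDB's lock_t/trx_t.
struct QueuedLock {
    int  trx_id;
    bool exclusive;
    bool waiting;
    bool doomed;  // owner has an error set but has not rolled back yet
};

bool conflict(const QueuedLock& a, const QueuedLock& b) {
    return a.trx_id != b.trx_id && (a.exclusive || b.exclusive);
}

// True if the waiting lock at index i has no conflicting lock ahead of it.
// skip_doomed models the extra check a delayed-release scheme would need.
bool grantable(const std::vector<QueuedLock>& q, std::size_t i, bool skip_doomed) {
    for (std::size_t j = 0; j < i; ++j) {
        if (skip_doomed && q[j].doomed) continue;
        if (conflict(q[i], q[j])) return false;
    }
    return true;
}
```

With a queue of a granted shared lock (T1), a doomed waiting exclusive request (T2), and a waiting shared request (T3), T3 is not grantable without the extra check, although T2 will never actually take the lock.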