[MDEV-25016] Race condition between lock_sys_t::cancel() and page split or merge Created: 2021-03-01  Updated: 2021-03-09  Resolved: 2021-03-04

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6
Fix Version/s: 10.6.0

Type: Bug Priority: Blocker
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: performance

Issue Links:
Problem/Incident
is caused by MDEV-24789 Performance regression after MDEV-24671 Closed
Relates
relates to MDEV-24789 Performance regression after MDEV-24671 Closed
relates to MDEV-24948 thd_need_wait_reports() hurts perform... Open

 Description   

In MDEV-24789, we are minimizing the use of lock_sys.latch. It turns out that when the acquisition of an exclusive lock_sys.latch in innobase_kill_query() is replaced with the acquisition of a shared lock_sys.latch, a number of tests occasionally hang:

  • rpl.rpl_parallel_optimistic
  • rpl.rpl_parallel_optimistic_xa_lsu_off
  • rpl.rpl_parallel_optimistic_nobinlog

It seems that we can work around this bug by making innobase_kill_query() acquire an exclusive lock_sys.latch instead of a shared one. This work-around will obviously hurt performance, and I suspect that it merely reduces the probability of such hangs rather than fixing them altogether. Until this bug is fixed, we can invoke the work-around whenever thd_need_wait_reports() holds.

Note: thd_need_wait_reports() holds even when no replication is being used but the option log_bin is enabled. That condition seems to be necessary: without it, the test binlog.rpl_parallel_optimistic would hang (until falling back to innodb_lock_wait_timeout).



 Comments   
Comment by Marko Mäkelä [ 2021-03-02 ]

To reproduce this, apply the following patch (the added #undef disables the HAVE_REPLICATION work-around):

diff --git a/storage/innobase/lock/lock0lock.cc b/storage/innobase/lock/lock0lock.cc
index 223783a11bc..b5a8d3b7933 100644
--- a/storage/innobase/lock/lock0lock.cc
+++ b/storage/innobase/lock/lock0lock.cc
@@ -5444,6 +5444,7 @@ void lock_sys_t::cancel(trx_t *trx, lock_t *lock)
 /** Cancel a waiting lock request (if any) when killing a transaction */
 void lock_sys_t::cancel(trx_t *trx)
 {
+#undef HAVE_REPLICATION
 #ifdef HAVE_REPLICATION /* Work around MDEV-25016 (FIXME: remove this!) */
   /* Parallel replication tests would occasionally hang if we did not
   acquire exclusive lock_sys.latch here. This is not a real fix, but a

and run the following:

./mtr --parallel=auto --repeat=100 rpl.rpl_parallel_optimistic{,_xa_lsu_off,_nobinlog}{,,,,,,,,,,,,,}

On my system, I can observe that the CPU load decreases as more and more tests hang. Finally, one of the tests times out. Without the patch, the tests complete.
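The `{,,,,,,,,,,,,,}` suffix in the mtr invocation above is Bash brace expansion: a list of 14 empty alternatives duplicates each test name 14 times, so `--parallel=auto` schedules many concurrent copies of the three tests. A smaller example:

```shell
# Brace expansion with empty alternatives repeats each word;
# {,,} has three empty elements, so each name appears three times.
echo rpl.a{,,} rpl.b{,,}
# → rpl.a rpl.a rpl.a rpl.b rpl.b rpl.b
```

Note that this is a Bash feature; it does not work in a plain POSIX sh.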

Comment by Marko Mäkelä [ 2021-03-02 ]

I am sorry for the false alarm. It turns out that my final fix of MDEV-24789 was too greedy after all. When releasing waiting record lock requests, we must hold an exclusive lock_sys.latch to prevent the waiting lock request that is being cancelled from being cloned and moved to another page during a concurrent page split or merge. This was captured afterwards in an rr replay trace by mleich.

With that problem fixed, the 3 tests pass even if I remove my workaround.

Waiting table lock requests, which are only possible in special cases, can be released while holding only a shared lock_sys.latch.

Generated at Thu Feb 08 09:34:28 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.