Details
- Type: Bug
- Status: Closed
- Priority: Blocker
- Resolution: Fixed
- Version: 10.6
Description
In MDEV-24789, we are minimizing the use of lock_sys.latch. It turns out that when the acquisition of an exclusive lock_sys.latch in innobase_kill_query() is replaced with an acquisition of a shared lock_sys.latch, several tests occasionally hang:
- rpl.rpl_parallel_optimistic
- rpl.rpl_parallel_optimistic_xa_lsu_off
- rpl.rpl_parallel_optimistic_nobinlog
It seems that we can work around this bug by making innobase_kill_query() acquire an exclusive lock_sys.latch instead of a shared one. This work-around will obviously hurt performance, and I suspect that it merely reduces the probability of such hangs instead of fixing them altogether. Until this bug is fixed, we can apply the work-around whenever thd_need_wait_reports() holds.
Note: thd_need_wait_reports() holds even when replication is not in use and only the log_bin option is enabled. That condition seems to be necessary, because without it, the test binlog.rpl_parallel_optimistic would hang (that is, the lock wait would end only when innodb_lock_wait_timeout expires).
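The conditional work-around described above could be sketched roughly as follows. This is not the actual InnoDB code: std::shared_mutex (C++17) stands in for lock_sys.latch, the bool parameter stands in for the result of thd_need_wait_reports(), and the names LatchMode, kill_query_latch_mode() and kill_query_sketch() are hypothetical and exist only for this illustration.

```cpp
#include <mutex>
#include <shared_mutex>

// Stand-in for lock_sys.latch; the real latch is an InnoDB-specific type.
static std::shared_mutex lock_sys_latch;

enum class LatchMode { Shared, Exclusive };

// Hypothetical helper: pick the latch mode for the kill path.
// need_wait_reports stands in for the result of thd_need_wait_reports().
LatchMode kill_query_latch_mode(bool need_wait_reports) {
  // Work-around: fall back to the exclusive latch when wait reports are needed.
  return need_wait_reports ? LatchMode::Exclusive : LatchMode::Shared;
}

// Hypothetical, heavily simplified kill path: run cancel_wait() while
// holding the stand-in latch in the chosen mode, and report the mode used.
template <typename F>
LatchMode kill_query_sketch(bool need_wait_reports, F cancel_wait) {
  const LatchMode mode = kill_query_latch_mode(need_wait_reports);
  if (mode == LatchMode::Exclusive) {
    std::unique_lock<std::shared_mutex> guard(lock_sys_latch);
    cancel_wait();
  } else {
    std::shared_lock<std::shared_mutex> guard(lock_sys_latch);
    cancel_wait();
  }
  return mode;
}
```

With this shape, the exclusive acquisition and its performance cost are confined to sessions for which the condition holds. As noted above, that includes any server with log_bin enabled.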
Issue Links
- is caused by: MDEV-24789 Performance regression after MDEV-24671 (Closed)
- relates to: MDEV-24789 Performance regression after MDEV-24671 (Closed)
- relates to: MDEV-24948 thd_need_wait_reports() hurts performance (Open)