[MDEV-32530] Race condition in lock_wait_rpl_report() Created: 2023-10-20 Updated: 2023-11-08 Resolved: 2023-10-24 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Locking, Storage Engine - InnoDB |
| Affects Version/s: | N/A |
| Fix Version/s: | 10.6.16, 10.10.7, 10.11.6, 11.0.4, 11.1.3, 11.2.2 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Marko Mäkelä | Assignee: | Vladislav Lesin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | race, regression, rr-profile-analyzed | ||
| Environment: |
GNU/Linux, rr 5.6.0 |
||
| Description |
|
In the rr replay trace, the problem occurs when lock_wait_rpl_report() is able to acquire the exclusive lock_sys.latch without waiting. The following patch should fix this:
While this function was about to enter lock_sys.wr_lock_try(), another thread had updated the lock while holding a shared lock_sys.latch. That thread happened to release lock_sys.latch before lock_sys.wr_lock_try() executed the std::atomic::compare_exchange_strong() on the lock word, so the exclusive lock_sys.latch was granted without waiting, even though the lock had changed in the meantime. In lock_sys_t::cancel() there is a similar lock_sys.wr_lock_try() pattern on record locks (which can be modified by other threads), but that code correctly reloads trx->lock.wait_lock after acquiring the lock_sys.latch. |
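
For illustration only, here is a minimal sketch of the pattern described above. It is not the InnoDB code and not the actual patch. `Latch`, `Trx`, `Lock`, `racy_pattern()` and `reloading_pattern()` are hypothetical stand-ins, the latch is a toy lock word with no waiting logic, and the "other thread" is simulated inline so that the interleaving is deterministic.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for illustration; these are NOT the InnoDB types.
struct Lock { int id; };

struct Trx {
  std::atomic<Lock*> wait_lock{nullptr};  // stand-in for trx->lock.wait_lock
};

// Toy latch: 0 = free, WRITER = held exclusively, otherwise a reader count.
constexpr unsigned WRITER = 1u << 31;

struct Latch {
  std::atomic<unsigned> word{0};

  // Exclusive try-lock: succeeds only if the lock word is currently 0.
  bool wr_lock_try() {
    unsigned expected = 0;
    return word.compare_exchange_strong(expected, WRITER);
  }
  void wr_unlock() { word.store(0); }

  // Simplified shared mode: assumes no writer is present (true in this sketch).
  void rd_lock() { word.fetch_add(1); }
  void rd_unlock() { word.fetch_sub(1); }
};

// Simulates another thread that changes the wait lock under a shared latch
// and releases the latch before our try-lock runs.
void simulated_concurrent_update(Latch& latch, Trx& trx, Lock* new_lock) {
  latch.rd_lock();
  trx.wait_lock.store(new_lock);
  latch.rd_unlock();
}

// Racy: the pointer is read before the latch is held and is used afterwards.
Lock* racy_pattern(Latch& latch, Trx& trx, Lock* concurrent_value) {
  Lock* snapshot = trx.wait_lock.load();
  simulated_concurrent_update(latch, trx, concurrent_value);
  if (!latch.wr_lock_try()) return nullptr;  // not reached in this sketch
  Lock* used = snapshot;                     // stale: try-lock success proves nothing
  latch.wr_unlock();
  return used;
}

// Safer: reload the pointer once the exclusive latch is actually held.
Lock* reloading_pattern(Latch& latch, Trx& trx, Lock* concurrent_value) {
  simulated_concurrent_update(latch, trx, concurrent_value);
  if (!latch.wr_lock_try()) return nullptr;  // not reached in this sketch
  Lock* used = trx.wait_lock.load();         // reload under the exclusive latch
  latch.wr_unlock();
  return used;
}
```

In this sketch the try-lock succeeds in both functions, because the simulated updater has already released the shared latch. Only the reloading variant observes the updated pointer, which is the point of the comparison with lock_sys_t::cancel().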
| Comments |
| Comment by Vladislav Lesin [ 2023-10-24 ] |
|
Looks good to me. |