Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-32530

Race condition in lock_wait_rpl_report()

Details

    Description

      MDEV-32096 introduced an incorrect optimization to lock_wait_rpl_report(), which manifested itself as the following debug assertion failure:

      lock/lock0lock.cc:2068: dberr_t lock_wait(que_thr_t*): Assertion `!wait_lock == !trx->lock.wait_lock' failed.
      

      In the rr replay trace, the problem occurs while we are able to acquire exclusive lock_sys.latch without waiting. The following patch should fix this:

      diff --git a/storage/innobase/lock/lock0lock.cc b/storage/innobase/lock/lock0lock.cc
      index 31e02d2451a..df51ceb16d8 100644
      --- a/storage/innobase/lock/lock0lock.cc
      +++ b/storage/innobase/lock/lock0lock.cc
      @@ -1812,8 +1812,14 @@ static lock_t *lock_wait_rpl_report(trx_t *trx)
         }
         else if (!wait_lock->is_waiting())
         {
      -    wait_lock= nullptr;
      -    goto func_exit;
      +    wait_lock= trx->lock.wait_lock;
      +    if (!wait_lock)
      +      goto func_exit;
      +    if (!wait_lock->is_waiting())
      +    {
      +      wait_lock= nullptr;
      +      goto func_exit;
      +    }
         }
       
         if (wait_lock->is_table())
      

      While this function was about to enter lock_sys.wr_lock_try(), another thread had updated the lock while holding a shared lock_sys.latch. The lock_sys.latch happened to have been released by the time lock_sys.wr_lock_try() executed the std::atomic::compare_exchange_strong() on the lock word, so the exclusive lock_sys.latch was granted without waiting.

      In lock_sys_t::cancel() there is a similar lock_sys.wr_lock_try() pattern on record locks (which can be modified by other threads), but it is correctly reloading trx->lock.wait_lock after acquiring the lock_sys.latch.

      Attachments

        Issue Links

          Activity

            marko Marko Mäkelä created issue -
            marko Marko Mäkelä made changes -
            Field Original Value New Value
            marko Marko Mäkelä made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            marko Marko Mäkelä made changes -
            Assignee Marko Mäkelä [ marko ] Vladislav Lesin [ vlad.lesin ]
            Status In Progress [ 3 ] In Review [ 10002 ]

            Looks good to me.

            vlad.lesin Vladislav Lesin added a comment - Looks good to me.
            vlad.lesin Vladislav Lesin made changes -
            Status In Review [ 10002 ] Stalled [ 10000 ]
            marko Marko Mäkelä made changes -
            issue.field.resolutiondate 2023-10-24 12:15:04.0 2023-10-24 12:15:03.754
            marko Marko Mäkelä made changes -
            Fix Version/s 10.6.16 [ 29014 ]
            Fix Version/s 10.10.7 [ 29018 ]
            Fix Version/s 10.11.6 [ 29020 ]
            Fix Version/s 11.0.4 [ 29021 ]
            Fix Version/s 11.1.3 [ 29023 ]
            Fix Version/s 11.2.2 [ 29035 ]
            Fix Version/s 10.6 [ 24028 ]
            Fix Version/s 10.10 [ 27530 ]
            Fix Version/s 10.11 [ 27614 ]
            Fix Version/s 11.0 [ 28320 ]
            Fix Version/s 11.1 [ 28549 ]
            Fix Version/s 11.2 [ 28603 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            marko Marko Mäkelä made changes -

            People

              vlad.lesin Vladislav Lesin
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.