MariaDB Server / MDEV-26033

Race condition between buf_pool.page_hash and buffer pool resizing

Details

    Description

      The replacement of buf_pool.page_hash with a different type of hash table in MDEV-22871 introduced a race condition with buffer pool resizing.

      We have an execution trace where buf_pool.page_hash.array is changed to point to something else while page_hash_latch::read_lock() is executing. The same should also affect page_hash_latch::write_lock().
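To make the hazard concrete, here is a minimal, deliberately simplified model of the problem (all names and types below are invented for illustration; they are not the server's actual data structures). A reader caches the array pointer, the resizer swaps it out, and the reader is left holding a latch inside an array that is no longer the live table:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Invented stand-ins for page_hash_latch and buf_pool.page_hash.
struct Latch { std::atomic<int> word{0}; };

struct PageHash {
  std::atomic<std::vector<Latch>*> array{nullptr};
  std::atomic<std::size_t> n_cells{0};
};

// Reader's first step: pick the latch cell for a fold from the current array.
// The pointer it returns can dangle as soon as a resize swaps the array.
Latch* reader_pick(PageHash& h, std::size_t fold) {
  std::vector<Latch>* a = h.array.load();      // pointer read
  return &(*a)[fold % h.n_cells.load()];       // index into possibly-stale array
}

// The resizer installs a new array and hands the old one back for freeing.
// Any reader still spinning on a latch in the old array loses.
std::vector<Latch>* resize(PageHash& h, std::size_t new_cells) {
  std::vector<Latch>* old = h.array.exchange(new std::vector<Latch>(new_cells));
  h.n_cells.store(new_cells);
  return old;
}
```

The point of the sketch is only the ordering: once `reader_pick` has returned, nothing ties its latch pointer to the table that `resize` installs.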

      The wait loop currently fails to notice that buffer pool resizing is in progress. Part of the problem is that we are waiting too deep in the code:

          template<bool exclusive> page_hash_latch *lock(ulint fold)
          {
            for (;;)
            {
              auto n= n_cells;
              page_hash_latch *latch= lock_get(fold, n);
              latch->acquire<exclusive>();
              /* Our latch prevents n_cells from changing. */
              if (UNIV_LIKELY(n == n_cells))
                return latch;
              /* Retry, because buf_pool_t::resize_hash() affected us. */
              latch->release<exclusive>();
            }
          }
      

      We are actually waiting inside page_hash_latch::read_lock() or page_hash_latch::write_lock() on memory that may no longer belong to buf_pool.page_hash.array. We would need some notion of a timeout or temporary failure while the buffer pool is being resized.
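One shape such a "temporary failure" could take is sketched below (a hypothetical design, not the server's code; `try_lock_bounded`, `max_spins`, and the `resizing` flag are all invented names). The spin is bounded and a resizing flag is polled between attempts, so the blocking moves out of read_lock()/write_lock() and up to a caller that can re-read buf_pool.page_hash.array:

```cpp
#include <atomic>
#include <cassert>

// Minimal latch with a non-blocking acquire, standing in for page_hash_latch.
struct page_hash_latch {
  std::atomic<bool> locked{false};
  bool try_acquire() {
    bool expected = false;
    return locked.compare_exchange_strong(expected, true);
  }
  void release() { locked.store(false); }
};

std::atomic<bool> resizing{false};  // stand-in for a buf_pool.resizing flag

// Returns the latch on success, or nullptr when the caller should back off
// and retry from the top, because a resize may have swapped the array or the
// spin budget ran out. Nothing blocks indefinitely on possibly-stale memory.
page_hash_latch* try_lock_bounded(page_hash_latch* latch, int max_spins) {
  for (int i = 0; i < max_spins; i++) {
    if (resizing.load(std::memory_order_acquire))
      return nullptr;   // temporary failure: table may be getting swapped
    if (latch->try_acquire())
      return latch;
  }
  return nullptr;       // timeout: let the caller re-validate the array
}
```

The caller's retry loop would then re-load the array pointer on every nullptr return, instead of spinning inside the latch on memory it can no longer trust.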

      Attachments

        1. grammar.yy
          0.8 kB
          Matthias Leich

        Issue Links

          Activity

            marko Marko Mäkelä added a comment - I considered a few options for a fix:

            1. Modify the code so that it polls buf_pool.resizing. As far as I understand, this might only reduce the probability of the race condition, not prevent it completely. It would also have to be tested carefully for performance impact.
            2. Add a global rw-lock acquisition around the page_hash_latch acquisition, similar to lock_sys.latch in 10.6 (which mainly exists for purposes other than buffer pool resizing). That would become an obvious scalability bottleneck.
            3. When resizing the buffer pool, never resize the buf_pool.page_hash table. This was the chosen solution, because of its low risk and possibly improved performance in the usual case where the buffer pool is never resized.
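The invariant behind option 3 can be sketched as follows (a simplified illustration with invented names, not the actual patch): size the hash once, from the largest size the buffer pool may ever reach, and make the cell count immutable, so a buffer pool resize can never invalidate a latch pointer and the `n == n_cells` re-check in lock() becomes vacuously true:

```cpp
#include <cassert>
#include <cstddef>

// Assuming the default 16 KiB innodb_page_size for this illustration.
constexpr std::size_t PAGE_SIZE = 16384;

// A page_hash whose cell count is fixed at construction. Because n_cells is
// const, no concurrent resize can change it, and a latch picked from the
// array stays valid for as long as the table exists.
struct FixedPageHash {
  const std::size_t n_cells;  // immutable after construction
  explicit FixedPageHash(std::size_t max_pool_bytes)
    : n_cells(max_pool_bytes / PAGE_SIZE) {}
  std::size_t cell_of(std::size_t fold) const { return fold % n_cells; }
};
```

The trade-off is memory: the hash is sized for the maximum pool even when the pool is small. The comment above argues this is acceptable given the low risk, and that the common case (a pool that is never resized) may even get faster by dropping the retry logic.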

            People

              Assignee: Marko Mäkelä
              Reporter: Marko Mäkelä

