MariaDB Server / MDEV-26033

Race condition between buf_pool.page_hash and buffer pool resizing

Details

    Description

      The replacement of buf_pool.page_hash with a different type of hash table in MDEV-22871 introduced a race condition with buffer pool resizing.

      We have an execution trace in which buf_pool.page_hash.array is changed to point to a different array while page_hash_latch::read_lock() is executing. The same race should also affect page_hash_latch::write_lock().

      The wait loop currently fails to notice that buffer pool resizing is in progress. A part of the problem is that we are waiting too deep in the code:

          template<bool exclusive> page_hash_latch *lock(ulint fold)
          {
            for (;;)
            {
              auto n= n_cells;
              page_hash_latch *latch= lock_get(fold, n);
              latch->acquire<exclusive>();
              /* Our latch prevents n_cells from changing. */
              if (UNIV_LIKELY(n == n_cells))
                return latch;
              /* Retry, because buf_pool_t::resize_hash() affected us. */
              latch->release<exclusive>();
            }
          }
      

      The actual wait happens inside page_hash_latch::read_lock() or page_hash_latch::write_lock(), on memory that may no longer belong to buf_pool.page_hash.array. We would need some notion of a timeout or temporary failure while the buffer pool is being resized.
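
A minimal sketch of what "timeout or temporary failure" could look like, not the actual fix. Every name in it (sketch_latch, sketch_page_hash, resize_in_progress, lock_or_fail, max_spins) is hypothetical and does not exist in the MariaDB source. The idea is to replace the unbounded wait inside the latch with a try-lock, and to move the wait loop up to a level where a resize flag can be checked on each iteration. The sketch uses a fixed latch array and does not address how the old array would be safely reclaimed.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>

// Hypothetical stand-in for page_hash_latch: a tiny reader-writer latch
// that offers only non-blocking try-lock operations.
class sketch_latch
{
  static constexpr unsigned WRITER= 1U << 31;
  std::atomic<unsigned> word{0};
public:
  // Succeeds unless a writer currently holds the latch.
  bool try_read_lock()
  {
    unsigned w= word.load(std::memory_order_relaxed);
    while (!(w & WRITER))
      if (word.compare_exchange_weak(w, w + 1, std::memory_order_acquire,
                                     std::memory_order_relaxed))
        return true;
    return false;
  }
  // Succeeds only if there are no readers and no writer.
  bool try_write_lock()
  {
    unsigned expected= 0;
    return word.compare_exchange_strong(expected, WRITER,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed);
  }
  void read_unlock() { word.fetch_sub(1, std::memory_order_release); }
  void write_unlock() { word.store(0, std::memory_order_release); }
};

// Hypothetical stand-in for the hash table that owns the latches.
struct sketch_page_hash
{
  static constexpr size_t N_LATCHES= 64;
  // Assumption: the resizing thread sets this flag before touching the array.
  std::atomic<bool> resize_in_progress{false};
  sketch_latch latches[N_LATCHES];

  // Returns the acquired latch, or nullptr as a temporary failure when a
  // resize is in progress or max_spins attempts were exhausted. The caller
  // is expected to back off and look up the latch again.
  template<bool exclusive>
  sketch_latch *lock_or_fail(size_t fold, unsigned max_spins= 1000)
  {
    sketch_latch *latch= &latches[fold % N_LATCHES];
    for (unsigned spin= 0; spin < max_spins; spin++)
    {
      if (resize_in_progress.load(std::memory_order_acquire))
        return nullptr;
      if (exclusive ? latch->try_write_lock() : latch->try_read_lock())
      {
        // Recheck: a resize may have started while we were acquiring.
        if (!resize_in_progress.load(std::memory_order_acquire))
          return latch;
        if (exclusive)
          latch->write_unlock();
        else
          latch->read_unlock();
        return nullptr;
      }
      std::this_thread::yield();
    }
    return nullptr;
  }
};
```

A caller would treat a nullptr result as "retry later", for example by waiting for the resize to finish before calling lock_or_fail() again, instead of spinning on a latch whose memory may have been replaced.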

      Attachments

        1. grammar.yy (0.8 kB, Matthias Leich)

            People

              marko Marko Mäkelä