Details

    • 10.3.6-1, 10.4.0-1

    Description

      MySQL 8.0.0 split the InnoDB buf_pool_t::mutex. MariaDB should do something similar.

      Instead of introducing more mutexes or radically changing the latching rules of various buf_pool_t and buf_block_t data members, I think that it is possible to reduce the contention on buf_pool.mutex by other means:

      • Move more code to inline functions of buf_pool_t or buf_page_t.
      • Reduce the mutex release/reacquire dance in buf0flu.cc and buf0rea.cc.
      • Avoid repeated calls to page_id_t::fold() or page_id_t::page_id_t(); use page_id_t directly as the loop iterator (see the sketch after this list).
      • Move buf_page_t::flush_type to IORequest.
      • Split buf_page_io_complete() into separate ‘read completion’ and ‘write completion’ callbacks.
      • Avoid holding buf_pool.mutex during buf_pool.page_hash operations. Consider removing the debug field buf_page_t::in_page_hash.
      • Split operations on buf_pool.watch[] into two parts: the allocation of buf_pool.watch[] should be protected only by buf_pool.mutex, and buf_pool.page_hash only by the hash bucket latch.
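      As a concrete illustration of the loop-iterator point above, here is a minimal self-contained C++ sketch; PageId, fold() and lookup() are simplified stand-ins for illustration only, not the actual InnoDB page_id_t or buf_pool.page_hash.

        #include <cstdint>
        #include <cstdio>

        // Simplified stand-in for page_id_t; the real InnoDB type differs.
        struct PageId {
          uint32_t space, page_no;
          uint64_t fold() const { return (uint64_t{space} << 32) | page_no; }
          PageId &operator++() { ++page_no; return *this; }
          bool operator!=(const PageId &o) const
          { return space != o.space || page_no != o.page_no; }
        };

        // Hypothetical lookup that computes the hash internally, so the
        // caller never needs to call fold() itself.
        uint64_t lookup(const PageId &id) { return id.fold() % 64; }

        int main()
        {
          const uint32_t space = 5;

          // Before: a new page identifier (and a new fold value) on every step.
          for (uint32_t page_no = 0; page_no < 4; page_no++)
          {
            PageId id{space, page_no};                 // constructed each time
            std::printf("fold=%llu\n", (unsigned long long) id.fold());
          }

          // After: the page identifier itself is the loop iterator; the hash
          // is computed once per lookup, where it is actually needed.
          for (PageId id{space, 0}; id != PageId{space, 4}; ++id)
            lookup(id);
          return 0;
        }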

    Activity

      mleich Matthias Leich added a comment:

      Result of RQG testing on origin/bb-10.5-marko, commit ab5fe285276ab2dc4d87fb59a31655db51706da5:
      the binaries built from this tree were neither better nor worse than the current 10.5.

      mleich Matthias Leich added a comment:

      Result of RQG testing on origin/bb-10.5-MDEV-15053-3, commit 577ac7de0b26e595bc7bf678c25a65a0b4c92edc (2020-05-27T16:05:27+03:00):
      the binaries built from this tree were neither better nor worse than the current 10.5.

      marko Marko Mäkelä added a comment:

      kevg, after your review I fixed one more thing that probably caused the occasional hangs that axel reproduced in buf_page_init_for_read(): I had wrongly released the buf_pool.page_hash latch before buf_page_t::set_io_fix(BUF_IO_READ) was called, which was an unintended functional change from the earlier code.
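      As a rough sketch of that ordering constraint, using a std::mutex and a std::atomic flag as stand-ins for the buf_pool.page_hash latch and the I/O fix (this is not InnoDB code):

        #include <atomic>
        #include <cstdint>
        #include <mutex>
        #include <unordered_map>

        // Toy stand-in for a buffer page with an "I/O fix" flag.
        struct Page { std::atomic<bool> read_in_progress{false}; };

        std::mutex hash_latch;                 // stand-in for a page_hash latch
        std::unordered_map<uint64_t, Page *> page_hash;

        // Register a page that is about to be read in from disk.
        void init_for_read(uint64_t id, Page *page)
        {
          std::lock_guard<std::mutex> latch(hash_latch);
          page_hash.emplace(id, page);
          // Correct order: mark the read as in progress while the hash latch
          // is still held.  Releasing the latch first (the bug described
          // above) would let another thread find the page in the hash and act
          // on it before the read fix becomes visible.
          page->read_in_progress.store(true, std::memory_order_release);
        }   // latch released here, after the fix is set

        int main()
        {
          static Page p;
          init_for_read(42, &p);
          return 0;
        }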

      marko Marko Mäkelä added a comment:

      The hang was very elusive, but I finally found the reason. Now that we have removed buf_block_t::mutex, we must release buf_block_t::lock before invoking buf_block_t::unfix() or buf_block_t::io_unfix(). Otherwise, a block in the buf_pool.free list can permanently remain in an X-latched state, with block->lock.writer_thread==0. We only experienced the hang in a release build, and it never reproduced under rr record. In the end, I disabled buf_page_optimistic_get() and added debug assertions on buf_block_t::lock when the buf_pool.free list is modified. To be able to do that, I had to fix a glitch in dict_check_sys_tables().
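      The constraint can be illustrated with a toy block type (simplified stand-ins, not buf_block_t): once the fix count drops, the block may immediately be moved to the free list or reused, so any latch still held on it at that point effectively leaks into the free list.

        #include <atomic>
        #include <shared_mutex>

        // Toy stand-in for buf_block_t: a latch plus a fix count.
        struct Block {
          std::shared_mutex lock;            // stand-in for buf_block_t::lock
          std::atomic<unsigned> fix_count{1};

          void unfix()
          {
            if (fix_count.fetch_sub(1) == 1)
            {
              // Last reference gone: in the real buffer pool the block may
              // now be placed on buf_pool.free or reused for another page.
              // A latch that is still held here would remain "stuck" on a
              // free-list block.
            }
          }
        };

        void correct_order(Block &block)
        {
          block.lock.unlock();  // 1. release the page latch first ...
          block.unfix();        // 2. ... then drop the fix; the block may be freed now
        }

        void buggy_order(Block &block)
        {
          block.unfix();        // the block may already be on the free list here
          block.lock.unlock();  // too late: we touch a block we no longer own
        }

        int main()
        {
          Block b;
          b.lock.lock();        // block is X-latched while we use it
          correct_order(b);     // buggy_order() would leave b X-latched when freed
          return 0;
        }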

      marko Marko Mäkelä added a comment:

      To reduce contention, especially in read-only workloads, we increased srv_n_page_hash_locks from 16 to 64 and added LF_BACKOFF() to the spin loop in rw_lock_lock_word_decr().
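      The backoff idea can be sketched generically as follows; this mirrors the shape of a lock-word decrement loop but uses std::atomic and std::this_thread::yield() as stand-ins, and is not the actual rw_lock_lock_word_decr() or LF_BACKOFF() code.

        #include <atomic>
        #include <cstdint>
        #include <thread>

        // Stand-in for LF_BACKOFF(): in MariaDB this is a CPU "pause"-style
        // hint that reduces cache-line ping-pong between spinning threads.
        static inline void backoff() { std::this_thread::yield(); }

        // Atomically decrement lock_word by amount as long as it stays above
        // threshold; back off between failed compare-and-swap attempts.
        bool lock_word_decr(std::atomic<int32_t> &lock_word, int32_t amount,
                            int32_t threshold)
        {
          int32_t local = lock_word.load(std::memory_order_relaxed);
          while (local > threshold)
          {
            if (lock_word.compare_exchange_weak(local, local - amount,
                                                std::memory_order_acquire,
                                                std::memory_order_relaxed))
              return true;           // decremented: the latch word was acquired
            backoff();               // contention: pause before retrying
          }
          return false;              // nothing left to decrement; caller must wait
        }

        int main()
        {
          std::atomic<int32_t> lock_word{0x20000000};
          return lock_word_decr(lock_word, 1, 0) ? 0 : 1;
        }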

    People

      Assignee: marko Marko Mäkelä
      Reporter: marko Marko Mäkelä
      Votes: 1
      Watchers: 14
