Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.5
Description
The changes that were introduced in MDEV-15053 are sometimes causing severe performance loss when there is contention on buf_pool.page_hash. At the root we have the invocation of rw_lock_lock_word_decr(lock, 1, 0)) in rw_lock_s_lock_low(). That function is looping:
UNIV_INLINE
|
bool
|
rw_lock_lock_word_decr(
|
/*===================*/
|
rw_lock_t* lock, /*!< in/out: rw-lock */ |
int32_t amount, /*!< in: amount to decrement */ |
int32_t threshold) /*!< in: threshold of judgement */ |
{
|
int32_t lock_copy = lock->lock_word;
|
|
while (lock_copy > threshold) { |
if (lock->lock_word.compare_exchange_strong( |
lock_copy,
|
lock_copy - amount,
|
std::memory_order_acquire,
|
std::memory_order_relaxed)) {
|
|
return(true); |
}
|
(void) LF_BACKOFF(); |
}
|
return(false); |
}
|
Note: the LF_BACKOFF() was added in an attempt to reduce the problem.
It feels strange that we are first loading lock_copy and then insisting that we will get an exact match on that lock_copy value. It seems that this could cause unnecessary contention between non-conflicting S-latch requests. The following naïve patch should alleviate that:
diff --git a/storage/innobase/include/sync0rw.ic b/storage/innobase/include/sync0rw.ic
|
index 18a6c69ece4..cec796c8892 100644
|
--- a/storage/innobase/include/sync0rw.ic
|
+++ b/storage/innobase/include/sync0rw.ic
|
@@ -209,20 +209,18 @@ rw_lock_lock_word_decr(
|
int32_t amount, /*!< in: amount to decrement */
|
int32_t threshold) /*!< in: threshold of judgement */
|
{
|
- int32_t lock_copy = lock->lock_word;
|
-
|
- while (lock_copy > threshold) {
|
- if (lock->lock_word.compare_exchange_strong(
|
- lock_copy,
|
- lock_copy - amount,
|
- std::memory_order_acquire,
|
- std::memory_order_relaxed)) {
|
+ for (;;)
|
+ {
|
+ int32_t lock_copy= lock->lock_word;
|
|
- return(true);
|
- }
|
- (void) LF_BACKOFF();
|
- }
|
- return(false);
|
+ if (lock_copy <= threshold)
|
+ return false;
|
+ if (lock->lock_word.compare_exchange_strong(lock_copy, lock_copy - amount,
|
+ std::memory_order_acquire,
|
+ std::memory_order_relaxed))
|
+ return true;
|
+ (void) LF_BACKOFF();
|
+ }
|
}
|
|
/******************************************************************//** |
For even better results, we might want to rewrite the S-latch acquisition, as in the following incomplete patch. At least rw_lock_x_unlock_func() would have to be adjusted as well:
diff --git a/storage/innobase/include/sync0rw.ic b/storage/innobase/include/sync0rw.ic
|
index 18a6c69ece4..32ad6497a35 100644
|
--- a/storage/innobase/include/sync0rw.ic
|
+++ b/storage/innobase/include/sync0rw.ic
|
@@ -239,8 +239,9 @@ rw_lock_s_lock_low(
|
const char* file_name, /*!< in: file name where lock requested */
|
unsigned line) /*!< in: line where requested */
|
{
|
- if (!rw_lock_lock_word_decr(lock, 1, 0)) {
|
+ if (lock->lock_word.fetch_sub(1, std::memory_order_acquire) <= 0) {
|
/* Locking did not succeed */
|
+ lock->lock_word.fetch_add(1, std::memory_order_release);
|
return(FALSE);
|
}
|
|
Also, somewhat related to this, we might want to replace the rw_lock_s_lock(l) call in buf_pool_t::page_hash_lock() with a non-inlined function call, because that code is extremely unlikely to be executed, only when the buffer pool is being resized.
Attachments
Issue Links
- relates to
-
MDEV-19929 Add a startup message about PAUSE instruction timing
- In Review
-
MDEV-22859 Possible hang in InnoDB rw_lock_lock_word_decr()
- Closed
-
MDEV-22871 Contention on the buf_pool.page_hash
- Closed
-
MDEV-14659 Innodb scalibility issue found in Mariadb code for complex 'select' queries in Arm platform
- Closed
-
MDEV-15053 Reduce buf_pool_t::mutex contention
- Closed