[MDEV-14529] InnoDB rw-locks: optimize memory barriers Created: 2017-11-29  Updated: 2018-01-20  Resolved: 2017-12-08

Status: Closed
Project: MariaDB Server
Component/s: Server
Affects Version/s: 10.3.2
Fix Version/s: 10.3.3

Type: Bug Priority: Minor
Reporter: Sandeep sethia Assignee: Sergey Vojtovich
Resolution: Fixed Votes: 0
Labels: None
Environment:

Ubuntu 16.02 on Arm 64 bit platform


Attachments: Text File memory_order.patch    
Issue Links:
Blocks
blocks MDEV-14442 Optimization for ARM64 platform. Open
Epic Link: arm64 optimization

 Description   

Hi,

I see __atomic_compare_exchange_n function is being called with different memory order while
doing the same type of atomic operation. In spin lock ACQUIRE AND RELAX being used

but in rw_lock_lock_word_decr function

i see __ATOMIC_SEQ_CST being hard coded in the cmpxchng. I feel we can use consistenly use ACQUIRE AND RELAX memory order for all the mutex implemenation in all the platform.



 Comments   
Comment by Sergey Vojtovich [ 2017-11-29 ]

Agree. Previously SEQ_CST was needed here as a workaround. Now with https://github.com/MariaDB/server/commit/fb7caad72b5c37e96c69aad63f9589f8b56855d6 this dependency is gone.

If we find that nothing else depends on SEQ_CST, we can certainly change it.

Comment by Sergey Vojtovich [ 2017-11-29 ]

grep -r my_atomic storage/innobase/|grep -v explicit

shows that there're still plenty of SEQ_CST's issued.

Comment by Sandeep sethia [ 2017-11-29 ]

I will do sysbench and mysqlslap benchmark and propose a patch for this .
Once the patch is given we do functional test on ur regression tool and confirm if it breaks or not

I am not sure if for add and subtract function we can remove SEQ_CST ? Any thoughts

Comment by Sergey Vojtovich [ 2017-11-29 ]

I'm pretty sure that we can remove SEQ_CST everywhere. But there can be silly code relying on it. We can analyse it if you feel it is important. Personally I'm all for getting rid of SEQ_CST.

Comment by Sandeep sethia [ 2017-11-30 ]

Attached is the patch which I have developed for removing SEQ_CST in few atomic operations.

For load and store I cannot remove SEQ_CST as the __ATOMIC_ACQ_REL is invalid memory_order.patch .

Also for add functions I have given __ATOMIC_ACQ_REL memory order as only option can be given .

Let me know your comments.

Comment by Sandeep sethia [ 2017-11-30 ]

I see 3-5% gain in mysqlslap benchmark for doing complex queries and 1-2% benefit in sysbench oltp workloads.
No functional issue seen

Comment by Sergey Vojtovich [ 2017-11-30 ]

I'm afraid memory_order.patch is not acceptable. Because, according to C11, non-explicit versions of atomic functions are supposed to issue SEQ_CST. We need to fix things case by case by changing to explicit version.

Comment by Sergey Vojtovich [ 2017-12-01 ]

I retarget this task to fix InnoDB rw-locks specifically, other memory barrier issues to be fixed separately.

Comment by Sandeep sethia [ 2017-12-01 ]

You want me to modify the patch and send it once again .

Comment by Sergey Vojtovich [ 2017-12-01 ]

ssethia, I already started working on it. It is a bit complicated by different issues, so what I plan to do is:
1. change lock_word from lint to int32_t: the latter is MariaDB atomic operations friendly type
2. remove volatile modifier from lock_word: it's not good for inter-thread communication, many places need to be fixed to use atomic_load(RELAXED) instead
3. relax memory barriers

Comment by Sergey Vojtovich [ 2017-12-01 ]

marko, could you review 5 patches on top of https://github.com/MariaDB/server/commits/bb-10.3-MDEV-14529 ?

Comment by Sergey Vojtovich [ 2017-12-01 ]

There're 7 patches now, top 2 are not of interested, the rest 5 are to be reviewed.

Comment by Sergey Vojtovich [ 2017-12-07 ]

marko, sorry now there're even more revisions. Please look at those that have MDEV-14529 in their title.

Comment by Marko Mäkelä [ 2017-12-07 ]

svoj, the patches look generally OK. I posted some minor comments. OK to push after addressing those.

Comment by Sergey Vojtovich [ 2017-12-08 ]

Pushed

commit c60095a818dce92838940525899a13a05633d148
Author: Sergey Vojtovich <svoj@mariadb.org>
Date:   Thu Dec 7 18:36:01 2017 +0400
 
    Faster atomic loads and stores on windows
 
commit b04f2a0f01902d11e21afb690e14d12d2c464184
Author: Sergey Vojtovich <svoj@mariadb.org>
Date:   Fri Dec 1 15:48:03 2017 +0400
 
    MDEV-14529 - InnoDB rw-locks: optimize memory barriers
 
    Relax memory barrier for lock_word.
 
    rw_lock_lock_word_decr() - used to acquire rw-lock, thus we only need to issue
    ACQUIRE when we succeed locking.
 
    rw_lock_x_lock_func_nowait() - same as above, but used to attempt to acquire
    X-lock.
 
    rw_lock_s_unlock_func() - used to release S-lock, RELEASE is what we need here.
 
    rw_lock_x_unlock_func() - used to release X-lock. Ideally we'd need only RELEASE
    here, but due to mess with waiters (they must be loaded after lock_word is
    stored) we have to issue both ACQUIRE and RELEASE.
 
    rw_lock_sx_unlock_func() - same as above, but used to release SX-lock.
 
    rw_lock_s_lock_spin(), rw_lock_x_lock_func(), rw_lock_sx_lock_func() -
    fetch-and-store to waiters has to issue only ACQUIRE memory barrier, so that
    waiters are stored before lock_word is loaded.
 
    Note that there is violation of RELEASE-ACQUIRE protocol here, because we do
    on lock:
 
      my_atomic_fas32_explicit((int32*) &lock->waiters, 1, MY_MEMORY_ORDER_ACQUIRE);
      my_atomic_load32_explicit(&lock->lock_word, MY_MEMORY_ORDER_RELAXED);
 
    on unlock
 
      my_atomic_add32_explicit(&lock->lock_word, X_LOCK_DECR, MY_MEMORY_ORDER_ACQ_REL);
      my_atomic_load32_explicit((int32*) &lock->waiters, MY_MEMORY_ORDER_RELAXED);
 
    That is we kind of synchronize ACQUIRE on lock_word with ACQUIRE on waiters.
    It was there before this patch. Simple fix may have negative performance impact.
    Proper fix requires refactoring of lock_word.
 
commit 51bb18f989d198aa648b2481865abd0734520d35
Author: Sergey Vojtovich <svoj@mariadb.org>
Date:   Fri Dec 1 13:52:24 2017 +0400
 
    MDEV-14529 - InnoDB rw-locks: optimize memory barriers
 
    Relax memory barrier for waiters: these 2 stores must be completed before
    os_event_set() finishes. This is guaranteed by RELEASE barrier issued by
    mutex.exit() of os_event_set().
 
commit 5b624f00fc0e6fa0a5a676d3ec445f62c0fb75ec
Author: Sergey Vojtovich <svoj@mariadb.org>
Date:   Fri Dec 1 13:37:07 2017 +0400
 
    MDEV-14529 - InnoDB rw-locks: optimize memory barriers
 
    Remove volatile modifier from waiters: it's not supposed for inter-thread
    communication, use appropriate atomic operations instead.
 
    Changed waiters to int32_t, my_atomic friendly type.
 
commit 57d20f1132df71e3b9aca998bcc31dfd62c942b3
Author: Sergey Vojtovich <svoj@mariadb.org>
Date:   Fri Dec 1 13:14:42 2017 +0400
 
    MDEV-14529 - InnoDB rw-locks: optimize memory barriers
 
    Remove volatile modifier from lock_word: it's not supposed for inter-thread
    communication, use appropriate atomic operations instead.
 
commit c73e77da0fa0fbdb4be5bfa1f4e441b06d1d91f9
Author: Sergey Vojtovich <svoj@mariadb.org>
Date:   Fri Dec 1 12:02:51 2017 +0400
 
    MDEV-14529 - InnoDB rw-locks: optimize memory barriers
 
    Change lock_word from lint to int32_t: the latter is my_atomic_* friendly type.

Comment by Sergey Vojtovich [ 2017-12-08 ]

ssethia, rw-locks are now fixed in 10.3. If you still see performance improvement from your patch, then we should analyse other places where SEQ_CST is being issued.

Generated at Thu Feb 08 08:14:17 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.