[MDEV-26720] Suboptimal translation of single-bit std::atomic::fetch_or() and fetch_and() Created: 2021-09-29 Updated: 2022-01-28 Resolved: 2021-10-03 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.5, 10.6 |
| Fix Version/s: | 10.5.13, 10.6.5 |
| Type: | Bug | Priority: | Major |
| Reporter: | Marko Mäkelä | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | performance |
| Environment: | IA-32, AMD64, GCC, clang (and derivatives such as Xcode, icc), Microsoft C compiler |
| Description |
|
This is actually a compiler bug. The Intel 80386 processor introduced bit-test instructions that would be the ideal translation for atomic single-bit read-modify-write operations. Alas, even the latest compilers as of this writing (GCC, clang, Microsoft C compiler) generate a loop around LOCK CMPXCHG instead of emitting LOCK BTS (for fetch_or()), LOCK BTR (for fetch_and()), or LOCK BTC (for fetch_xor()). We have several single-bit fetch_and() and fetch_or() operations that can be optimized to LOCK BTS or LOCK BTR, similar to (this fix of |