Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-24167

InnoDB unnecessarily uses complex rw-lock implementation

Details

    Description

      InnoDB implements its own complex rw-latch whose special features (recursive X-locks and SX-locks) are only needed for buf_block_t::lock and dict_index_t::lock.

      It would be tempting to use the lightweight wrapper mysql_rwlock_t, but unfortunately it would introduce a performance regression on Linux. On Microsoft Windows, we can use a straightforward wrapper of SRWLOCK. On Linux, a std::atomic<uint32_t> and a futex will do. On anything else, we can use mysql_rwlock_t.

      Attachments

        Issue Links

          Activity

            On my Debian GNU/Linux system, sizeof(mysql_rwlock_t) is 64 bytes, while sizeof(rw_lock_t) is 160 bytes or 88 bytes, depending on whether a debug build is being used.
            The system rw-lock does not support recursive locking. InnoDB’s reimplementation supports that for the exclusive and shared-exclusive modes (possibly, to be simplified in MDEV-24142).

            marko Marko Mäkelä added a comment - On my Debian GNU/Linux system, sizeof(mysql_rwlock_t) is 64 bytes, while sizeof(rw_lock_t) is 160 bytes or 88 bytes, depending on whether a debug build is being used. The system rw-lock does not support recursive locking. InnoDB’s reimplementation supports that for the exclusive and shared-exclusive modes (possibly, to be simplified in MDEV-24142 ).

            For contended rw-locks, such as the adaptive hash index latches, the custom implementation is measurably faster than mysql_rwlock_t. Hence, it makes sense to improve the custom implementation, in MDEV-24142.

            marko Marko Mäkelä added a comment - For contended rw-locks, such as the adaptive hash index latches, the custom implementation is measurably faster than mysql_rwlock_t . Hence, it makes sense to improve the custom implementation, in MDEV-24142 .

            For all InnoDB rw-locks except those that will be addressed in MDEV-24142, we can use a lightweight wrapper of Microsoft Windows SRWLOCK or std::atomic<uint32_t> and Linux futex. On other operating systems, we can use mysql_rwlock_t.

            marko Marko Mäkelä added a comment - For all InnoDB rw-locks except those that will be addressed in MDEV-24142 , we can use a lightweight wrapper of Microsoft Windows SRWLOCK or std::atomic<uint32_t> and Linux futex. On other operating systems, we can use mysql_rwlock_t .
            marko Marko Mäkelä added a comment - - edited

            My std::atomic<uint32_t> based implementation was improved in MDEV-24271: rd_lock() must perform compare-and-swap instead of doing fetch_add() and then backing off if a conflicting lock was held or requested. The current implementation is causing hangs if the server is heavily overloaded, or running under rr record.

            marko Marko Mäkelä added a comment - - edited My std::atomic<uint32_t> based implementation was improved in MDEV-24271 : rd_lock() must perform compare-and-swap instead of doing fetch_add() and then backing off if a conflicting lock was held or requested. The current implementation is causing hangs if the server is heavily overloaded, or running under rr record .

            To prepare for MDEV-24142 I had replaced the SRWLOCK implementation on Microsoft Windows with atomic-and-futex-based implementation. According to wlad, that did not perform well. So, I separated ssux_lock and srw_lock. The srw_lock is based on Windows SRWLOCK, or outside Windows on rw_lock_t when SRW_LOCK_DUMMY is defined). The non-recursive ssux_lock is only being used as a building block of the recursive MDEV-24142 sux_lock, which is only being used for dict_index_t::lock and buf_block_t::lock.

            marko Marko Mäkelä added a comment - To prepare for MDEV-24142 I had replaced the SRWLOCK implementation on Microsoft Windows with atomic-and-futex-based implementation . According to wlad , that did not perform well. So, I separated ssux_lock and srw_lock . The srw_lock is based on Windows SRWLOCK , or outside Windows on rw_lock_t when SRW_LOCK_DUMMY is defined). The non-recursive ssux_lock is only being used as a building block of the recursive MDEV-24142 sux_lock , which is only being used for dict_index_t::lock and buf_block_t::lock .

            Especially on Microsoft Windows (probably due to the hardware configuration rather than the operating system), we observed hung ssux_lock::update_lock(), mainly during the test mariabackup.xb_compressed_encrypted. Making ssux_lock::u_unlock() always wake up all waiters seemed to help. I suspect that the issue simply was that if multiple concurrent update_lock() ended up waiting, not all of them would be woken up. For X-locks, this problem was prevented by having a separate ‘waiting’ flag. For U-locks, we do not have such a flag.

            For some reason, the issue was not observable before we replaced the InnoDB homebrew mutex implementation with system mutexes in MDEV-24142.

            marko Marko Mäkelä added a comment - Especially on Microsoft Windows (probably due to the hardware configuration rather than the operating system), we observed hung ssux_lock::update_lock() , mainly during the test mariabackup.xb_compressed_encrypted . Making ssux_lock::u_unlock() always wake up all waiters seemed to help. I suspect that the issue simply was that if multiple concurrent update_lock() ended up waiting, not all of them would be woken up. For X-locks, this problem was prevented by having a separate ‘waiting’ flag. For U-locks, we do not have such a flag. For some reason, the issue was not observable before we replaced the InnoDB homebrew mutex implementation with system mutexes in MDEV-24142 .

            An unrelated cause of mariabackup.xb_compression_encrypted test timeouts was that the innodb_encryption_threads=4 were fighting each other and causing contention. Before MDEV-24142, the spinloop of the homebrew mutex saved the day. System mutexes seem to have significant overhead in the case of contention, at least on Microsoft Windows.

            marko Marko Mäkelä added a comment - An unrelated cause of mariabackup.xb_compression_encrypted test timeouts was that the innodb_encryption_threads=4 were fighting each other and causing contention . Before MDEV-24142 , the spinloop of the homebrew mutex saved the day. System mutexes seem to have significant overhead in the case of contention, at least on Microsoft Windows.

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.