Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-26769

InnoDB internal latches do not support hardware lock elision

Details

    Description

      The synchronization primitives implemented in operating system libraries such as GNU libc may support lock elision via transactional memory when supported by the underlying hardware. The native primitives in InnoDB are currently missing that.

      Lock elision could avoid unnecessary memory cache pollution in case a mutex is not contended. For example, when reading or sometimes even writing buf_pool.page_hash or lock_sys.rec_hash, we could avoid modifying cache lines only for the purpose of acquiring and releasing latches.

      Attachments

        Issue Links

          Activity

            The throughput and latency are fluctuating a lot in our performance testing. We’d better complete MDEV-26779, MDEV-26827, MDEV-26828 first.

            marko Marko Mäkelä added a comment - The throughput and latency are fluctuating a lot in our performance testing. We’d better complete MDEV-26779 , MDEV-26827 , MDEV-26828 first.
            axel Axel Schwenke added a comment -

            MDEV-26769-1.pdf shows no regression between mariadb-10.6-d6a3f425ee2 and
            mariadb-10.6-MDEV-26769-4d33865d9eb

            axel Axel Schwenke added a comment - MDEV-26769-1.pdf shows no regression between mariadb-10.6-d6a3f425ee2 and mariadb-10.6- MDEV-26769 -4d33865d9eb

            Note: MDEV-26769-1.pdf does show a small regression for write workloads. That branch included also MDEV-26826, MDEV-26827, MDEV-26828. Further tests indicated that MDEV-26827 in its current form is not improving performance, but slightly reducing it. So, we postponed that improvement.

            marko Marko Mäkelä added a comment - Note: MDEV-26769-1.pdf does show a small regression for write workloads. That branch included also MDEV-26826 , MDEV-26827 , MDEV-26828 . Further tests indicated that MDEV-26827 in its current form is not improving performance, but slightly reducing it. So, we postponed that improvement.
            marko Marko Mäkelä added a comment - - edited

            If rr replay fails to work, and

            grep -lw rtm /proc/cpuinfo
            

            returns a nonempty result on your IA-32 or AMD64 processor, you should disable the RTM detection. In rr 5.5.0 that seems to happen automatically, and you should not see a startup message like this in the server error log that was written during rr record:

            2021-11-01  8:22:55 0 [Note] InnoDB: Using transactional memory
            

            In rr 5.4.0 you may have to explicitly enable the CPUID faulting:

            ./mtr --rr=--disable-cpuid-features-ext=0x800
            

            or

            rr record --disable-cpuid-features-ext=0x800 sql/mariadbd
            

            In a virtual machine that runs Ubuntu 20.04, this caused an error:

            [FATAL /home/roc/rr/rr/src/RecordSession.cc:2257:RecordSession() errno: EOPNOTSUPP] CPUID faulting required to disable CPUID features
            

            To work around this, you may disable the CPUID check in the source code instead:

            diff --git a/storage/innobase/sync/srw_lock.cc b/storage/innobase/sync/srw_lock.cc
            index b54191d91b0..d3da62f67ab 100644
            --- a/storage/innobase/sync/srw_lock.cc
            +++ b/storage/innobase/sync/srw_lock.cc
            @@ -41,12 +41,14 @@ bool transactional_lock_enabled()
             bool have_transactional_memory;
             bool transactional_lock_enabled()
             {
            +#if 0
               if (__get_cpuid_max(0, nullptr) < 7)
                 return false;
               unsigned eax, ebx, ecx, edx;
               __cpuid_count(7, 0, eax, ebx, ecx, edx);
               /* Restricted Transactional Memory (RTM) */
               have_transactional_memory= ebx & 1U << 11;
            +#endif
               return have_transactional_memory;
             }
             
            

            marko Marko Mäkelä added a comment - - edited If rr replay fails to work, and grep -lw rtm /proc/cpuinfo returns a nonempty result on your IA-32 or AMD64 processor, you should disable the RTM detection. In rr 5.5.0 that seems to happen automatically, and you should not see a startup message like this in the server error log that was written during rr record : 2021-11-01 8:22:55 0 [Note] InnoDB: Using transactional memory In rr 5.4.0 you may have to explicitly enable the CPUID faulting: ./mtr --rr=--disable-cpuid-features-ext=0x800 or rr record --disable-cpuid-features-ext=0x800 sql/mariadbd In a virtual machine that runs Ubuntu 20.04, this caused an error: [FATAL /home/roc/rr/rr/src/RecordSession.cc:2257:RecordSession() errno: EOPNOTSUPP] CPUID faulting required to disable CPUID features To work around this, you may disable the CPUID check in the source code instead: diff --git a/storage/innobase/sync/srw_lock.cc b/storage/innobase/sync/srw_lock.cc index b54191d91b0..d3da62f67ab 100644 --- a/storage/innobase/sync/srw_lock.cc +++ b/storage/innobase/sync/srw_lock.cc @@ -41,12 +41,14 @@ bool transactional_lock_enabled() bool have_transactional_memory; bool transactional_lock_enabled() { +#if 0 if (__get_cpuid_max(0, nullptr) < 7) return false; unsigned eax, ebx, ecx, edx; __cpuid_count(7, 0, eax, ebx, ecx, edx); /* Restricted Transactional Memory (RTM) */ have_transactional_memory= ebx & 1U << 11; +#endif return have_transactional_memory; }

            We observe a build regression on macOS:

            [ 29%] Building CXX object sql/CMakeFiles/wsrep.dir/wsrep_xid.cc.o
            In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/btr/btr0btr.cc:28:
            In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/btr0btr.h:31:
            In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/dict0dict.h:32:
            In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/dict0mem.h:45:
            In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/buf0buf.h:43:
            /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/transactional_lock_guard.h:145:14: error: 'is_write_locked' is a private member of 'rw_lock'
                  if (!m.is_write_locked())
                         ^
            /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/buf0buf.h:1540:54: note: in instantiation of member function 'transactional_shared_lock_guard<page_hash_latch>::transactional_shared_lock_guard' requested here
                transactional_shared_lock_guard<page_hash_latch> g
                                                                 ^
            /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/buf0types.h:182:25: note: constrained by private inheritance here
            class page_hash_latch : private rw_lock
                                    ^~~~~~~~~~~~~~~
            /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/rw_lock.h:225:8: note: member is declared here
              bool is_write_locked() const { return !!(value() & WRITER); }
                   ^
            In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/btr/btr0btr.cc:28:
            In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/btr0btr.h:31:
            In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/dict0dict.h:32:
            In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/dict0mem.h:45:
            In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/buf0buf.h:43:
            /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/transactional_lock_guard.h:145:12: error: cannot cast 'page_hash_latch' to its private base class 'const rw_lock'
                  if (!m.is_write_locked())
                       ^
            /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/buf0types.h:182:25: note: declared private here
            class page_hash_latch : private rw_lock
                                    ^~~~~~~~~~~~~~~
            [ 29%] Building C object storage/heap/CMake
            

            It seems to come from
            https://github.com/MariaDB/server/commit/1f02280904fcfbb2bd86404d1c85c025634f8c9d

            Partially reverting it helps:

            diff -ru a/storage/innobase/include/buf0types.h b/storage/innobase/include/buf0types.h
            --- a/storage/innobase/include/buf0types.h	2021-11-17 02:37:32.000000000 -0500
            +++ b/storage/innobase/include/buf0types.h	2021-11-17 02:38:50.000000000 -0500
            @@ -179,7 +179,7 @@
             #include "sux_lock.h"
             
             #ifdef SUX_LOCK_GENERIC
            -class page_hash_latch : private rw_lock
            +class page_hash_latch : public rw_lock
             {
               /** Wait for a shared lock */
               void read_lock_wait();
            
            

            veprbl Dmitry Kalinkin added a comment - We observe a build regression on macOS: [ 29%] Building CXX object sql/CMakeFiles/wsrep.dir/wsrep_xid.cc.o In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/btr/btr0btr.cc:28: In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/btr0btr.h:31: In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/dict0dict.h:32: In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/dict0mem.h:45: In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/buf0buf.h:43: /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/transactional_lock_guard.h:145:14: error: 'is_write_locked' is a private member of 'rw_lock' if (!m.is_write_locked()) ^ /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/buf0buf.h:1540:54: note: in instantiation of member function 'transactional_shared_lock_guard<page_hash_latch>::transactional_shared_lock_guard' requested here transactional_shared_lock_guard<page_hash_latch> g ^ /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/buf0types.h:182:25: note: constrained by private inheritance here class page_hash_latch : private rw_lock ^~~~~~~~~~~~~~~ /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/rw_lock.h:225:8: note: member is declared here bool is_write_locked() const { return !!(value() & WRITER); } ^ In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/btr/btr0btr.cc:28: In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/btr0btr.h:31: In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/dict0dict.h:32: In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/dict0mem.h:45: In file included from /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/buf0buf.h:43: /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/transactional_lock_guard.h:145:12: error: cannot cast 'page_hash_latch' to its private base class 'const rw_lock' if (!m.is_write_locked()) ^ /tmp/nix-build-mariadb-server-10.6.5.drv-0/source/storage/innobase/include/buf0types.h:182:25: note: declared private here class page_hash_latch : private rw_lock ^~~~~~~~~~~~~~~ [ 29%] Building C object storage/heap/CMake It seems to come from https://github.com/MariaDB/server/commit/1f02280904fcfbb2bd86404d1c85c025634f8c9d Partially reverting it helps: diff -ru a/storage/innobase/include/buf0types.h b/storage/innobase/include/buf0types.h --- a/storage/innobase/include/buf0types.h 2021-11-17 02:37:32.000000000 -0500 +++ b/storage/innobase/include/buf0types.h 2021-11-17 02:38:50.000000000 -0500 @@ -179,7 +179,7 @@ #include "sux_lock.h" #ifdef SUX_LOCK_GENERIC -class page_hash_latch : private rw_lock +class page_hash_latch : public rw_lock { /** Wait for a shared lock */ void read_lock_wait();

            veprbl, thank you for the observation. I can repeat the compilation failure with the following patch and clang 13:

            diff --git a/storage/innobase/include/rw_lock.h b/storage/innobase/include/rw_lock.h
            index 0ae052fabe2..4a4eba7df0b 100644
            --- a/storage/innobase/include/rw_lock.h
            +++ b/storage/innobase/include/rw_lock.h
            @@ -22,7 +22,7 @@ this program; if not, write to the Free Software Foundation, Inc.,
             
             #if !(defined __linux__ || defined __OpenBSD__ || defined _WIN32)
             # define SUX_LOCK_GENERIC
            -#elif 0 // defined SAFE_MUTEX
            +#elif 1 // defined SAFE_MUTEX
             # define SUX_LOCK_GENERIC /* Use dummy implementation for debugging purposes */
             #endif
             
            

            Would the following patch work for you?

            diff --git a/storage/innobase/include/srw_lock.h b/storage/innobase/include/srw_lock.h
            index 54d042419ca..23d8e35aa7b 100644
            --- a/storage/innobase/include/srw_lock.h
            +++ b/storage/innobase/include/srw_lock.h
            @@ -137,6 +137,7 @@ class ssux_lock_impl final
               void destroy();
               /** @return whether any writer is waiting */
               bool is_waiting() const { return (value() & WRITER_WAITING) != 0; }
            +  bool is_write_locked() const { return rw_lock::is_write_locked(); }
             
               bool rd_lock_try() { uint32_t l; return read_trylock(l); }
               bool wr_lock_try() { return write_trylock(); }
            

            For what it is worth, https://shift.click/blog/futex-like-apis/ claims that Mac OS X supports a futex-like API. Maybe you would like to contribute a partial fix of MDEV-26476?

            marko Marko Mäkelä added a comment - veprbl , thank you for the observation. I can repeat the compilation failure with the following patch and clang 13: diff --git a/storage/innobase/include/rw_lock.h b/storage/innobase/include/rw_lock.h index 0ae052fabe2..4a4eba7df0b 100644 --- a/storage/innobase/include/rw_lock.h +++ b/storage/innobase/include/rw_lock.h @@ -22,7 +22,7 @@ this program; if not, write to the Free Software Foundation, Inc., #if !(defined __linux__ || defined __OpenBSD__ || defined _WIN32) # define SUX_LOCK_GENERIC -#elif 0 // defined SAFE_MUTEX +#elif 1 // defined SAFE_MUTEX # define SUX_LOCK_GENERIC /* Use dummy implementation for debugging purposes */ #endif Would the following patch work for you? diff --git a/storage/innobase/include/srw_lock.h b/storage/innobase/include/srw_lock.h index 54d042419ca..23d8e35aa7b 100644 --- a/storage/innobase/include/srw_lock.h +++ b/storage/innobase/include/srw_lock.h @@ -137,6 +137,7 @@ class ssux_lock_impl final void destroy(); /** @return whether any writer is waiting */ bool is_waiting() const { return (value() & WRITER_WAITING) != 0; } + bool is_write_locked() const { return rw_lock::is_write_locked(); } bool rd_lock_try() { uint32_t l; return read_trylock(l); } bool wr_lock_try() { return write_trylock(); } For what it is worth, https://shift.click/blog/futex-like-apis/ claims that Mac OS X supports a futex-like API. Maybe you would like to contribute a partial fix of MDEV-26476 ?
            veprbl Dmitry Kalinkin added a comment - - edited

            @marko , your patch doesn't fix the error for me, and I don't quite get why you are patching a different header file. At this time I can't contribute futex support for macOS, but I appreciate your kind suggestion.

            veprbl Dmitry Kalinkin added a comment - - edited @marko , your patch doesn't fix the error for me, and I don't quite get why you are patching a different header file. At this time I can't contribute futex support for macOS, but I appreciate your kind suggestion.

            veprbl, sorry, indeed it is a different header file that you had to patch. The problem that you reported is repeatable with CMAKE_BUILD_TYPE=RelWithDebInfo but not CMAKE_BUILD_TYPE=Debug, which I used. Would this work for you?

            diff --git a/storage/innobase/include/buf0types.h b/storage/innobase/include/buf0types.h
            index 2cb92a5f1df..d3ee0b42169 100644
            --- a/storage/innobase/include/buf0types.h
            +++ b/storage/innobase/include/buf0types.h
            @@ -191,10 +191,8 @@ class page_hash_latch : private rw_lock
               /** Acquire an exclusive lock */
               inline void lock();
             
            -#ifdef UNIV_DEBUG
               /** @return whether an exclusive lock is being held by any thread */
               bool is_write_locked() const { return rw_lock::is_write_locked(); }
            -#endif
             
               /** @return whether any lock is being held by any thread */
               bool is_locked() const { return rw_lock::is_locked(); }
            

            marko Marko Mäkelä added a comment - veprbl , sorry, indeed it is a different header file that you had to patch. The problem that you reported is repeatable with CMAKE_BUILD_TYPE=RelWithDebInfo but not CMAKE_BUILD_TYPE=Debug , which I used. Would this work for you? diff --git a/storage/innobase/include/buf0types.h b/storage/innobase/include/buf0types.h index 2cb92a5f1df..d3ee0b42169 100644 --- a/storage/innobase/include/buf0types.h +++ b/storage/innobase/include/buf0types.h @@ -191,10 +191,8 @@ class page_hash_latch : private rw_lock /** Acquire an exclusive lock */ inline void lock(); -#ifdef UNIV_DEBUG /** @return whether an exclusive lock is being held by any thread */ bool is_write_locked() const { return rw_lock::is_write_locked(); } -#endif /** @return whether any lock is being held by any thread */ bool is_locked() const { return rw_lock::is_locked(); }

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.