Details

    Description

      lock_sys is one of three major InnoDB scalability bottlenecks. Scalability issues are especially obvious under sysbench OLTP update index/non-index benchmarks.

      It is not yet clear how exactly it should be optimised.

      Attachments

        Issue Links

          Activity

            marko Marko Mäkelä added a comment -

            Also, srv_slot_t can be removed and the locality of reference improved by storing trx->lock.wait_lock and trx->lock.cond at adjacent addresses.
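
            A minimal C++ sketch of that idea (member names and types here are illustrative, not the actual MariaDB definitions): keep the pointer to the waited-for lock and the condition variable that the waiting thread sleeps on in adjacent members of the per-transaction lock state, so that a lock-wait wakeup touches one cache line instead of a separate srv_slot_t array entry.

            ```cpp
            #include <condition_variable>
            #include <mutex>

            struct lock_t;  // InnoDB lock object (opaque here)

            // Hypothetical per-transaction lock-wait state: the wait_lock pointer and
            // the condition variable used to suspend the waiting thread are adjacent,
            // so both are likely to share a cache line.
            struct trx_lock_t {
              lock_t *wait_lock = nullptr;  // lock this transaction is waiting for, or nullptr
              std::condition_variable cond; // signalled when the wait ends (grant, timeout, deadlock)
              std::mutex mutex;             // protects wait_lock and pairs with cond
            };
            ```
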
            marko Marko Mäkelä added a comment - edited

            zhaiwx1987, I adapted the MDEV-11392 idea from MySQL Bug #72948, but I introduced a single counter dict_table_t::n_lock_x_or_s. There is actually quite a bit of room for improvement in lock_sys, in addition to what was done in MySQL 8.0.21 WL#10314.
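
            A hedged sketch of the counter idea (identifiers other than dict_table_t::n_lock_x_or_s are illustrative, and the real lock_table() logic is far more involved): count the S/X table locks so that the common case, an IS or IX request on a table that has no S or X lock at all, can skip scanning the table's lock queue. The check would of course only be meaningful under the appropriate lock_sys latching.

            ```cpp
            #include <atomic>
            #include <cstdint>

            enum lock_mode { LOCK_IS, LOCK_IX, LOCK_S, LOCK_X };

            struct dict_table_t {
              // Number of LOCK_S/LOCK_X table locks; incremented when such a lock is
              // created and decremented when it is released.
              std::atomic<uint32_t> n_lock_x_or_s{0};
            };

            // Fast path: an IS or IX request can only conflict with an S or X table
            // lock, so if none exists, the request can be granted without walking the
            // table's lock queue.
            bool lock_table_can_skip_queue(const dict_table_t &table, lock_mode mode)
            {
              return (mode == LOCK_IS || mode == LOCK_IX) &&
                     table.n_lock_x_or_s.load(std::memory_order_relaxed) == 0;
            }
            ```
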

            marko Marko Mäkelä added a comment - edited

            The lock_wait() refactoring was causing some assertion failures in the start/stop que_thr_t bookkeeping. I think that it is simplest to remove that bookkeeping along with some unnecessary data members or enum values. Edit: This was done in MDEV-24671. As an added bonus, innodb_lock_wait_timeout is enforced in a more timely fashion (no extra 1-second delay).

            It turns out that the partitioned lock_sys.mutex will not work efficiently with the old DeadlockChecker. It must be refactored, similar to what was done in Oracle Bug #29882690 in MySQL 8.0.18.


            marko Marko Mäkelä added a comment -

            As a minimal change, I moved the DeadlockChecker::search() invocation to lock_wait(). A separate deadlock checker thread or task might still be useful. For that, I do not think that there is a need to introduce any blocking_trx data member. In our code, it should be safe to follow the chain of trx->lock.wait_lock->trx while holding lock_sys.wait_mutex and possibly also trx->mutex.
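
            A simplified C++ sketch of that chain walk (the types are illustrative stand-ins; in particular, real InnoDB finds the conflicting request by scanning the lock queue rather than reading a stored pointer): starting from a waiting transaction, repeatedly hop to the transaction that blocks it while lock_sys.wait_mutex is held, so the wait-for edges cannot change mid-walk; if the walk returns to the starting transaction, a deadlock cycle exists.

            ```cpp
            struct trx_t;

            // Illustrative model only: a waiting lock request remembers which granted
            // request it has to wait behind. Real InnoDB derives this from the record
            // or table lock queue instead of storing a blocking_trx pointer.
            struct lock_t {
              trx_t  *trx;                    // transaction that owns this request
              lock_t *conflicting = nullptr;  // granted request blocking this one (waiting requests only)
            };

            struct trx_t {
              struct { lock_t *wait_lock = nullptr; } lock;  // pending request, or nullptr
            };

            // Follow the chain trx -> lock.wait_lock -> conflicting -> trx ... while
            // the caller holds lock_sys.wait_mutex, so these pointers are stable.
            // Returns true if the chain closes into a cycle involving `start`.
            bool has_deadlock_cycle(const trx_t *start, unsigned max_steps = 10000)
            {
              const trx_t *trx = start;
              for (unsigned i = 0; i < max_steps; i++) {
                const lock_t *wait = trx->lock.wait_lock;
                if (!wait || !wait->conflicting)
                  return false;                // chain ended at a non-waiting transaction
                trx = wait->conflicting->trx;  // hop to the blocking transaction
                if (trx == start)
                  return true;                 // deadlock: the wait-for chain is a cycle
              }
              return false;                    // give up on unreasonably long chains
            }
            ```
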

            marko Marko Mäkelä added a comment -

            We replaced lock_sys.mutex with a lock_sys.latch (MDEV-24167) that is 4 or 8 bytes on Linux, Microsoft Windows or OpenBSD. On other systems, a native rw-lock or a mutex and two condition variables will be used.
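
            An illustrative C++ fragment of that size trade-off (not the actual srw_lock implementation, which also packs a writer flag and implements the acquire and release protocols): on systems with a futex-like "wait on address" primitive, all of the rw-latch state can live in a single 32-bit word, while elsewhere a native rw-lock, which is typically much larger, has to be used.

            ```cpp
            #include <cstdint>
            #if defined __linux__ || defined _WIN32 || defined __OpenBSD__
            # include <atomic>
            #else
            # include <shared_mutex>
            #endif

            #if defined __linux__ || defined _WIN32 || defined __OpenBSD__
            // Word-sized latch: readers count and writer flag packed into 32 bits;
            // blocking would go through futex(2), WaitOnAddress(), or the OpenBSD
            // futex-like system call.
            struct lock_sys_latch { std::atomic<uint32_t> word{0}; };
            static_assert(sizeof(lock_sys_latch) == 4, "fits in one machine word");
            #else
            // Fallback: a native rw-lock (or a mutex plus two condition variables),
            // considerably larger than one word.
            using lock_sys_latch = std::shared_mutex;
            #endif
            ```
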

            The entire world of transactional locks can be stopped by acquiring lock_sys.latch in exclusive mode.

            Scalability is achieved by having most operations use a combination of a shared lock_sys.latch and a lock-specific dict_table_t::lock_mutex or lock_sys_t::hash_latch that is embedded in each cache line of the lock_sys.rec_hash, lock_sys.prdt_hash, or lock_sys.prdt_page_hash, as sketched below. The lock_sys_t::hash_latch is always 4 or 8 bytes. On systems other than Linux, OpenBSD, and Microsoft Windows, lock_sys_t::hash_latch::release() will always acquire a mutex and signal a condition variable. This is a known scalability bottleneck that could be improved further on such systems by splitting the mutex and condition variable. (If such systems supported a lightweight mutex that is at most sizeof(void*), then we could happily use that.)
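
            A sketch of that layout and latching combination (sizes, the cell count, and the spinning release are simplified stand-ins for the real lock_sys definitions): each cache line of the record-lock hash array carries its own word-sized latch, and a thread either takes the global latch in exclusive mode, or takes it in shared mode together with the latch of the one cache line it needs, so that unrelated cache lines can be modified in parallel.

            ```cpp
            #include <atomic>
            #include <cstddef>
            #include <cstdint>
            #include <shared_mutex>

            struct lock_t;  // InnoDB lock object (opaque here)

            static constexpr size_t CACHE_LINE = 64;

            // One cache line of a lock hash table: a small hash_latch followed by as
            // many hash buckets as fit in the rest of the line.
            struct alignas(CACHE_LINE) hash_cell_line {
              std::atomic<uint32_t> latch{0};  // 0 = unlocked, 1 = locked
              lock_t *cells[(CACHE_LINE - sizeof(std::atomic<uint32_t>)) / sizeof(lock_t *)];
            };

            struct lock_sys_t {
              std::shared_mutex latch;        // stand-in for lock_sys.latch
              hash_cell_line rec_hash[1024];  // stand-in for lock_sys.rec_hash
            };

            // Common case: shared global latch plus the per-cache-line latch that
            // covers the bucket of interest.
            void modify_one_bucket(lock_sys_t &lock_sys, size_t bucket)
            {
              std::shared_lock<std::shared_mutex> g(lock_sys.latch);
              hash_cell_line &line = lock_sys.rec_hash[bucket % 1024];
              for (uint32_t expected = 0;
                   !line.latch.compare_exchange_weak(expected, 1,
                                                     std::memory_order_acquire);
                   expected = 0)
                ;  // spin; the real hash_latch sleeps via futex or a condition variable
              // ... insert, remove or rearrange lock_t objects in this cache line ...
              line.latch.store(0, std::memory_order_release);
            }

            // Rare case: exclusive global latch stops all other lock_sys activity,
            // so no per-cache-line or per-table latches are needed at all.
            void stop_the_world_operation(lock_sys_t &lock_sys)
            {
              std::unique_lock<std::shared_mutex> g(lock_sys.latch);
              // ... may traverse and modify any hash cell or table lock list ...
            }
            ```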

            Until MDEV-24738 has been fixed, the deadlock detector will remain a significant bottleneck, because each lock_wait() acquires lock_sys.latch in exclusive mode. This bottleneck can be avoided by setting innodb_deadlock_detect=OFF.


            People

              marko Marko Mäkelä
              svoj Sergey Vojtovich
              Votes: 2
              Watchers: 12

