  1. MariaDB Server
  2. MDEV-21534

improve locking/waiting in log_write_up_to

Details

    Description

      On multithreaded write-intensive benchmarks with innodb_flush_log_at_trx_commit (e.g. sysbench update_index), log_write_up_to() is one of the hottest functions, and log_sys.mutex is one of the hottest mutexes.

      This is partly due to the way log_sys.flush_event is used.
      Whenever a flush is pending, a thread waits on this event, and when the event is signalled, the thread retries the flush from the start. The problem is that when the flush completes, all waiting threads are woken, which causes contention on log_sys.mutex. The result is many spurious wakeups and retries.

      The situation can be improved, e.g. by using a custom synchronization primitive instead of the InnoDB event.

      The synchronization primitive could have two operations:

      1. wait(wait_lsn)
      2. set(set_lsn)

      Here, wait blocks the current thread until set is called with set_lsn >= wait_lsn.
      Note that an occasional "pseudo-spurious" wakeup might still be necessary in order to elect a new group commit leader. But set should not wake up all of the waiting threads.
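
The wait/set semantics described above can be sketched in standalone C++ as follows. This is only an illustration under stated assumptions, not the code that was committed for this task: the names lsn_waiter and demo_partial_wakeup are hypothetical, each waiter gets its own condition variable so that set() wakes only the threads whose wait_lsn has been reached, and the occasional wakeup needed to elect a new group commit leader is not modelled.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <map>
#include <mutex>
#include <thread>

using lsn_t = std::uint64_t;

// Hypothetical primitive: wait(lsn) blocks until set() was called with a
// value >= lsn. Not the actual MariaDB implementation.
class lsn_waiter {
public:
  // Destroying the object while threads are still waiting would be a bug.
  ~lsn_waiter() { assert(waiters_.empty()); }

  // Block the calling thread until set() has been called with a value
  // >= wait_lsn. Returns immediately if that has already happened.
  void wait(lsn_t wait_lsn) {
    std::unique_lock<std::mutex> lk(mutex_);
    if (current_ >= wait_lsn) return;
    waiter self;                        // lives on the waiting thread's stack
    waiters_.emplace(wait_lsn, &self);  // kept ordered by requested LSN
    self.cv.wait(lk, [&self] { return self.done; });
  }

  // Advance the LSN and wake only the waiters whose wait_lsn <= set_lsn.
  void set(lsn_t set_lsn) {
    std::lock_guard<std::mutex> lk(mutex_);
    if (set_lsn <= current_) return;    // the LSN never moves backwards
    current_ = set_lsn;
    auto end = waiters_.upper_bound(set_lsn);
    for (auto it = waiters_.begin(); it != end; ++it) {
      it->second->done = true;
      it->second->cv.notify_one();      // one condition variable per waiter
    }
    waiters_.erase(waiters_.begin(), end);
  }

private:
  struct waiter {
    std::condition_variable cv;
    bool done = false;
  };
  std::mutex mutex_;
  lsn_t current_ = 0;
  std::multimap<lsn_t, waiter*> waiters_;
};

// Usage sketch: set(150) releases the thread waiting for LSN 100 but not
// the one waiting for LSN 200, which is released only by set(200).
bool demo_partial_wakeup() {
  lsn_waiter w;
  std::atomic<bool> a_done{false}, b_done{false};
  std::thread a([&] { w.wait(100); a_done = true; });
  std::thread b([&] { w.wait(200); b_done = true; });
  w.set(150);
  a.join();
  bool only_a = a_done && !b_done;
  w.set(200);
  b.join();
  return only_a && b_done;
}
```

Keeping the waiters ordered by LSN lets set() touch only the satisfied prefix of the queue. A single shared condition variable with notify_all() would be simpler, but it would reproduce the "wake everyone" behaviour this task is trying to avoid.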


          Activity

            wlad Vladislav Vaintroub created issue -
            wlad Vladislav Vaintroub made changes -
            Field Original Value New Value
            Issue Type Bug [ 1 ] Task [ 3 ]
            wlad Vladislav Vaintroub made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            axel Axel Schwenke made changes -
            Attachment MDEV-15058-10.5-34dafb7e3a8.ods [ 50308 ]
            axel Axel Schwenke made changes -
            Attachment MDEV-15058-10.5-34dafb7e3a8.ods [ 50308 ]
            axel Axel Schwenke made changes -
            Attachment pprepare.errlog.gz [ 50322 ]
            Attachment pprepare.stack.dump [ 50323 ]
            serg Sergei Golubchik made changes -
            wlad Vladislav Vaintroub made changes -
            issue.field.resolutiondate 2020-03-02 15:16:39.0 2020-03-02 15:16:39.458
            wlad Vladislav Vaintroub made changes -
            Fix Version/s 10.5.2 [ 24030 ]
            Fix Version/s 10.5 [ 23123 ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Closed [ 6 ]
            marko Marko Mäkelä made changes -
            serg Sergei Golubchik made changes -
            Description Typo corrected in the second paragraph ("ho" to "how"); the text is otherwise identical to the Description section above.
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 103173 ] MariaDB v4 [ 134171 ]
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -

            People

              wlad Vladislav Vaintroub
              wlad Vladislav Vaintroub
              Votes: 1
              Watchers: 7

              Dates

                Created:
                Updated:
                Resolved:
