MariaDB Server / MDEV-21534

improve locking/waiting in log_write_up_to

Details

    Description

      On multithreaded, write-intensive benchmarks with innodb_flush_log_at_trx_commit (e.g. sysbench update_index), log_write_up_to() is one of the hottest functions, and log_sys.mutex is one of the hottest mutexes.

      This is partly due to how log_sys.flush_event is used.
      Whenever a flush is pending, a thread waits on this event, and when the event is signalled, it retries the flush from the start. The problem is that when a flush completes, all waiting threads are woken, which causes contention on log_sys.mutex. There are many spurious wakeups and retries.

      The situation can be improved, e.g. by using a custom synchronization primitive instead of an InnoDB event.

      The synchronization primitive could have 2 operations:

      1. wait(wait_lsn)
      2. set(set_lsn)

      Here, wait blocks the current thread until set is called with set_lsn >= wait_lsn.
      Note that an occasional "pseudo-spurious" wakeup might still be necessary in order to elect a new group commit leader. But set should not wake up all of the waiting threads.
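
      To make the idea concrete, here is a minimal sketch of such a primitive in portable C++. It is an illustration only, not the code in the development branch: the names lsn_wait_queue, pending() and demo_pending_after_partial_set() are hypothetical, lsn_t is assumed to be a 64-bit integer, and group commit leader election is left out entirely. Each waiter sleeps on its own condition variable, and set() notifies only the waiters whose wait_lsn has been reached.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

using lsn_t = std::uint64_t;  // assumption: LSNs are 64-bit integers

// Hypothetical sketch; not the class used in the MariaDB source tree.
class lsn_wait_queue {
  std::mutex mutex_;
  lsn_t reached_lsn_ = 0;  // highest LSN passed to set() so far
  // Pending waiters, ordered by the LSN they are waiting for.
  std::multimap<lsn_t, std::condition_variable*> waiters_;

public:
  // Block the caller until set() has been called with set_lsn >= wait_lsn.
  void wait(lsn_t wait_lsn) {
    std::unique_lock<std::mutex> lock(mutex_);
    if (reached_lsn_ >= wait_lsn) return;
    std::condition_variable cv;  // private to this waiter
    waiters_.emplace(wait_lsn, &cv);
    // The predicate absorbs spurious wakeups of the condition variable.
    cv.wait(lock, [&] { return reached_lsn_ >= wait_lsn; });
  }

  // Advance the reached LSN and wake only waiters with wait_lsn <= set_lsn.
  void set(lsn_t set_lsn) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (set_lsn <= reached_lsn_) return;
    reached_lsn_ = set_lsn;
    auto end = waiters_.upper_bound(set_lsn);
    for (auto it = waiters_.begin(); it != end; ++it)
      it->second->notify_one();
    waiters_.erase(waiters_.begin(), end);
  }

  // Number of threads currently blocked in wait().
  std::size_t pending() {
    std::lock_guard<std::mutex> lock(mutex_);
    return waiters_.size();
  }
};

// Usage: three threads wait for LSN 10, 20 and 30; set(15) releases only
// the first one. Returns how many waiters are still blocked at that point.
std::size_t demo_pending_after_partial_set() {
  lsn_wait_queue q;
  std::vector<std::thread> threads;
  for (lsn_t lsn : {lsn_t{10}, lsn_t{20}, lsn_t{30}})
    threads.emplace_back([&q, lsn] { q.wait(lsn); });
  while (q.pending() != 3) std::this_thread::yield();  // all registered
  q.set(15);
  threads[0].join();
  std::size_t still_waiting = q.pending();
  q.set(30);
  threads[1].join();
  threads[2].join();
  assert(q.pending() == 0);
  return still_waiting;
}
```

      In this sketch the waiters for LSN 20 and 30 are never woken by set(15), so they do not touch the mutex again until their own LSN is reached. A real implementation would probably avoid the ordered map and would still need the occasional extra wakeup for leader election mentioned above.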

      Attachments

        1. pprepare.errlog.gz
          15 kB
          Axel Schwenke
        2. pprepare.stack.dump
          10 kB
          Axel Schwenke

          Activity

            wlad Vladislav Vaintroub added a comment (edited)

            axel, can you describe how to reproduce this? Nobody but you has seen the "odd behaviour". If there is no way to reproduce it, please share the call stacks of all threads.

            wlad Vladislav Vaintroub added a comment

            The repository contains several commits for MDEV-21534.
            4ce687d6d1f53be3b6928739d11febe9e1b936c7 fixes a potential deadlock, so it is necessary to know which commit was tested.
            axel ^

            axel Axel Schwenke added a comment (edited)

            There is one easy way to cause a server hang: use parallel_prepare.lua (distributed with sysbench-mariadb) to load multiple tables in parallel. I just tried with 16 tables and it didn't even reach the "loading rows" stage.
            I pulled stack traces with PMP. I tried to shut down the server and got the same warnings in the error log. This time I waited a bit longer and InnoDB started printing "long semaphore wait" messages. I attach both the stack dump and the error log.

            Files: pprepare.stack.dump and pprepare.errlog.gz

            I remember that when I spotted this, I reconfigured the sysbench script to load tables sequentially, but then got a hang during a benchmark run. I cannot reproduce it right now, but will keep trying.

            As for which commit it was - I cannot say. The build finished on Jan 21st 15:11 UTC and I pulled some minutes before. Wait, looks like I did not pull in this repo after that. It's at commit fcb5d008 now, so probably was at that when I built. And yes, that is before that deadlock fix.

            marko Marko Mäkelä added a comment

            I think that a merge of (or rebase to) the latest 10.5 is needed for the development branch. Some hangs were introduced in 10.5 by MDEV-16678 and MDEV-16264, and I have not seen any hangs since this commit marked with MDEV-21551.

            axel Axel Schwenke added a comment -

            OK, with commit 09fa2894d9e I did not see any anomalies from branch bb-10.5-MDEV-21534 anymore.

            People

              wlad Vladislav Vaintroub
              Votes: 1
              Watchers: 7
