Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33009

Server hangs for a long time with innodb_undo_log_truncate=ON

Details

    Description

      After implementing MDEV-32757, we are seeing a performance anomaly with innodb_undo_log_truncate=ON. The server is not actually hung or deadlocked (it will eventually recover), but buf_pool.mutex is being occupied for an extremely long time (several minutes).

      1. trx_purge_truncate_history() writes the message InnoDB: Truncating and is about to truncate an undo log tablespace.
      2. trx_purge_truncate_history() is busy-looping in a scan of buf_pool.flush_list because one of the pages belonging to the undo tablespace is write-fixed.
      3. During the time trx_purge_truncate_history() releases and re-acquires buf_pool.flush_list_mutex, buf_flush_page_cleaner (which is holding buf_pool.mutex in buf_do_flush_list_batch()) cannot grab it, in this Ubuntu 18.04 version of GNU libc and Linux kernel (4.15.0-112-generic). This could be similar to MDEV-31343 and MDEV-30180, which could only be reproduced in the same particular environment.
      4. Most threads are blocked because the buf_flush_page_cleaner thread is holding buf_pool.mutex.

      There is some indication that buf_flush_list_batch() may be making some progress (writing out some pages), but it would be extremely slow.

      Attachments

        Issue Links

          Activity

            axel Axel Schwenke added a comment -

            Here is a summary plot of the performance/behavior of various 10.5 and 10.6 commits for community server. 4 commits are shown:

            1. red line: last release (broken, corruption!)
            2. green line: HEAD of development
            3. blue line: HEAD of MDEV-33009 branch (less aggressive version)
            4. pink line: HEAD of MDEV-33009 branch (aggressive version)

            Attachment: 24x5_high_threads.pdf

            the tests with data set size 12x5 (12 thd) and data set size 24x5 (24 thd) did not make the undo logs grow and thus caused no truncate operation.

            It seems the pink line gives the best (but not good) result.

            axel Axel Schwenke added a comment - Here is a summary plot of the performance/behavior of various 10.5 and 10.6 commits for community server. 4 commits are shown: red line: last release (broken, corruption!) green line: HEAD of development blue line: HEAD of MDEV-33009 branch (less aggressive version) pink line: HEAD of MDEV-33009 branch (aggressive version) Attachment: 24x5_high_threads.pdf the tests with data set size 12x5 (12 thd) and data set size 24x5 (24 thd) did not make the undo logs grow and thus caused no truncate operation. It seems the pink line gives the best (but not good) result.

            Yes, this bug occurs during a scan of dirty pages that is enabled by setting innodb_undo_log_truncate=ON.

            marko Marko Mäkelä added a comment - Yes, this bug occurs during a scan of dirty pages that is enabled by setting innodb_undo_log_truncate=ON .
            axel Axel Schwenke added a comment - - edited

            commit 9682add5cdf "solves" the problem with performance, but effectively prevents undo log truncates during the benchmark. This can be clearly seen in the history list length here:

            When we stop sysbench for 20 seconds every 6 minutes, the purge thread and with it the undo log trucate can run:

            When we include 9682add5cdf we should add a KB comment that innodb_undo_log_truncate=ON will only have an effect when there are times when the server is not under stress.

            axel Axel Schwenke added a comment - - edited commit 9682add5cdf "solves" the problem with performance, but effectively prevents undo log truncates during the benchmark. This can be clearly seen in the history list length here: When we stop sysbench for 20 seconds every 6 minutes, the purge thread and with it the undo log trucate can run: When we include 9682add5cdf we should add a KB comment that innodb_undo_log_truncate=ON will only have an effect when there are times when the server is not under stress.

            axel, the commit that you mentioned only modifies some code that would be run after the InnoDB: Truncating message has been written to the server error log, to truncate an undo log tablespace. By design, the purge of history during a heavy workload is hard to predict. You might see some undo log truncation events during a workload if you let the workload run for a significantly longer time. In the bottom graph of 24x5_high_threads.pdf we have some results where a similar change (purple line) caused some stalls during the workload. Compared to that change, the revised commit would make trx_purge_truncate_history() acquire and release buf_pool.mutex also when a buffer page latch cannot be acquired without waiting. Some other testing suggested that this could significantly increase the probability of goto rescan and transform the loop from a linear scan into a quadratic scan. When there are millions or more pages in buf_pool.flush_list, this could be significant.

            Based on these results, I think that the setting innodb_undo_log_truncate=ON may only be useful when the concurrent write workload is light or moderate.

            marko Marko Mäkelä added a comment - axel , the commit that you mentioned only modifies some code that would be run after the InnoDB: Truncating message has been written to the server error log, to truncate an undo log tablespace. By design, the purge of history during a heavy workload is hard to predict. You might see some undo log truncation events during a workload if you let the workload run for a significantly longer time. In the bottom graph of 24x5_high_threads.pdf we have some results where a similar change (purple line) caused some stalls during the workload. Compared to that change, the revised commit would make trx_purge_truncate_history() acquire and release buf_pool.mutex also when a buffer page latch cannot be acquired without waiting. Some other testing suggested that this could significantly increase the probability of goto rescan and transform the loop from a linear scan into a quadratic scan. When there are millions or more pages in buf_pool.flush_list , this could be significant. Based on these results, I think that the setting innodb_undo_log_truncate=ON may only be useful when the concurrent write workload is light or moderate.

            I filed MDEV-33112 for an idea that we can employ a lazy approach and avoid the intrusive buffer pool scan altogether.

            marko Marko Mäkelä added a comment - I filed MDEV-33112 for an idea that we can employ a lazy approach and avoid the intrusive buffer pool scan altogether.

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.