MariaDB Server / MDEV-24109

InnoDB hangs with innodb_flush_sync=OFF

Details

    Description

      The sample MTR calls
      ./mysql-test-run.pl  --mysqld=--innodb_random_read_ahead=ON innodb.create-index
      ./mysql-test-run.pl  --mysqld=--innodb_flush_sync=0 innodb.create-index
      end up in
      Logging: ./mysql-test-run.pl  --mysqld=--innodb_flush_sync=0 innodb.create-index
      vardir: /home/mleich/Server_bin/10.5_asan_Og/mysql-test/var
      Checking leftover processes...
      Removing old var directory...
       - WARNING: Using the 'mysql-test/var' symlink
      Creating var directory '/home/mleich/Server_bin/10.5_asan_Og/mysql-test/var'...
      Checking supported features...
      MariaDB Version 10.5.7-MariaDB-debug
       - SSL connections supported
       - binaries are debug compiled
       - binaries built with wsrep patch
      Collecting tests...
      Installing system database...
       
      and then nothing further happens: the process never finishes.
      Observed on origin/10.5 440d4b282dd4992d64abdd6289859598db7e5f75 2020-11-02
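Because the hang occurs during bootstrap, an automated run simply never returns. A minimal sketch of a wrapper that turns such a hang into a reportable result is shown below. The helper name `run_with_timeout` and the timeout value are illustrative assumptions, not part of MTR.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Return 'finished' if cmd exits within timeout_s seconds, else 'hung'.

    Hypothetical helper for illustration only. subprocess.run() kills the
    direct child when the timeout expires; processes spawned by that child
    (for example a bootstrap server started by the test driver) may need
    to be cleaned up separately.
    """
    try:
        subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return "hung"
    return "finished"


# Example invocation (assumes the current directory is mysql-test and that
# 600 seconds is a generous upper bound for this test on the machine used):
# status = run_with_timeout(
#     ["./mysql-test-run.pl", "--mysqld=--innodb_flush_sync=0",
#      "innodb.create-index"],
#     600,
# )
```

The wrapper only reports whether the command returned in time; it does not distinguish a bootstrap hang from a test that is merely slow.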
       
      origin/10.2 97f3207cf3d1a119fe3d6a56b204e5bf30cec109 2020-11-03 (10.2.35)
      origin/10.3 2391582ec3f87378581fc5a55266b3d7c6c823d6 2020-11-03 (10.3.26)
      origin/10.4 5739c7702d83e83ecff5cdd84e0fab899101f9f5 2020-11-03 (10.4.17)
      have no problem with --innodb_flush_sync=0 or --innodb_random_read_ahead=ON
       
      IMHO the server process should either
      a) not freeze when it meets these two settings during bootstrap (my preference),
         that is, ignore them or fix the underlying problem,
      or
      b) print an error message stating that these settings are not supported during
         bootstrap, and abort.
      If b) is picked and both options are important, then MTR could be adjusted
      accordingly.
      

          Activity

            mleich, the stated 10.5 revision does not include the fix of MDEV-24101. Without that fix, I confirm that

            ./mysql-test-run.pl  --mysqld=--innodb_random_read_ahead=ON innodb.create-index
            

            will hang on bootstrap. I changed this report to cover innodb_flush_sync=OFF only, because innodb_random_read_ahead=ON was already covered by MDEV-24101.

            I confirm the hang with innodb_flush_sync=0 with the latest 10.5. During the early development of MDEV-23855, I did test the performance with innodb_flush_sync=OFF, and it did work back then. This must have been broken later.

            Curiously, for the following invocation, the parameter appears to be ignored for some reason:

            ./mtr --mysqld=--skip-innodb-flush-sync innodb.create-index
            

            This might be the reason why I failed to catch this regression later.

            I think that a proper fix is to let the page cleaner thread handle the checkpoint flushing also in the innodb_flush_sync=OFF case, and limit the write rate to innodb_io_capacity_max pages per second.

            With the default innodb_flush_sync=ON setting, we would attempt to write out up to innodb_io_capacity_max pages that were modified before the target checkpoint LSN, and then perform a checkpoint, and keep looping until the target has been met. With this bug fixed, innodb_flush_sync=OFF would do the same, except that we may pause between the batches so that the rate of innodb_io_capacity_max pages per second will not be exceeded.

            The comment above was added by marko (Marko Mäkelä).
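The batch-and-checkpoint loop proposed in the comment can be sketched as a toy model. This is not InnoDB code: `BufferPoolModel`, `flush_to_target`, and the way the checkpoint is derived from the oldest remaining dirty page are simplifying assumptions made for illustration. Only the overall shape (write up to innodb_io_capacity_max pages, checkpoint, repeat, and pause between batches when innodb_flush_sync=OFF) follows the description.

```python
import time
from dataclasses import dataclass
from typing import List


@dataclass
class BufferPoolModel:
    """Toy model: a dirty page is represented only by its oldest-modification LSN."""
    dirty_lsns: List[int]
    current_lsn: int
    checkpoint_lsn: int = 0

    def flush_batch(self, target_lsn: int, max_pages: int) -> int:
        """Write out up to max_pages pages modified before target_lsn."""
        self.dirty_lsns.sort()
        written = 0
        while (written < max_pages and self.dirty_lsns
               and self.dirty_lsns[0] < target_lsn):
            self.dirty_lsns.pop(0)
            written += 1
        return written

    def checkpoint(self) -> None:
        """Advance the checkpoint to the oldest remaining dirty page,
        or to current_lsn when nothing is dirty (a simplification)."""
        oldest = min(self.dirty_lsns) if self.dirty_lsns else self.current_lsn
        self.checkpoint_lsn = max(self.checkpoint_lsn, oldest)


def flush_to_target(pool, target_lsn, io_capacity_max, flush_sync,
                    sleep=time.sleep, clock=time.monotonic) -> int:
    """Loop over flush batches and checkpoints until target_lsn is reached.

    With flush_sync=False, pause between batches so that no more than
    io_capacity_max pages are written per second. Returns the batch count.
    """
    if io_capacity_max <= 0 or target_lsn > pool.current_lsn:
        raise ValueError("need io_capacity_max > 0 and target_lsn <= current_lsn")
    batches = 0
    while pool.checkpoint_lsn < target_lsn:
        started = clock()
        written = pool.flush_batch(target_lsn, io_capacity_max)
        pool.checkpoint()
        batches += 1
        if not flush_sync and pool.checkpoint_lsn < target_lsn:
            # At the maximum rate, `written` pages may take this many seconds.
            budget = written / io_capacity_max
            pause = budget - (clock() - started)
            if pause > 0:
                sleep(pause)
    return batches
```

In this model the loop always makes progress, because each iteration either writes at least one page older than the target or advances the checkpoint to the target. The `sleep` and `clock` parameters are injectable only so that the pacing can be exercised without real delays.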

            People

              marko Marko Mäkelä
              mleich Matthias Leich