[MDEV-24109] InnoDB hangs with innodb_flush_sync=OFF Created: 2020-11-03  Updated: 2020-12-04  Resolved: 2020-11-04

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.5.7
Fix Version/s: 10.5.9

Type: Bug Priority: Major
Reporter: Matthias Leich Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: bootstrap, not-10.2, not-10.3, not-10.4

Issue Links:
Problem/Incident
is caused by MDEV-23855 InnoDB log checkpointing causes regre... Closed
Relates
relates to MDEV-24348 InnoDB shutdown hang with innodb_flus... Closed

 Description   

The sample MTR calls
./mysql-test-run.pl  --mysqld=--innodb_random_read_ahead=ON innodb.create-index
./mysql-test-run.pl  --mysqld=--innodb_flush_sync=0 innodb.create-index
end up in
Logging: ./mysql-test-run.pl  --mysqld=--innodb_flush_sync=0 innodb.create-index
vardir: /home/mleich/Server_bin/10.5_asan_Og/mysql-test/var
Checking leftover processes...
Removing old var directory...
 - WARNING: Using the 'mysql-test/var' symlink
Creating var directory '/home/mleich/Server_bin/10.5_asan_Og/mysql-test/var'...
Checking supported features...
MariaDB Version 10.5.7-MariaDB-debug
 - SSL connections supported
 - binaries are debug compiled
 - binaries built with wsrep patch
Collecting tests...
Installing system database...
 
and then nothing further happens; the process does not finish.
origin/10.5 440d4b282dd4992d64abdd6289859598db7e5f75 2020-11-02
 
origin/10.2 97f3207cf3d1a119fe3d6a56b204e5bf30cec109 2020-11-03   (10.2.35)
origin/10.3 2391582ec3f87378581fc5a55266b3d7c6c823d6 2020-11-03 (10.3.26)
origin/10.4 5739c7702d83e83ecff5cdd84e0fab899101f9f5 2020-11-03 (10.4.17)
have no problem with --innodb_flush_sync=0 or innodb_random_read_ahead=ON
 
IMHO the server process should either
a) not freeze when it encounters these two settings during bootstrap (my preference)
  ... ignore them or fix whatever ...
or
b) print an error message telling that these settings are not supported during
    bootstrap and abort.
In case b) is picked and both options are important, then some adjustment of MTR
could be made.



 Comments   
Comment by Marko Mäkelä [ 2020-11-04 ]

mleich, the stated 10.5 revision does not include the fix of MDEV-24101. Without that fix, I confirm that

./mysql-test-run.pl  --mysqld=--innodb_random_read_ahead=ON innodb.create-index

will hang on bootstrap. I changed this report to cover innodb_flush_sync=OFF only, because innodb_random_read_ahead=ON was already covered by MDEV-24101.

I confirm the hang with innodb_flush_sync=0 with the latest 10.5. During the early development of MDEV-23855, I did test the performance with innodb_flush_sync=OFF, and it did work back then. This must have been broken later.

Curiously, for the following invocation, the parameter appears to be ignored for some reason:

./mtr --mysqld=--skip-innodb-flush-sync innodb.create-index

This might be the reason why I failed to catch this regression later.

I think that a proper fix is to let the page cleaner thread handle the checkpoint flushing also in the innodb_flush_sync=OFF case, and limit the write rate to innodb_io_capacity_max pages per second.

With the default innodb_flush_sync=ON setting, we would attempt to write out up to innodb_io_capacity_max pages that were modified before the target checkpoint LSN, and then perform a checkpoint, and keep looping until the target has been met. With this bug fixed, innodb_flush_sync=OFF would do the same, except that we may pause between the batches so that the rate of innodb_io_capacity_max pages per second will not be exceeded.
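
The batching described above can be illustrated with a toy model. This is a minimal sketch, not InnoDB code: the function name flush_to_target, the list-of-LSNs representation of dirty pages, and the one-second pacing window are all assumptions made for illustration. Only the behaviour it imitates (batches of at most innodb_io_capacity_max pages, a checkpoint after each batch, and a pause between batches when innodb_flush_sync=OFF) comes from the comment above.

```python
import time


def flush_to_target(dirty_lsns, target_lsn, io_capacity_max, flush_sync,
                    clock=time.monotonic, sleep=time.sleep):
    """Toy model of checkpoint flushing (hypothetical, not the InnoDB API).

    dirty_lsns      -- oldest-modification LSN of each dirty page
    target_lsn      -- pages modified before this LSN must be written out
    io_capacity_max -- stand-in for innodb_io_capacity_max (pages per batch)
    flush_sync      -- stand-in for innodb_flush_sync (True = no pacing)
    Returns the number of batches that were written.
    """
    if io_capacity_max <= 0:
        raise ValueError("io_capacity_max must be positive")

    pending = sorted(dirty_lsns)  # oldest modifications first
    batches = 0

    # Keep looping until no page modified before the target remains dirty.
    while pending and pending[0] < target_lsn:
        started = clock()

        # Write out at most io_capacity_max pages modified before the target.
        eligible = [lsn for lsn in pending if lsn < target_lsn]
        batch = eligible[:io_capacity_max]
        pending = pending[len(batch):]
        batches += 1

        # A checkpoint would be performed here; the toy model has no log.

        # With flush_sync off, pause so that no more than io_capacity_max
        # pages are written per second (assumed one-second window).
        more_work = bool(pending) and pending[0] < target_lsn
        if not flush_sync and more_work:
            elapsed = clock() - started
            if elapsed < 1.0:
                sleep(1.0 - elapsed)

    return batches
```

Passing a fake clock and sleep makes the pacing observable without waiting: with five pages older than the target and io_capacity_max=2, both modes write three batches, but only the flush_sync=False call requests pauses between them.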

Generated at Thu Feb 08 09:27:30 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.