[MDEV-30054] debug-no-sync doesnt fully disable sync calls Created: 2022-11-21 Updated: 2023-01-30 Resolved: 2023-01-30 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | None |
| Fix Version/s: | 11.0.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Andrii | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | beginner-friendly | ||
| Environment: |
Should be reproducible on any Linux system where sync really flushes changes to disk (i.e. is not faked). |
||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Description |
|
According to the documentation, debug-no-sync "Disables system sync calls". However, it does not disable the calls in all places, which affects benchmarks and test timing where disk sync must be excluded from scope. There are two ways to prove it.
2. Capturing stack traces e.g. during mysql_install_db shows hanging calls to fdatasync(). terminal1 (will show stack traces):
terminal2 (run server, e.g. mysql_install_db):
See the attached logs for details of the stack traces. |
| Comments |
| Comment by Marko Mäkelä [ 2022-11-21 ] |
|
It looks like both fdatasync1.log If you want to reduce the amount of fsync() or fdatasync() operations inside InnoDB, the crash-safe way to do that is to change innodb_flush_log_at_trx_commit to the value 0 or 2. The default value (1) will ensure that each user transaction commit is durable. Even if you are fine with losing a few last transactions, crash recovery would be totally broken if there was no fdatasync() executed as part of InnoDB log checkpoints. There is no other way to force a certain ordering of writes (write barriers). It would be nice to have a system call interface for that. Starting with Last, if you do not care about data integrity at all (for example, when bulk loading data), you should be able to disable all fsync() or fdatasync() calls by using libeatmydata.so. |
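The write-ordering point in the comment above can be illustrated with a minimal Python sketch. This is not InnoDB code: the function name `write_with_barrier` and the two-file layout are assumptions made purely for illustration. It shows a sync call used as a barrier, so that the payload reaches stable storage before a marker that declares it valid is written.

```python
import os
import tempfile


def write_with_barrier(data_path: str, marker_path: str, payload: bytes) -> None:
    """Write payload, then a marker that is only meaningful once payload is durable.

    Hypothetical illustration of a write barrier; not InnoDB's checkpoint logic.
    """
    data_fd = os.open(data_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(data_fd, payload)
        # Barrier: without this sync the OS is free to persist the marker
        # before the payload, so a crash could leave a marker that refers
        # to data which never reached the disk.
        os.fsync(data_fd)
    finally:
        os.close(data_fd)

    marker_fd = os.open(marker_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # The marker records how many payload bytes are known to be durable.
        os.write(marker_fd, str(len(payload)).encode("ascii"))
        os.fsync(marker_fd)
    finally:
        os.close(marker_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_with_barrier(
            os.path.join(tmp, "data.bin"),
            os.path.join(tmp, "marker"),
            b"example page contents",
        )
```

If the first `os.fsync()` were skipped, both writes would still succeed in normal operation; the difference would only show up after an operating system crash, which is why skipping it is unsafe rather than merely slower.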
| Comment by Andrii [ 2022-11-21 ] |
|
I don't think that debug symbols are relevant to this issue: the point wasn't to show the exact places, but to demonstrate the problem. I don't need a workaround here; I am just pointing out that the documentation doesn't match the behavior of the server. And since the server can participate in complex scenarios, I think it is not fair to ask users to use external tools or play with multiple options when the documentation claims that the behavior can be achieved with a single parameter. |
| Comment by Marko Mäkelä [ 2022-11-28 ] |
|
As far as I can tell, the function my_sync() will call fdatasync() or fsync() or similar functions. On Microsoft Windows, it never tries to call the fdatasync() equivalent NtFlushBuffersFileEx(), but always the more expensive FlushFileBuffers(). It looks like some or all of the InnoDB os_file_flush_func() should be merged with my_sync(). |
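The idea of routing every flush through one function that honors a no-sync switch can be sketched in Python as follows. This is a hedged illustration only, not the server's implementation: `sync_file`, `write_file` and the `no_sync` parameter are hypothetical names standing in for a single my_sync()-style entry point and the debug-no-sync option.

```python
import os
import tempfile


def sync_file(fd: int, no_sync: bool = False) -> bool:
    """Flush fd to stable storage unless syncing is disabled.

    Returns True if a sync system call was actually issued.
    Hypothetical single entry point; every caller goes through it,
    so one switch is enough to disable all sync calls.
    """
    if no_sync:
        return False
    # Prefer fdatasync() where the platform provides it, else fall back to fsync().
    flush = getattr(os, "fdatasync", os.fsync)
    flush(fd)
    return True


def write_file(path: str, data: bytes, no_sync: bool = False) -> bool:
    """Write data to path and sync it through the shared entry point."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, data)
        return sync_file(fd, no_sync)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.bin")
        synced = write_file(path, b"abc", no_sync=True)
        print("sync issued:", synced)
```

The design point is that a code path calling `os.fsync()` directly, bypassing `sync_file()`, would silently ignore the switch, which is the kind of mismatch this report describes.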
| Comment by Andrii [ 2022-11-28 ] |
|
> It looks like some or all of the InnoDB os_file_flush_func() should be merged with my_sync(). Yes, that was my understanding as well, but it should also be done for the other storage engines (in particular, at least the sync calls from Aria influence the timing of mysql_install_db). |
| Comment by Marko Mäkelä [ 2023-01-30 ] |
|
anikitin1, does MariaDB 11.0 (after |
| Comment by Andrii [ 2023-01-30 ] |
|
I've tried the 11.0.0 tarball, and the problem is indeed fixed for the described steps. Thank you! |
| Comment by Marko Mäkelä [ 2023-01-30 ] |
|
anikitin1, thank you. This change was part of the 11.0.0 preview, and it was also applied to the 11.0.1 release separately. |
| Comment by Andrii [ 2023-01-30 ] |
|
On second thought, I am not sure I like the idea of obsoleting O_DIRECT_NO_FSYNC (or did I get it wrong?). |
| Comment by Marko Mäkelä [ 2023-01-30 ] |
|
anikitin1, you are correct about the degree of danger. I believe that when using the ext4 file system on Linux, innodb_flush_method=O_DIRECT_NO_FSYNC is almost equivalent to innodb_flush_method=O_DIRECT. In our performance tests, we did not notice a significant difference between them. Let me quote part of my comment from
The risky scenario (assuming the Linux ext4 file system) would be that an InnoDB data file was extended by fallocate(), a log checkpoint was executed, and the operating system crashed and was restarted. In this case, we could fail to recover some newly extended pages in the file. I do not think that the almost immeasurable performance gain of using innodb_flush_method=O_DIRECT_NO_FSYNC instead of innodb_flush_method=O_DIRECT is worth the trouble. Therefore, I do not think that losing innodb_flush_method=O_DIRECT_NO_FSYNC in |