I conducted some more tests, also comparing innodb_flush_method=O_DSYNC to this revised innodb_flush_method=O_DIRECT in a Sysbench oltp_update_non_index workload that almost completely avoids log checkpoints and concentrates on the transaction commit latency with innodb_flush_log_at_trx_commit=1.
On the 3 devices that I tested with the Linux 5.16.14 kernel, the ext4 file system, and io_uring, O_DSYNC was slightly faster on an NVMe drive as well as a SATA SSD (both with a 512-byte physical block size), and slightly slower on a SATA 3.0 HDD (with a 4096-byte physical block size). None of the devices supports FUA mode, which according to https://lwn.net/Articles/400541/ implies that each O_DSYNC write may have to execute both a write and a cache flush inside the kernel. Apparently, on the solid-state drives that I tested, this extra flush ends up costing less time than what is saved by skipping the fdatasync() system calls.
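To make the comparison concrete, here is a minimal C sketch (my own illustration, not MariaDB code) of the two commit paths: a write to a file opened with O_DIRECT|O_DSYNC, which must be durable when the write returns, versus a plain O_DIRECT write followed by an explicit fdatasync(). The file names and the 4096-byte block size are made-up example values.

```c
/* Minimal sketch, not MariaDB source: contrasts the two redo log commit
 * paths compared above.  The file names and the 4096-byte block size are
 * illustrative assumptions; with O_DIRECT, the buffer address and the
 * write size must be aligned to the device block size. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

enum { BLOCK = 4096 };

int main(void)
{
    void *buf;
    if (posix_memalign(&buf, BLOCK, BLOCK) != 0)
        return 1;
    memset(buf, 0, BLOCK);

    /* Path 1: O_DIRECT|O_DSYNC -- the write is durable when it returns.
     * Without FUA support, the kernel may have to issue both the write
     * and a device cache flush for every such write. */
    int fd = open("ib_logfile0.dsync",
                  O_WRONLY | O_CREAT | O_DIRECT | O_DSYNC, 0600);
    if (fd != -1) {
        if (pwrite(fd, buf, BLOCK, 0) != BLOCK)
            perror("pwrite");
        /* no separate fdatasync() needed on this path */
        close(fd);
    }

    /* Path 2: plain O_DIRECT -- an explicit fdatasync() per commit is
     * needed to flush the device write cache. */
    fd = open("ib_logfile0.direct", O_WRONLY | O_CREAT | O_DIRECT, 0600);
    if (fd != -1) {
        if (pwrite(fd, buf, BLOCK, 0) != BLOCK)
            perror("pwrite");
        fdatasync(fd);
        close(fd);
    }

    free(buf);
    return 0;
}
```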
On the SATA SSD, less than a minute into the benchmark, the write speed dropped to about a third. I assume that the drive had run out of free flash erase blocks (the volume was rather full) and had to start throttling writes. This occurred at about the same time both with and without O_DSYNC.
Apparently, until some time before 2010, Linux could wrongly return from an O_DSYNC write as soon as the data had been written to the drive's write cache, that is, the flush may have been skipped: https://linux-scsi.vger.kernel.narkive.com/yNnBRBPn/o-direct-and-barriers
Before MDEV-14425, writes to the InnoDB redo log ib_logfile0 were always buffered on Linux. In MDEV-14425, the setting innodb_flush_method=O_DSYNC enabled O_DIRECT on InnoDB log and data files. With this change, we will also enable O_DIRECT on the InnoDB log for the following settings:
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_flush_method=O_DIRECT (the default setting since MDEV-24854).
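For illustration, either of the affected settings would be declared in the server configuration roughly as follows (the section name and file layout are only an example, not taken from this change):

```ini
[mariadb]
# With this change, either setting opens the InnoDB log ib_logfile0 with O_DIRECT.
innodb_flush_method = O_DIRECT
#innodb_flush_method = O_DIRECT_NO_FSYNC
```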