Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 10.8.2
- Linux
Description
In MDEV-14425, we experimented with enabling O_DIRECT when writing to the InnoDB redo log. Previously, this was only done on Microsoft Windows when the physical block size was detected to be 512 bytes. On Linux, we ended up allowing O_DIRECT on the redo log only if innodb_flush_method=O_DSYNC is specified. The reason for this was that the throughput was slightly better on two systems when O_DIRECT was disabled.
On systems other than Linux or Microsoft Windows, we do not enable O_DIRECT on the redo log, because we are not aware of interfaces that would allow the physical block size to be determined. We do not want to write 4096-byte redo log blocks "just in case"; devices with a 512-byte physical block size are still common.
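To illustrate why writing 4096-byte blocks "just in case" is unattractive, here is a minimal sketch of the padding arithmetic. It assumes that an O_DIRECT write has to be rounded up to a whole number of blocks. The helper `padded_write_size` and the 300-byte payload are hypothetical and only serve the illustration; this is not the InnoDB implementation.

```python
def padded_write_size(payload_bytes: int, block_size: int) -> int:
    """Round a write length up to a whole number of blocks (hypothetical helper)."""
    if payload_bytes < 0 or block_size <= 0:
        raise ValueError("payload must be >= 0 and block size must be > 0")
    # Ceiling division, then scale back up to bytes.
    return -(-payload_bytes // block_size) * block_size


if __name__ == "__main__":
    payload = 300  # hypothetical amount of redo log to make durable, in bytes
    for block_size in (512, 4096):
        size = padded_write_size(payload, block_size)
        print(f"block size {block_size}: write {size} bytes "
              f"({size / payload:.1f}x the payload)")
```

Under these assumptions, a small log write on a device with 512-byte blocks is padded to 512 bytes, while assuming 4096-byte blocks would pad the same write to 4096 bytes.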
Throughput was greatly improved by MDEV-27774, which allows multiple threads to write to log_sys.buf concurrently. I tested the performance again, and now my simple benchmark with proper durability (innodb_flush_log_at_trx_commit=1) on NVMe using ext4fs and io_uring on Linux kernel 5.16.12 shows a clear improvement when enabling O_DIRECT for the redo log by default (innodb_flush_method=O_DIRECT).
Here are the average throughput and 95th-percentile latency for a 180-second benchmark:
| revision | throughput/tps | latency/ms |
|---|---|---|
| 10.8 86820837cb34dea54b3221a278a96b667743c11f | 58431.08 | 0.80 |
| patched | 63742.95 | 0.67 |
For a 30-second benchmark, the impact on latency was slightly more prominent:
| revision | throughput/tps | latency/ms |
|---|---|---|
| 10.8 86820837cb34dea54b3221a278a96b667743c11f | 57255.49 | 0.81 |
| patched | 63331.22 | 0.65 |
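For reference, the relative changes implied by the two tables can be worked out with a short script. The figures are copied from the tables above; the helper `relative_change_percent` and the dictionary layout are only illustrative.

```python
def relative_change_percent(baseline: float, patched: float) -> float:
    """Relative change from baseline to patched, in percent."""
    return (patched - baseline) / baseline * 100.0


# (baseline, patched) pairs taken from the benchmark tables.
RESULTS = {
    "180-second": {"throughput_tps": (58431.08, 63742.95),
                   "latency_ms": (0.80, 0.67)},
    "30-second": {"throughput_tps": (57255.49, 63331.22),
                  "latency_ms": (0.81, 0.65)},
}

if __name__ == "__main__":
    for run, metrics in RESULTS.items():
        for name, (baseline, patched) in metrics.items():
            change = relative_change_percent(baseline, patched)
            print(f"{run} {name}: {change:+.2f}%")
```

This works out to roughly 9% higher throughput and 16% lower latency in the 180-second run, and roughly 11% higher throughput and 20% lower latency in the 30-second run.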
Issue Links
- causes
  - MDEV-28766: MDEV-28111 breaks innodb_flush_log_at_trx_commit=2 (Closed)
- relates to
  - MDEV-14425: Change the InnoDB redo log format to reduce write amplification (Closed)
  - MDEV-27774: Reduce scalability bottlenecks in mtr_t::commit() (Closed)