I extended the tests that I conducted yesterday on 10.10, running them for
each value of innodb_flush_log_at_trx_commit on all three storage devices that I have easy access to.
Like yesterday, I tested both the impact of disabling O_DIRECT on the redo log (reverting MDEV-28111) and of enabling the asynchronous redo log (applying MDEV-26603).
In the tables below, each variant is identified by whether MDEV-26603 and MDEV-28111 are applied (+) or absent/reverted (-), followed by the innodb_flush_log_at_trx_commit value; the columns show the throughput at 20 to 640 concurrent connections.
INTEL SSDPED1D960GAY (Intel Optane 960 NVMe PCIe, 1 TiB):
variant         |        20 |        40 |        80 |       160 |       320 |       640
-26603 -28111 0 | 174877.25 | 218600.14 | 243620.69 | 235452.35 | 221782.77 | 230218.65
-26603 +28111 0 | 177523.85 | 220885.63 | 250248.56 | 222407.88 | 228020.70 | 231261.60
-26603 -28111 1 |  40154.01 |  83822.27 | 146941.02 | 169106.35 | 159355.62 | 148137.08
-26603 +28111 1 |  44393.10 |  87522.12 | 154563.69 | 175664.35 | 149363.81 | 140018.75
-26603 -28111 2 | 116627.95 | 110871.80 |  93141.31 |  90384.32 |  95568.63 | 100567.27
-26603 +28111 2 |  77489.69 | 126257.11 | 170294.80 | 153718.74 | 146924.24 | 139760.85
+26603 -28111 0 | 175585.31 | 219198.76 | 246444.31 | 234371.68 | 223168.61 | 230352.08
+26603 +28111 0 | 175449.09 | 218696.74 | 245015.51 | 232257.35 | 223078.96 | 231192.25
+26603 -28111 1 |  38898.41 |  77419.98 | 143587.57 | 171343.71 | 160899.60 | 147169.91
+26603 +28111 1 |  39056.53 |  80030.78 | 152584.63 | 178147.98 | 149525.98 | 141598.37
+26603 -28111 2 | 116206.19 | 122350.90 | 102117.73 |  94406.60 | 106100.66 | 112945.75
+26603 +28111 2 |  61655.26 | 119870.79 | 170313.32 | 155449.44 | 149877.05 | 141699.37
Samsung SSD 850 EVO 500GB, SATA 3.1, 6.0 Gb/s (SSD):
variant         |        20 |        40 |        80 |       160 |       320 |       640
-26603 -28111 0 | 173837.73 | 196463.82 | 181172.24 | 248655.80 | 121209.19 | 185363.02
-26603 +28111 0 | 173456.74 | 227876.18 | 140814.39 | 247316.21 | 147288.87 | 195739.68
-26603 -28111 1 |   1637.31 |   3293.68 |   6268.56 |  28426.46 |  29659.73 |  37819.90
-26603 +28111 1 |   1651.81 |   3117.78 |   5865.57 |  11192.76 |  21408.23 |  42071.74
-26603 -28111 2 | 116561.43 | 111697.04 |  92891.91 |  77973.84 |  83535.76 | 102135.74
-26603 +28111 2 |  50282.03 | 134429.60 | 115351.79 | 120100.46 | 151187.50 | 145336.60
+26603 -28111 0 | 172827.09 | 195303.67 | 175215.87 | 247712.12 | 112698.70 | 181610.86
+26603 +28111 0 | 172311.93 | 196527.93 | 187143.44 | 249095.29 | 117095.25 | 185992.18
+26603 -28111 1 |   4859.92 |   2997.25 |   5666.87 |  10431.58 |  19529.83 |  37454.78
+26603 +28111 1 |   1581.06 |   3132.87 |   5732.18 |  10837.41 |  20893.13 |  41192.27
+26603 -28111 2 | 113534.45 | 120130.84 | 101364.19 |  77589.95 |  90102.00 | 113379.56
+26603 +28111 2 |  45810.35 | 128985.45 | 139551.91 | 103551.01 | 153063.98 | 144727.32
Western Digital Blue WDC WD20EZRZ-00Z5HB0, SATA 3.0, 6.0 Gb/s (2TiB HDD):
variant         |        20 |        40 |        80 |       160 |       320 |       640
-26603 -28111 0 | 151030.35 | 156442.72 |  59875.14 |  40854.54 |  32574.75 |  66260.11
-26603 +28111 0 | 152691.94 | 198770.81 |  45308.39 |  36743.35 |  38068.41 |  38420.25
-26603 -28111 1 |    442.81 |   1029.68 |   1922.92 |   3391.98 |   5221.32 |   8323.79
-26603 +28111 1 |    483.88 |   1027.51 |   1845.69 |   3305.48 |   5473.87 |   8721.01
-26603 -28111 2 | 107780.99 | 104655.24 |  86786.08 |  86942.45 |  14004.30 |  37662.52
-26603 +28111 2 |  38057.16 |  81649.51 | 124179.58 |  66601.85 |   4696.54 |   8492.77
+26603 -28111 0 | 147208.47 | 184330.29 |  48711.60 |  46878.98 |  44284.67 |  45194.95
+26603 +28111 0 | 153007.97 | 197770.54 |  36163.94 |  35519.65 |  42716.21 |  30158.95
+26603 -28111 1 |    463.67 |    935.19 |   1665.83 |   3219.84 |   5437.44 |   8793.70
+26603 +28111 1 |    457.01 |    935.99 |   1551.90 |   1922.09 |   3504.11 |   8506.51
+26603 -28111 2 | 107788.64 | 115177.97 |  91923.53 |  49215.16 |  42875.74 |  38165.49
+26603 +28111 2 |  33210.59 |  80173.39 | 127686.29 |  62756.50 |   4504.81 |   8561.01
Occasionally, log checkpoints (page writes) occur and affect the throughput. This is particularly visible in the HDD benchmark, where the throughput drops to 0 for 20 seconds of the 30-second test at 80 concurrent connections and never recovers to the earlier level during the following 90 seconds of testing (with 160, 320, and 640 concurrent connections). I suspect that this happened because I was running the workloads back-to-back, without forcing a log checkpoint between runs or waiting for the history of committed transactions to be purged.
We can observe that on the slower storage (or on the NVMe device at low numbers of concurrent connections), enabling O_DIRECT on the redo log does seem to help in this simple benchmark, where all data fits in the buffer pool and most writes go to the redo log. In some cases we can observe this even with the safe setting innodb_flush_log_at_trx_commit=1.
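For reference, the difference that these variants exercise boils down to whether the redo log file descriptor is opened with O_DIRECT. The following is only a minimal stand-alone sketch, not MariaDB code (the file name and the 4096-byte block size are assumptions for illustration); it shows both the cache bypass and the alignment constraints that O_DIRECT imposes on the buffer address, file offset, and transfer size:

/* Minimal stand-alone sketch of buffered vs. direct I/O on a log-like
   file.  Not MariaDB code; the file name and the 4096-byte block size
   are placeholder assumptions. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
  /* With O_DIRECT the kernel bypasses the page cache; without the flag,
     every write would also populate the file system cache. */
  int fd = open("ib_logfile0.test", O_WRONLY | O_CREAT | O_DIRECT, 0640);
  if (fd < 0) { perror("open"); return 1; }

  /* O_DIRECT requires the buffer address, the file offset and the
     transfer size to be aligned, typically to the logical block size. */
  const size_t block = 4096;
  void *buf;
  if (posix_memalign(&buf, block, block)) { close(fd); return 1; }
  memset(buf, 0, block);

  if (pwrite(fd, buf, block, 0) != (ssize_t) block)
    perror("pwrite");

  free(buf);
  return close(fd);
}

These alignment constraints are also why, with O_DIRECT enabled, writes that are not a multiple of the block size have to be padded or deferred by the log writer.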
However, we should also keep in mind that not using O_DIRECT will pollute the file system cache with useless data.
Because my results do not show any clear pattern, I think that we must introduce a settable Boolean configuration parameter for enabling or disabling the file system cache on the redo log. I maintain that the caching should be disabled by default. Also, mariadb-backup should benefit from enabling the file system cache while it is running.
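The name and exact semantics of such a parameter remain to be decided. As a sketch only (a hypothetical helper, not a decided design): on Linux, fcntl(F_SETFL) can change the O_DIRECT flag of an already-open file descriptor, so a SET GLOBAL toggle would not even require reopening the log file:

/* Sketch of a run-time toggle for the proposed parameter.  Hypothetical
   helper, not MariaDB code; on Linux, fcntl(F_SETFL) can change the
   O_DIRECT flag of an already-open file descriptor. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int set_log_file_buffering(int fd, int buffered)
{
  int flags = fcntl(fd, F_GETFL);
  if (flags < 0)
    return -1;
  if (buffered)
    flags &= ~O_DIRECT;  /* writes also populate the file system cache */
  else
    flags |= O_DIRECT;   /* writes bypass the file system cache */
  return fcntl(fd, F_SETFL, flags);
}

int main(void)
{
  int fd = open("ib_logfile0.test", O_WRONLY | O_CREAT, 0640);
  if (fd < 0) { perror("open"); return 1; }
  /* A SET GLOBAL toggle could map to a call like this one. */
  if (set_log_file_buffering(fd, 0))
    perror("fcntl");
  return close(fd);
}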
Uploaded: