Details
- Type: Task
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
I'm evaluating MariaDB 10.5.9 performance against 10.5.5, as described in https://www.percona.com/blog/2020/08/14/evaluating-performance-improvements-in-mariadb-10-5-5/ . I also tried the current 10.6 (as of 2ad61c678243dec2a808b1f67b5760a725163052) and noticed a significant performance degradation relative to 10.5.9 on an in-memory workload (see mariadb_all.png).
The data set produced with sysbench-tpcc is 104G and the buffer pool is 116G, so this is a purely in-memory workload (all buffer_LRU_batch* metrics are zero). The testbed machine has a slow NVMe drive (26 KIOPS for random writes and 320 KIOPS for random reads), 128GB RAM and 40 hyperthreads.
The original tests run for 3 hours, but 10.6 is much slower right from the beginning of the test, so I reproduced the problem with 10-minute runs. The sysbench statistics in tpcc_mariadb-10.6_116bp_600.stat and tpcc_mariadb-10.5.9_116bp_600.stat report 3845 TPS for 10.6 and 4965 TPS for 10.5.9.
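To put the two TPS figures in relative terms, here is a minimal sketch of the arithmetic. The function name is only illustrative; the inputs are the two numbers quoted above, which work out to a drop of about 22.6%.

```python
# TPS figures as quoted from the two 600-second .stat files.
tps_10_5_9 = 4965
tps_10_6 = 3845


def relative_drop(baseline, candidate):
    """Return the fraction of baseline throughput lost by the candidate."""
    return (baseline - candidate) / baseline


drop_pct = 100 * relative_drop(tps_10_5_9, tps_10_6)
print(f"10.6 is {drop_pct:.1f}% slower than 10.5.9")
```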
10.6 and 10.5.9 have quite different I/O and CPU profiles, as shown in the .dstat files. Since this is an in-memory workload, I suspected either the doublewrite buffer or checkpointing. I reran the tests with the doublewrite buffer switched off (_nodblwr.stat files): 10.6 is still slower than 10.5.9, but 10.6 is affected much more by switching off the doublewrite buffer than 10.5.9 is. So there is some problem with the doublewrite buffer, but there is definitely something more.
I collected some InnoDB LRU, flushing, checkpointing, and log metrics at 10-second intervals (*.metrics files), but I didn't find any significant differences except in log_lsn_checkpoint_age (see checkpoint.png). The numbers of checkpoints, page flushes, and log writes were more or less the same.
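For anyone comparing the two runs on checkpoint age, a rough sketch of how per-sample values could be summarized follows. It assumes each 10-second sample is available as a dict keyed by INNODB_METRICS counter names (log_lsn_current, log_lsn_last_checkpoint); the layout of the attached *.metrics files may differ, and the demo numbers are made up for illustration, not taken from this report.

```python
from statistics import mean


def checkpoint_age(sample):
    """Checkpoint age: current LSN minus the LSN of the last checkpoint."""
    return sample["log_lsn_current"] - sample["log_lsn_last_checkpoint"]


def summarize(samples):
    """Return min, max and mean checkpoint age over a list of samples."""
    ages = [checkpoint_age(s) for s in samples]
    return {"min": min(ages), "max": max(ages), "mean": mean(ages)}


# Hypothetical samples (assumed layout, invented values).
demo = [
    {"log_lsn_current": 1_000, "log_lsn_last_checkpoint": 400},
    {"log_lsn_current": 2_000, "log_lsn_last_checkpoint": 1_200},
]
print(summarize(demo))
```

Running the same summary over the 10.6 and 10.5.9 samples would give a side-by-side view of the difference visible in checkpoint.png.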
10.6 was built with io_uring, but I checked with gdb that there were no aio_uring::submit_io() calls.
Attachments
Issue Links
- relates to MDEV-25404 read-only performance regression in 10.6 (Closed)