This is similar to the regressions I found with CPU-bound sysbench (MDEV-33966), but I opened a separate issue because this one occurs with the IO-bound Insert Benchmark.
I ran an IO-bound Insert Benchmark (IO-bound because the working set and database are much larger than memory) on a small server (8 cores, 16G RAM) to compare MariaDB LTS releases for 10.2, 10.3, 10.4, 10.5, 10.6, 10.11 and upcoming 11.4 with MySQL 5.6, 5.7 and 8.0.
A result for a CPU-bound benchmark is here; it covers a test that uses a smaller database that can be cached and isn't IO-bound. In the cached case, MariaDB doesn't have large regressions from 10.2 to 11.4. Here, with an IO-bound setup, there are regressions.
The IO-bound tests were run for 1 and 4 clients:
The way I label results depends on the DBMS version and my.cnf:
- ma101107_rel.cz11a_bee - MariaDB 10.11.7 with the cz11a_bee config that uses innodb_flush_method=O_DIRECT_NO_FSYNC and innodb_change_buffering=none
- ma101107_rel.cz11b_bee - MariaDB 10.11.7 with the cz11b_bee config that uses innodb_flush_method=O_DIRECT and innodb_change_buffering=none
- ma110401_rel.cz11b_bee - MariaDB 11.4.1 with the cz11b_bee config that uses innodb_flush_method=O_DIRECT and the InnoDB change buffer has been removed
- my8036_rel.cz11a_bee - MySQL 8.0.36 with the cz11a_bee config that uses innodb_flush_method=O_DIRECT_NO_FSYNC and innodb_change_buffering=all
- my8036_rel.cz11d_bee - MySQL 8.0.36 with the cz11d_bee config that uses innodb_flush_method=O_DIRECT_NO_FSYNC and innodb_change_buffering=none
Note that the cz11d_bee config for MySQL 8.0.36 is similar to the cz11a_bee config for MariaDB.
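The labels above can be decoded mechanically. The following is a minimal sketch that handles only the label shapes that appear in this report (six version digits for MariaDB, four for MySQL 8.0.x); the function name `parse_label` and the digit-splitting rule are my assumptions, inferred from the examples, and other releases may need different handling.

```python
import re

# Assumed layout, inferred only from the labels listed in this report:
#   <ma|my><version digits>_rel.<config name>
LABEL_RE = re.compile(r"^(ma|my)(\d+)_rel\.(\w+)$")


def parse_label(label):
    """Return (dbms, version, config) for a label like ma101107_rel.cz11a_bee."""
    m = LABEL_RE.match(label)
    if m is None:
        raise ValueError(f"unrecognized label: {label}")
    prefix, digits, config = m.groups()
    if prefix == "ma" and len(digits) == 6:
        # MariaDB: two digits per component, e.g. 101107 -> 10.11.7
        dbms = "MariaDB"
        parts = (digits[0:2], digits[2:4], digits[4:6])
    elif prefix == "my" and len(digits) == 4:
        # MySQL 8.0.x as seen here: 8036 -> 8.0.36
        dbms = "MySQL"
        parts = (digits[0], digits[1], digits[2:4])
    else:
        raise ValueError(f"unsupported version digits in label: {label}")
    # int() drops the zero padding, e.g. "04" -> 4
    version = ".".join(str(int(p)) for p in parts)
    return dbms, version, config


print(parse_label("ma101107_rel.cz11a_bee"))
print(parse_label("my8036_rel.cz11d_bee"))
```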
My claim about the regressions is based on the following:
- start with the MariaDB vs MySQL comparison for 1 client and 4 clients: the results for MySQL 8.0.36 (my8036_rel.cz11a_bee and my8036_rel.cz11d_bee) are much better than for MariaDB 10.11.7 (ma101107_rel.cz11a_bee) and 11.4.1 (ma110401_rel.cz11b_bee).
- then look at results for MariaDB LTS releases with 1 client and 4 clients and see some regressions from ma100433 (10.4.33) to ma100524 (10.5.24) and larger regressions from ma100524 to ma100617 (10.6.17)
- then look at the HW metrics for MariaDB LTS releases from the 1 client setup. These are values from vmstat and iostat normalized by query and insert rates to understand HW efficiency. For the write-heavy benchmark steps from 10.4.33 through 10.6.17 I see a ~20% increase in context switches per insert (cspq) and a ~20% decrease in CPU per insert (cpupq) – see for l.i1 and for l.i2. Most of the change is from 10.5.24 to 10.6.17. I assume this is a result of the changes in 10.6 that switched some mutexes and rw-locks from spinning to non-spinning. So there is less CPU burned, but more lock waiters are going to sleep.
- the read-write benchmark steps show a similar pattern as the write rate increases. See the 1 client results for range queries and point queries when the background write rate is 1000/s. Here, though, I see an increase in cspq (more context switches per query == more threads going to sleep) but not a large decrease in cpupq (CPU per query).
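To make the normalization concrete, here is a minimal sketch of how a per-operation metric such as cspq can be derived from a vmstat rate and a benchmark rate, and how the change between two releases is expressed. The helper names and the numbers are hypothetical and chosen only to illustrate the arithmetic; they are not measurements from this benchmark.

```python
def per_operation(metric_per_second, ops_per_second):
    """Normalize a rate (e.g. vmstat context switches/s) by inserts/s or queries/s."""
    if ops_per_second <= 0:
        raise ValueError("ops_per_second must be positive")
    return metric_per_second / ops_per_second


def relative_change(old, new):
    """Fractional change from old to new (0.2 means a 20% increase)."""
    return (new - old) / old


# Hypothetical inputs: same insert rate, more context switches in the newer release.
cspq_old = per_operation(12000, 4000)  # context switches per insert, older release
cspq_new = per_operation(14400, 4000)  # context switches per insert, newer release
print(f"cspq: {cspq_old:.2f} -> {cspq_new:.2f} "
      f"({relative_change(cspq_old, cspq_new):+.0%})")
```

Normalizing by the operation rate matters because a slower release does fewer inserts per second, so raw vmstat counters alone can hide a per-insert increase.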