Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Fixed
- Affects Version/s: 10.6.17, 10.11.7, 11.4.1
- None
Description
This is similar to regressions I found with CPU-bound sysbench (MDEV-33966), but I opened a separate issue because this occurs with IO-bound Insert Benchmark.
I ran an IO-bound Insert Benchmark (IO-bound because the working set and database are much larger than memory) on a small server (8 cores, 16G RAM) to compare MariaDB LTS releases for 10.2, 10.3, 10.4, 10.5, 10.6, 10.11 and upcoming 11.4 with MySQL 5.6, 5.7 and 8.0.
A result for a CPU-bound benchmark is here; it covers a test that uses a smaller database that can be cached and isn't IO-bound. In the cached case, MariaDB doesn't have large regressions from 10.2 to 11.4. Here, with an IO-bound setup, there are regressions.
The IO-bound tests were run for 1 and 4 clients:
- all DBMS - 1 client and 4 clients
- all MariaDB LTS - 1 client and 4 clients
- all MySQL - 1 client and 4 clients
- MariaDB vs MySQL - 1 client and 4 clients
- MariaDB 10.11 - 1 client and 4 clients
- MariaDB 11.4 - 1 client and 4 clients
The labels I use for results depend on the DBMS version and my.cnf:
- ma101107_rel.cz11a_bee - MariaDB 10.11.7 with the cz11a_bee config that uses innodb_flush_method=O_DIRECT_NO_FSYNC and innodb_change_buffering=none
- ma101107_rel.cz11b_bee - MariaDB 10.11.7 with the cz11b_bee config that uses innodb_flush_method=O_DIRECT and innodb_change_buffering=none
- ma110401_rel.cz11b_bee - MariaDB 11.4.1 with the cz11b_bee config that uses innodb_flush_method=O_DIRECT; the InnoDB change buffer has been removed in this release
- my8036_rel.cz11a_bee - MySQL 8.0.36 with the cz11a_bee config that uses innodb_flush_method=O_DIRECT_NO_FSYNC and innodb_change_buffering=all
- my8036_rel.cz11d_bee - MySQL 8.0.36 with the cz11d_bee config that uses innodb_flush_method=O_DIRECT_NO_FSYNC and innodb_change_buffering=none
Note that the cz11d_bee config for MySQL 8.0.36 is similar to the cz11a_bee config for MariaDB.
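The label convention above can be decoded mechanically. The following is a minimal sketch, not part of the benchmark tooling: `parse_label` and `ResultLabel` are hypothetical names, and the digit layout (three two-digit fields for MariaDB, one-digit major and minor for MySQL) is an assumption inferred only from the labels quoted in this report.

```python
import re
from typing import NamedTuple


class ResultLabel(NamedTuple):
    dbms: str     # "MariaDB" or "MySQL"
    version: str  # dotted version, e.g. "10.11.7"
    build: str    # text after the underscore, e.g. "rel"
    config: str   # my.cnf name, e.g. "cz11a_bee"


# Assumed prefix-to-name mapping, inferred from the labels in this report.
DBMS_NAMES = {"ma": "MariaDB", "my": "MySQL"}

LABEL_RE = re.compile(r"^(ma|my)(\d+)_([a-z]+)\.(\w+)$")


def parse_label(label: str) -> ResultLabel:
    """Split a label such as 'ma101107_rel.cz11a_bee' into its parts."""
    m = LABEL_RE.match(label)
    if m is None:
        raise ValueError(f"unrecognized label: {label!r}")
    prefix, digits, build, config = m.groups()
    if prefix == "ma":
        # Assumption: MariaDB uses three two-digit fields (101107 -> 10.11.7).
        if len(digits) != 6:
            raise ValueError(f"expected 6 version digits in {label!r}")
        parts = [digits[0:2], digits[2:4], digits[4:6]]
    else:
        # Assumption: MySQL uses 1-digit major, 1-digit minor, then the patch
        # level (8036 -> 8.0.36).
        if len(digits) < 3:
            raise ValueError(f"expected at least 3 version digits in {label!r}")
        parts = [digits[0], digits[1], digits[2:]]
    version = ".".join(str(int(p)) for p in parts)
    return ResultLabel(DBMS_NAMES[prefix], version, build, config)


if __name__ == "__main__":
    for name in ("ma101107_rel.cz11a_bee", "ma110401_rel.cz11b_bee",
                 "my8036_rel.cz11d_bee"):
        print(parse_label(name))
```

The shorter labels used later in the report, such as ma100433, carry no build or config suffix and would need a relaxed pattern.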
My claim about the regressions is based on the following:
- start with the MariaDB vs MySQL comparison for 1 client and 4 clients, where the results for MySQL 8.0.36 (my8036_rel.cz11a_bee and my8036_rel.cz11d_bee) are much better than for MariaDB 10.11.7 (ma101107_rel.cz11a_bee) and 11.4.1 (ma110401_rel.cz11b_bee).
- then look at results for MariaDB LTS releases with 1 client and 4 clients and see some regressions from ma100433 (10.4.33) to ma100524 (10.5.24) and larger regressions from ma100524 to ma100617 (10.6.17)
- then look at the HW metrics for MariaDB LTS releases from the 1 client setup. These are values from vmstat and iostat normalized by query and insert rates to understand HW efficiency. For the write heavy benchmark steps from 10.4.33 through 10.6.17 I see a ~20% increase in context switches per insert (cspq) and a ~20% decrease in CPU per insert (cpupq); see the metrics for l.i1 and for l.i2. Most of the change is from 10.5.24 to 10.6.17. I assume this is a result of the changes in 10.6 that switched some mutexes and rw-locks from spinning to non-spinning. So there is less CPU burned, but more lock waiters are going to sleep.
- the read-write benchmark steps also show a similar pattern as the write rate increases. See the 1 client results for range queries and point queries when the background write rate is 1000/s. Here, though, I see an increase in cspq (more context switches per query == more threads going to sleep) but not a large decrease in cpupq (CPU per query).
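The normalization behind cspq and cpupq can be sketched as follows. This is an illustration under stated assumptions rather than the formula used to produce the linked reports: it assumes cspq is the vmstat context switch rate divided by the operation rate, and cpupq is CPU utilization (us + sy) divided by the operation rate. The reports may apply a different scale factor. `StepSample` and the helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StepSample:
    """Averages for one benchmark step (hypothetical container)."""
    ops_per_sec: float   # inserts/s or queries/s for the step
    cs_per_sec: float    # context switches/s, the vmstat "cs" column
    cpu_util_pct: float  # vmstat us + sy, in percent


def _check(sample: StepSample) -> None:
    if sample.ops_per_sec <= 0:
        raise ValueError("ops_per_sec must be positive")


def cspq(sample: StepSample) -> float:
    """Context switches per operation (assumed definition)."""
    _check(sample)
    return sample.cs_per_sec / sample.ops_per_sec


def cpupq(sample: StepSample) -> float:
    """CPU utilization per operation (assumed definition, unscaled)."""
    _check(sample)
    return sample.cpu_util_pct / sample.ops_per_sec


def relative_change(old: float, new: float) -> float:
    """Fractional change from old to new, e.g. 0.2 means +20%."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old


if __name__ == "__main__":
    # Made-up numbers for illustration only; they are not benchmark results.
    base = StepSample(ops_per_sec=1000.0, cs_per_sec=5000.0, cpu_util_pct=40.0)
    newer = StepSample(ops_per_sec=1000.0, cs_per_sec=6000.0, cpu_util_pct=32.0)
    print(relative_change(cspq(base), cspq(newer)))
    print(relative_change(cpupq(base), cpupq(newer)))
```

Dividing by the operation rate is what makes releases with different throughput comparable: a release that does fewer inserts per second can show lower absolute CPU use while still spending more per insert.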
Attachments
Issue Links
- blocks
  - MDEV-34759 buf_page_get_low() is unnecessarily acquiring exclusive latch on secondary index pages (Closed)
- relates to
  - MDEV-34431 More fine grained control of spin loops could be useful (Stalled)
  - MDEV-34443 ha_innobase::info_low() does not distinguish HA_STATUS_VARIABLE_EXTRA (Closed)
  - MDEV-34458 wait_for_read in buf_page_get_low hurts performance (Closed)
Activity
Field | Original Value | New Value |
---|---|---|
Description | (earlier revision of the description above; it repeated the phrase "for the write heavy benchmark steps" in the HW metrics item) | (current description, as shown above) |
Priority | Minor [ 4 ] | Critical [ 2 ] |
Assignee | Marko Mäkelä [ marko ] |
Fix Version/s | 10.6 [ 24028 ] | |
Fix Version/s | 10.11 [ 27614 ] | |
Fix Version/s | 11.4 [ 29301 ] |
Status | Open [ 1 ] | In Progress [ 3 ] |
Assignee | Marko Mäkelä [ marko ] | Debarun Banerjee [ JIRAUSER54513 ] |
Status | In Progress [ 3 ] | In Review [ 10002 ] |
Assignee | Debarun Banerjee [ JIRAUSER54513 ] | Marko Mäkelä [ marko ] |
Status | In Review [ 10002 ] | Stalled [ 10000 ] |
issue.field.resolutiondate | 2024-06-19 11:32:16.0 | 2024-06-19 11:32:15.851 |
Component/s | Storage Engine - InnoDB [ 10129 ] | |
Component/s | Server [ 13907 ] | |
Fix Version/s | 10.6.19 [ 29833 ] | |
Fix Version/s | 10.11.9 [ 29834 ] | |
Fix Version/s | 11.1.6 [ 29835 ] | |
Fix Version/s | 11.2.5 [ 29836 ] | |
Fix Version/s | 11.4.3 [ 29837 ] | |
Fix Version/s | 11.5.2 [ 29838 ] | |
Fix Version/s | 10.6 [ 24028 ] | |
Fix Version/s | 10.11 [ 27614 ] | |
Fix Version/s | 11.4 [ 29301 ] | |
Resolution | Fixed [ 1 ] | |
Status | Stalled [ 10000 ] | Closed [ 6 ] |
Attachment | MDEV-34178_g1_g2.pdf [ 73680 ] |
Attachment | MDEV-34178.pdf [ 73682 ] |
Link | This issue relates to MDEV-34431 [ MDEV-34431 ] |
Link | This issue relates to |
Attachment | image-2024-06-26-16-08-14-626.png [ 73701 ] |
Attachment | update_index_10.4.txt [ 73702 ] | |
Attachment | update_index_10.11.txt [ 73703 ] |
Link | This issue relates to |
Assignee | Marko Mäkelä [ marko ] | Kirill Perov [ JIRAUSER51446 ] |
Assignee | Kirill Perov [ JIRAUSER51446 ] | Marko Mäkelä [ marko ] |
Attachment | test_output_sudo2.txt [ 73970 ] | |
Attachment | test_output1.txt [ 73971 ] |
Attachment | test_output3.txt [ 73974 ] |
Link | This issue blocks |