Details
Bug
Status: Closed
Critical
Resolution: Fixed
10.2.41, 10.3.32, 10.4.22, 10.5.13, 10.6.5
None
CentOS 7
COLO: 2x 8C/16T Intel Xeon Gold 6134 w/592GB RAM and 10TB PCI-E NVMe SSD
OTHER: 2x AMD Epyc 7532 w/512GB RAM and PCI-E NVMe SSD
Description
Original issue and repro:
https://bugs.mysql.com/bug.php?id=68079
Current repro and data are provided in a developer comment via link.
Flamegraphs from the Intel machine are attached (two variants per set: one uses perf sampling at 99 Hz, the other uses mariadb-stacktrace collecting thread dumps every 5 seconds for 60 iterations). A stack trace from the AMD machine is also attached.
Three sets of flamegraphs:
- Regular 10.6.5 using utf8mb3 (not set as default) charset for tables with AHI enabled
- Regular 10.6.5 using latin1 (set as default) charset for tables with AHI disabled (default)
- Custom 10.6.5 using utf8mb3 (not set as default) charset for tables with AHI disabled (default) where line 867 of mysys/charset.c has been removed prior to compiling MariaDB 10.6.5 CS
The base concern is that, despite this being a read workload, adding cores yields diminishing returns. This is very noticeable going from 4 to 8 cores on our Intel setup, where 8 cores should have been ideal for scaling:
Benchmark
        Average number of seconds to run all queries: 11.982 seconds
        Minimum number of seconds to run all queries: 11.878 seconds
        Maximum number of seconds to run all queries: 12.231 seconds
        Number of clients running queries: 1
        Average number of queries per client: 1000

Benchmark
        Average number of seconds to run all queries: 8.712 seconds
        Minimum number of seconds to run all queries: 8.347 seconds
        Maximum number of seconds to run all queries: 9.358 seconds
        Number of clients running queries: 4
        Average number of queries per client: 250

Benchmark
        Average number of seconds to run all queries: 7.104 seconds
        Minimum number of seconds to run all queries: 6.999 seconds
        Maximum number of seconds to run all queries: 7.216 seconds
        Number of clients running queries: 8
        Average number of queries per client: 125

Benchmark
        Average number of seconds to run all queries: 5.572 seconds
        Minimum number of seconds to run all queries: 5.464 seconds
        Maximum number of seconds to run all queries: 5.695 seconds
        Number of clients running queries: 16
        Average number of queries per client: 62
The flamegraphs seem to show the counter from MDEV-6274 having an outsized negative impact. Results without this counter improve greatly:
Benchmark
        Average number of seconds to run all queries: 11.891 seconds
        Minimum number of seconds to run all queries: 11.738 seconds
        Maximum number of seconds to run all queries: 12.050 seconds
        Number of clients running queries: 1
        Average number of queries per client: 1000

Benchmark
        Average number of seconds to run all queries: 5.934 seconds
        Minimum number of seconds to run all queries: 5.067 seconds
        Maximum number of seconds to run all queries: 6.386 seconds
        Number of clients running queries: 4
        Average number of queries per client: 250

Benchmark
        Average number of seconds to run all queries: 3.806 seconds
        Minimum number of seconds to run all queries: 3.673 seconds
        Maximum number of seconds to run all queries: 3.922 seconds
        Number of clients running queries: 8
        Average number of queries per client: 125

Benchmark
        Average number of seconds to run all queries: 2.762 seconds
        Minimum number of seconds to run all queries: 2.729 seconds
        Maximum number of seconds to run all queries: 2.824 seconds
        Number of clients running queries: 16
        Average number of queries per client: 62
Flamegraphs from the modified binary still suggest that further collation/charset overhead may stand in the way of even better performance, as may what appear to be some low-level InnoDB locks.
We would like to see MariaDB scale closer to 2x throughput for each doubling of core count, which Postgres comes very close to achieving on this same workload.
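To make the scaling gap concrete, the average timings reported above can be turned into speedup and parallel-efficiency figures. This is only a rough sketch: it treats the client count as a proxy for cores in use, takes the 1-client run as the baseline, and the names `WITH_COUNTER`, `WITHOUT_COUNTER`, `scaling` and `report` are made up for illustration.

```python
# Average seconds to run all 1000 queries, copied from the mysqlslap output
# in this report and keyed by the number of concurrent clients.
WITH_COUNTER = {1: 11.982, 4: 8.712, 8: 7.104, 16: 5.572}     # first set
WITHOUT_COUNTER = {1: 11.891, 4: 5.934, 8: 3.806, 16: 2.762}  # second set


def scaling(timings):
    """Return {clients: (speedup, efficiency)} relative to the 1-client run.

    speedup    = t(1 client) / t(n clients)
    efficiency = speedup / n   (1.0 would be perfect linear scaling)
    """
    base = timings[1]
    return {n: (base / t, base / t / n) for n, t in sorted(timings.items())}


def report(label, timings):
    """Print one line per client count for a set of timings."""
    print(label)
    for n, (speedup, eff) in scaling(timings).items():
        print(f"  {n:>2} clients: speedup {speedup:5.2f}x, efficiency {eff:6.1%}")


if __name__ == "__main__":
    report("with MDEV-6274 counter", WITH_COUNTER)
    report("without counter (custom build)", WITHOUT_COUNTER)
```

By this measure, 16 clients reach roughly a 2.2x speedup in the first set and roughly 4.3x in the second, so removing the counter about doubles the scaling but still leaves it well short of linear.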
Attachments
Issue Links
- relates to MDEV-10476 MariaDB 10.x.y does not include fix for upstream bug #68079 (Closed)