Details
-
Bug
-
Status: Open (View Workflow)
-
Minor
-
Resolution: Unresolved
-
10.11.4
-
None
-
Amazon AL2 I3EN.6XL
24 vCPUs, 192GB mem, 15000 GB NVMe SSD
Description
Hi, we are trying to migrate MariaDB from 10.4.10 to 10.11.4. We addressed most of the breaking changes specified in official documentation.
Below are cnf changes made to 10.11.4.
innodb_checksum_algorithm = CRC32 (previously INNODB)
innodb_adaptive_hash_index = 1 (previously defaulted to 1)
innodb_buffer_pool_instances = 1 (previously logical CPU count ~24)
Removed all deprecated variables innodb_autoextend_increment, innodb_thread_concurrency, innodb_thread_sleep_delay, innodb_log_compressed_pages, innodb_sync_array_size.
Regarding innodb_buffer_pool_instances and innodb_buffer_pool_size, innodb_buffer_pool_size value gets calculated based on the following logic.
- Number of chunks are 1024
- Pool size in byte is a multiple of Huge Page size (2MB)
- Pool size is equal or multiple of innodb_buffer_pool_chunk_size * innodb_buffer_pool_instances (https://dev.mysql.com/doc/refman/8.0/en/innodb-buffer-pool-resize.html)
Calculation logic:
- pool_size_bytes = available_physical_memory_bytes * 0.7
- chunk_size_bytes = (pool_size_bytes / 1024 no.of. chunks ) / 1024 / 1024/ 1024
- allocation_units = pool_size_bytes/ (chunck_size_bytes * pool_instances)
- buffer_pool_size_bytes = allocation_units * chunck_size_bytes * pool_instances
- buffer_pool_plus_buffer_size_bytes = buffer_pool_size_bytes * 1.1
- huge_page_count = buffer_pool_plus_buffer_size_bytes / huge page size bytes (2*1024*1024)
pool_instances reflects innodb_buffer_pool_instances in above calculation logic.
In 10.11, innodb_buffer_pool_instances variable is removed (MDEV-15058) we are defaulting pool_instance to 1. Please advise anything to be changed to above logic as our performance test results shows more slow queries and innodb_buffer_pool not efficiently performing.
I have also attached the cnf config for reference as well. mycnf.txt
Below are details on performance and high disk I/O observed during testing:
Executed TPCC like workload provided by Percona-labs on both host (10.4.10 and 10.11.4). Intention is to find any regression with the current cnf config used in 10.11.
Setup
Hardware configuration: I3EN.6XL, 24 vCPUs, 192GB mem, 15000 GB NVMe SSD
Threads: 1024 (thread execution ran for 1hr) with unlimited tps
Dataset: 100 tables with 1 data warehouse
Observation:
- We did see slow queries recorded in 10.11.4. slow_query config (long_query_time: 1 second)
attached screenshot - slow_queries_10_11.png - We did see there were more request in 10.4 to be read from innodb_buffer_pool compared to 10.11. attached screenshot - innodb_buffer_pool_read_requests.png
- Same with write request. attached screenshot - innodb_buffer_pool_write_requests.png
- Overall tps stats seems low for 10.11 compared to 10.4. screenshot - tps_stats.png
- I have also attached tpcc statistic summary from sysbench for both versions of mariadb (10_11_tpcc_stats.txt and 10_4_tpcc_stats.txt)
Appreciate if I get any suggestions to fine tune the performance in 10_11 and if any other metrics to be checked please guide on that.