Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Incomplete
-
10.5.13
-
None
-
Linux 5.10.91-gentoo #1 SMP x86_64 AMD EPYC 7451 AuthenticAMD GNU/Linux
Dedicated raid1 nvme0n1p1[0] nvme1n1p1[2] for /var/lib/mysql directory
256G RAM
galera (26.4.10) cluster configured, but only one node was started after upgrade.
Description
Attempt to upgrade from 10.4.22 to 10.5.13 turned into huge performance issues and ended with rolling the upgrade back with restoring whole dataset from backup.
MariaDB configuration parameters:
sync_binlog = 0
|
binlog_cache_size = 32M
|
default_storage_engine = innodb
|
sort_buffer_size = 4M
|
read_rnd_buffer_size = 4M
|
table_open_cache = 120000
|
table_open_cache_instances = 16
|
table_definition_cache = 1800000
|
key_buffer_size = 256M
|
query_cache_size = 0
|
query_cache_limit = 0
|
query_cache_type = 0
|
innodb_buffer_pool_size = 64G
|
innodb_log_buffer_size = 512M
|
innodb_log_file_size = 8G
|
innodb_autoinc_lock_mode = 2
|
innodb_lock_wait_timeout = 20
|
innodb_numa_interleave = 1
|
innodb_flush_log_at_trx_commit = 2
|
innodb_io_capacity_max = 90000
|
innodb_io_capacity = 90000
|
innodb-print-all-deadlocks = ON
|
innodb_flush_neighbors = 0
|
innodb_use_native_aio = 0
|
innodb_change_buffering = 'none'
|
wsrep_on
|
Parameters changed due to upgrade (for incompatible changes mentioned in release notes):
31c31
|
< innodb_adaptive_hash_index = 0
|
---
|
> #deprecated in 10.5 innodb_adaptive_hash_index = 0
|
44,45c44,45
|
< innodb_read_io_threads = 16
|
< innodb_write_io_threads = 16
|
---
|
> #broken 10.5 innodb_read_io_threads = 16
|
> #broken 10.5 innodb_write_io_threads = 16
|
47c47
|
< innodb_buffer_pool_instances = 32
|
---
|
> #deprecated in 10.5 innodb_buffer_pool_instances = 32
|
51c51
|
< innodb_log_optimize_ddl = 0
|
---
|
> #deprecated in 10.5 innodb_log_optimize_ddl = 0
|
The issue: Innodb doesn't support multiple buffer pool instances anymore
innodb_buffer_pool_instances is now deprecated option. This was known and announced, but the effect of violating the rule of innodb_read_io_threads+innodb_write_io_threads=innodb_buffer_pool_instances is dramatic. Mariadb was not able to handle the load it used to.
innodb_read_io_threads and innodb_write_io_threads default values are 4+4, while there's the only buffer pool. Even limiting them to 1+1 will not solve the issue. This causes numerous query cross-blocks as pool instance is always single, as it was mentioned to be a measure to improve the performance.
Unfortunately, there's no replacement parameter to be adjusted in order to overcome the issue.
The server was not able to handle desired load of parallel writes (while same server was easily handling same load before the upgrade and after reverting the changes back).
- Number of threads running (that used to be around 7-10 normally) jumped to 100 and sometimes reached 250, number of threads connected that is normally within 60-100 jumped to 300-450.
- Rate of select queries remained the same (~120 qps), but rate of insert/update queries dropped from 100 to 20 qps.
- Number of active queries in the processlist changed from 7-10 (with longest query time of 40 seconds) to 30-50 with longest query running about 900 seconds. Reason for the latest one was that queries timed out (increasing lock wait timeouts and session timeouts from 20 to 45 gave no positive effect), and rollback took ages due to being unable to gain a lock.
- Normally, mariadb has more than one table cache instance of 120K tables open. But after upgrade even one table cache instance was not utilized completely (reached 110K open tables only).
- Normally, mariadb 10.4 was able to perform frequent or even parallel inserts even into same table (tables have size of up to 100GB). MariaDB 10.5 was not able to perform frequent (every 3 seconds) writes into small table (17MB) with no parallel writes (queries stuck, timed out and one was in trx_state : ROLLING BACK for 860 seconds!)
- SHOW ENGINE INNODB STATUS output is full of messages like following one:
--Thread 140045441414720 has waited at ha_innodb.cc line 14299 for 0.00 seconds the semaphore:
Mutex at 0x55c00ead3340, Mutex DICT_SYS created /var/tmp/portage/dev-db/mariadb-10.5.13/work/mysql/storage/innobase/dict/dict0dict.cc:1027, lock
however, this seems to have no relation to the issue.
Studying innodb-system-variables list for new or not configured parameters indicates that there are lots of deprecations, but no adjustment for handling high parallelism anymore.
The question is - are there any (other) replacement settings to overcome the issue caused by fixed single pool instance?
Attachments
Issue Links
- includes
-
MDEV-23136 InnoDB init fail after upgrade from 10.4 to 10.5
- Closed
- relates to
-
MDEV-25020 SELECT if there is IN clause with binary UUID in binary form is extremely slow since MariaDB 10.5.9
- Closed
-
MDEV-26445 innodb_undo_log_truncate is unnecessarily slow
- Closed
-
MDEV-28518 After update to 10.5 a lot of time is spent in "Sending data"
- Closed
-
MDEV-30390 MariaDB 10.5 gets stuck on "Too many connections"
- Closed