MariaDB Server / MDEV-27530

InnoDB - Performance issues after upgrade 10.4.22 to 10.5.13

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 10.5.13
    • Fix Version/s: N/A
    • Component/s: None
    • Environment: Linux 5.10.91-gentoo #1 SMP x86_64 AMD EPYC 7451 AuthenticAMD GNU/Linux
      Dedicated raid1 nvme0n1p1[0] nvme1n1p1[2] for /var/lib/mysql directory
      256G RAM
      galera (26.4.10) cluster configured, but only one node was started after the upgrade.

    Description

      An attempt to upgrade from 10.4.22 to 10.5.13 ran into severe performance issues and ended with rolling the upgrade back and restoring the whole dataset from backup.

      MariaDB configuration parameters:

      sync_binlog                             = 0
      binlog_cache_size                       = 32M
      default_storage_engine                  = innodb
      sort_buffer_size                        = 4M
      read_rnd_buffer_size                    = 4M
      table_open_cache                        = 120000
      table_open_cache_instances              = 16
      table_definition_cache                  = 1800000
      key_buffer_size                         = 256M
      query_cache_size                        = 0
      query_cache_limit                       = 0
      query_cache_type                        = 0
      innodb_buffer_pool_size                 = 64G
      innodb_log_buffer_size                  = 512M
      innodb_log_file_size                    = 8G
      innodb_autoinc_lock_mode                = 2
      innodb_lock_wait_timeout                = 20
      innodb_numa_interleave                  = 1
      innodb_flush_log_at_trx_commit          = 2
      innodb_io_capacity_max                  = 90000
      innodb_io_capacity                      = 90000
      innodb-print-all-deadlocks              = ON
      innodb_flush_neighbors                  = 0
      innodb_use_native_aio = 0
      innodb_change_buffering = 'none'
      wsrep_on
      

      Parameters changed for the upgrade (due to incompatible changes mentioned in the release notes):

      31c31
      < innodb_adaptive_hash_index = 0
      ---
      > #deprecated in 10.5 innodb_adaptive_hash_index = 0
      44,45c44,45
      < innodb_read_io_threads = 16
      < innodb_write_io_threads = 16
      ---
      > #broken 10.5 innodb_read_io_threads = 16
      > #broken 10.5 innodb_write_io_threads = 16
      47c47
      < innodb_buffer_pool_instances = 32
      ---
      > #deprecated in 10.5 innodb_buffer_pool_instances = 32
      51c51
      < innodb_log_optimize_ddl = 0
      ---
      > #deprecated in 10.5 innodb_log_optimize_ddl = 0
      

      The issue: InnoDB doesn't support multiple buffer pool instances anymore

      innodb_buffer_pool_instances is now a deprecated option. This was known and announced, but the effect of no longer being able to follow the rule innodb_read_io_threads + innodb_write_io_threads = innodb_buffer_pool_instances is dramatic: MariaDB was not able to handle the load it used to.
      The defaults for innodb_read_io_threads and innodb_write_io_threads are 4+4, while there is now only a single buffer pool. Even limiting them to 1+1 does not solve the issue. Queries block one another because the buffer pool is always a single instance, whereas multiple instances used to be the documented way to improve performance.

      Unfortunately, there is no replacement parameter that can be adjusted to overcome the issue.
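
      A quick way to confirm what the upgraded server is actually using (added for illustration, not part of the original report; plain SHOW statements, so they work on both 10.4 and 10.5):

      -- effective I/O thread settings and remaining buffer pool parameters;
      -- the LIKE patterns simply return fewer rows if a variable has been removed
      SHOW GLOBAL VARIABLES LIKE 'innodb_%io_threads';
      SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool%';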

      The server was not able to handle the desired load of parallel writes (while the same server easily handled the same load before the upgrade and after reverting the changes).

      • The number of threads running (normally around 7-10) jumped to 100 and sometimes reached 250, and the number of threads connected, normally within 60-100, jumped to 300-450 (the status queries sketched after this list show how these counters can be tracked).
      • The rate of SELECT queries remained the same (~120 qps), but the rate of INSERT/UPDATE queries dropped from 100 to 20 qps.
      • The number of active queries in the processlist grew from 7-10 (with the longest query time at 40 seconds) to 30-50, with the longest query running for about 900 seconds. The reason for the latter was that queries timed out (increasing lock wait timeouts and session timeouts from 20 to 45 gave no positive effect) and rollback took ages because a lock could not be acquired.
      • Normally, MariaDB keeps more than one table cache instance of 120K tables open, but after the upgrade even one table cache instance was not fully utilized (it reached only 110K open tables).
      • Normally, MariaDB 10.4 was able to perform frequent and even parallel inserts into the same table (tables of up to 100GB in size). MariaDB 10.5 was not able to perform frequent (every 3 seconds) writes into a small table (17MB) with no parallel writes (queries got stuck, timed out, and one was in trx_state: ROLLING BACK for 860 seconds!).
      • SHOW ENGINE INNODB STATUS output is full of messages like the following:

        --Thread 140045441414720 has waited at ha_innodb.cc line 14299 for 0.00 seconds the semaphore:
        Mutex at 0x55c00ead3340, Mutex DICT_SYS created /var/tmp/portage/dev-db/mariadb-10.5.13/work/mysql/storage/innobase/dict/dict0dict.cc:1027, lock
        

        However, this seems to have no relation to the issue.
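
      The counters behind the figures above can be sampled with plain status queries; a minimal sketch (added for illustration, not part of the original report):

      -- threads running / connected (first bullet point above)
      SHOW GLOBAL STATUS LIKE 'Threads_running';
      SHOW GLOBAL STATUS LIKE 'Threads_connected';
      -- table cache utilization (open tables bullet point)
      SHOW GLOBAL STATUS LIKE 'Open_tables';
      -- longest-running statements, to spot queries stuck in rollback
      SELECT id, time, state, LEFT(info, 80) AS query
        FROM information_schema.PROCESSLIST
       ORDER BY time DESC
       LIMIT 20;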

      Studying the InnoDB system variables list for new or not-yet-configured parameters shows lots of deprecations, but no parameter left for tuning high parallelism.

      The question is: are there any (other) replacement settings that could overcome the issue caused by the fixed single buffer pool instance?

          Activity

            euglorg Eugene added a comment -

            This was most probably one of those limits I mentioned in the comment some time ago.

            open_files_limit                        = 10000000
            table_open_cache                        = 3300000
            table_definition_cache                  = 3300000
            innodb_open_files                       = 3300000
            

            Once the `innodb_open_files` value was changed, the situation improved dramatically. As long as it remained at its default, adjusting table_open_cache/table_definition_cache had no positive effect.
            The actual performance killer is the file-closing process: once the need to close files is removed, performance improves and becomes almost the same as it used to be with 10.4. Thus, having a huge cache "solves" the problem. In fact it just postpones the problem until the moment more tablespace files are open than the `innodb_open_files` value. In our case this is 3.3M files, but the dataset is really large and we do have that many tables, so we hit the performance problem immediately, even while running the `mysql_upgrade` script (which checks all the tables) when `innodb_open_files` was still at its default. By the way, we never needed to touch it until upgrading to 10.5.

            So, the number of open tables (controlled by `innodb_open_files`, `table_open_cache` and `open_files_limit`) is the limit visible on that chart, and increasing it "solves" the issue. But there are disadvantages. The first is having all the tables open all the time; I suspect it might turn into a consistency problem some day. The second (not yet investigated) problem is what happens when partitions of partitioned tables are rotated: every partition is a file, so new partitions are created (and opened, using more cache entries) while older partitions are closed. During this procedure there might be a performance drop, and it is not clear whether performance will return to normal once all the partitions in all the tables have been rotated.
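
            A sketch of how the open-file pressure described above could be watched (added for illustration, assuming the status variable names match the standard MariaDB ones):

            -- number of .ibd files InnoDB currently keeps open vs. the configured limit
            SHOW GLOBAL VARIABLES LIKE 'innodb_open_files';
            SHOW GLOBAL STATUS LIKE 'Innodb_num_open_files';
            -- table cache churn: frequent misses/overflows mean tables are being
            -- closed and reopened, which is the expensive path described above
            SHOW GLOBAL STATUS LIKE 'Table_open_cache%';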


            marko Marko Mäkelä added a comment -

            euglorg, if you collected more data in the style of the graphs in MDEV-23855, you might be able to confirm my finding in MDEV-25215. Quoting a comment from that ticket:

            I see that your innodb_log_file_size is … a fraction of the innodb_buffer_pool_size. That will force very frequent log checkpoints, which in turn will cause stalls. That could actually be the root cause of your messages, if those messages always say "0 pending operations and pending fsync". I do not think that there is any need to execute fsync() or fdatasync() outside log checkpoints. Starting with 10.5, thanks to MDEV-19176, recovery should work fine even if you make the redo log as big as the buffer pool, or possibly even larger.

            You did not provide any excerpt from your server error log. Do you find messages like this there:

            2022-01-11 9:27:33 22 [Note] InnoDB: Cannot close file ./ownbfmprd/stf_stat_availability_office_bench_met.ibd because of 1 pending operations
            

            The "pending operations" or "pending fsync" would be initiated for the log checkpoint.

            You can either make the open file limits larger, or you could use a much larger innodb_log_file_size so that checkpoints are less frequent. You might also want to enable background flushing (see MDEV-27295), to avoid larger I/O spikes when a log checkpoint is urgently needed. The parameter innodb_io_capacity throttles the background flushing speed. I see that you already specify a reasonably large value for it.

            Would a larger innodb_log_file_size solve the problem for you?
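
            One way to check whether checkpoints are the bottleneck, as suggested above, is to compare the checkpoint age against the redo log size (a sketch added for illustration, assuming the Innodb_checkpoint_* status variables are available in this build):

            -- if Innodb_checkpoint_age keeps approaching innodb_log_file_size,
            -- the server is being throttled by checkpoint flushing
            SHOW GLOBAL STATUS LIKE 'Innodb_checkpoint%';
            SHOW GLOBAL VARIABLES LIKE 'innodb_log_file_size';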

            euglorg Eugene added a comment -

            Sorry for not updating that ticket, Marko. Unfortunately, we had to downgrade to 10.4 everywhere, so at the moment it's impossible to check. However, I can't remember lines like the ones you mentioned being logged by my mariadbd.
            As for innodb_log_file_size, on the next upgrade attempt I'll try to set it to 64 or 128GB and check the behavior. I can't provide any ETA, however.
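
            After raising innodb_log_file_size in the configuration and restarting, the ratio discussed above could be verified directly (a sketch added for illustration, not part of the original comment):

            -- redo log size relative to the buffer pool; with the 64G buffer pool from the
            -- description above, a 64G or 128G log gives a ratio of 1.0 or 2.0
            SELECT @@innodb_log_file_size / @@innodb_buffer_pool_size AS log_to_pool_ratio;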


            marko Marko Mäkelä added a comment -

            euglorg, checkpoints and page flushing work quite differently before 10.5, and crash recovery may run out of memory if the log file is more than about ⅓ of the buffer pool size.

            In 10.9, thanks to MDEV-27812, you would not need a server restart to change the size of the log.
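
            For reference, a minimal sketch of what that would look like on 10.9+ (assuming MDEV-27812 makes innodb_log_file_size settable at runtime, as described above; this does not apply to 10.5):

            -- resize the redo log online; no server restart required on 10.9 or later
            SET GLOBAL innodb_log_file_size = 64 * 1024 * 1024 * 1024;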


            serg Sergei Golubchik added a comment -

            euglorg, as a rule we close tickets that get no feedback for a month. But worry not, we'll reopen it if there is new info, for example after your new attempt to upgrade.


            People

              Assignee: Unassigned
              Reporter: euglorg Eugene
              Votes: 5
              Watchers: 13
