[MDEV-27530] InnoDB - Performance issues after upgrade 10.4.22 to 10.5.13 Created: 2022-01-17 Updated: 2023-01-12 Resolved: 2022-10-30 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.5.13 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Eugene | Assignee: | Unassigned |
| Resolution: | Incomplete | Votes: | 5 |
| Labels: | None | ||
| Environment: |
Linux 5.10.91-gentoo #1 SMP x86_64 AMD EPYC 7451 AuthenticAMD GNU/Linux |
||
| Attachments: |
|
| Issue Links: |
|
| Description |
|
An attempt to upgrade from 10.4.22 to 10.5.13 ran into severe performance issues and ended with rolling the upgrade back and restoring the whole dataset from backup. MariaDB configuration parameters:
Parameters changed due to the upgrade (for incompatible changes mentioned in the release notes):
The issue: InnoDB no longer supports multiple buffer pool instances, and innodb_buffer_pool_instances is now a deprecated option. This was known and announced, but the effect of violating the rule innodb_read_io_threads + innodb_write_io_threads = innodb_buffer_pool_instances is dramatic. MariaDB was no longer able to handle the load it used to. Unfortunately, there is no replacement parameter that can be adjusted to overcome the issue. The server could not handle the desired load of parallel writes (while the same server easily handled the same load before the upgrade and after reverting it).
Studying the innodb-system-variables list for new or not-yet-configured parameters shows many deprecations, but no setting for tuning high parallelism anymore. The question is: are there any (other) replacement settings to overcome the issue caused by the fixed single buffer pool instance? |
| Comments |
| Comment by Vladislav Vaintroub [ 2022-01-17 ] | ||||||
|
Why is innodb_xxx_io_threads "broken"?
I do not think it is deprecated or supposed to be broken. It has a slightly different meaning, so what? There will be 16 threads handling IO completions at the same time, if IO completions can saturate 16 threads. Also, I think you are concentrating too much on what you believe was the culprit, the single buffer pool instance. As far as I remember, it fared well, and better than 10.4, in most if not all performance tests, including specifically those with multiple parallel updates. The "innodb_read_io_threads + innodb_write_io_threads = innodb_buffer_pool_instances" rule does not really make sense. There should be as many IO threads as are sufficient to finish IO completion fast, possibly without disturbing the "foreground threads" handling user connections. It has, or had, nothing to do with buffer pool instances. innodb_page_cleaners did have something to do with the buffer pool instance count. | ||||||
| Comment by Eugene [ 2022-01-17 ] | ||||||
| ||||||
| Comment by Eugene [ 2022-01-18 ] | ||||||
|
The attached chart illustrates what was already written.
| ||||||
| Comment by Marko Mäkelä [ 2022-02-21 ] | ||||||
|
Have you tried a larger innodb_log_file_size? It is currently only 1/8 of the buffer pool size. Using a larger redo log file would make log checkpoints (and related page writes) much less frequent. You might also try tweaking innodb_lru_scan_depth and innodb_lru_flush_size if the working set does not fit in the buffer pool. | ||||||
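As an illustration of these suggestions, a my.cnf fragment might look like this; all values are hypothetical placeholders and must be sized for the actual machine and workload:

```ini
# Hypothetical example values only -- size these to your own server.
[mariadb]
innodb_buffer_pool_size = 32G
# A larger redo log makes checkpoints (and related page writes) rarer.
# Before 10.9, recovery may run out of memory if this exceeds roughly
# 1/3 of the buffer pool size, so do not oversize it.
innodb_log_file_size    = 8G
# Worth tweaking only if the working set does not fit in the buffer pool:
innodb_lru_scan_depth   = 2048
innodb_lru_flush_size   = 128
```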
| Comment by Eugene [ 2022-02-21 ] | ||||||
|
Thank you for the advice, Marko! I will try to increase innodb_log_file_size and tune the innodb_lru* variables on the next attempt. Unfortunately, there is no chance of fitting the working set into memory; it is several times bigger. Most probably the opportunity to test this will appear in a couple of weeks; I will let you know about the result. | ||||||
| Comment by Marko Mäkelä [ 2022-03-30 ] | ||||||
|
Could this be a duplicate of | ||||||
| Comment by Eugene [ 2022-05-03 ] | ||||||
|
Some more details found on this.
The total number of files in the dataset is over 4 million, but the number of open files was only slightly over 240,000 (two instances were used). So the issue seems to lie in the number of IO threads for reading and writing, the number of open files, and the size of the table cache. With a small cache the issue is caused by flushing tables and opening new ones; with a big one it is caused by handling a large array of open files. An interesting point is that this issue was never this bad with 10.4, when it was possible to manually adjust the number of threads. | ||||||
| Comment by Marko Mäkelä [ 2022-05-04 ] | ||||||
|
euglorg, thank you for your update. The table_open_cache and table_definition_cache are only loosely related to InnoDB. My understanding is that table_open_cache controls the caching of open table handles (objects of the abstract class handler) for quick reuse. That is, if multiple SQL statements access the same InnoDB table in succession, an already created ha_innobase object can be reused. If I understand correctly, both caches should be emptied by a FLUSH TABLES command. The impact should be that on subsequent access, the tablename.frm file will be loaded (to add a TABLE_SHARE to table_definition_cache) and the table will be opened in InnoDB as well.

The InnoDB internal data dictionary cache is not directly affected by FLUSH TABLES. There is an LRU eviction mechanism in InnoDB. Each partition, subpartition, or unpartitioned table counts as a separate dict_table_t object in InnoDB. The function dict_make_room_in_cache() is invoked periodically to try to keep at most table_definition_cache objects allocated inside the InnoDB dict_sys cache. If any handle to a table is open (see table_open_cache), the table definition cannot be evicted from InnoDB.

Reloading table definitions into InnoDB involves accessing InnoDB internal data dictionary tables. I could imagine that on a busy server, those pages would typically not be cached in the buffer pool. Still, reloading table definitions should only 'punish' the execution of SQL statements that need to access the affected tables. If you are observing reduced throughput for all queries across the server, there is a more likely explanation.

There is one more mechanism that I think is more likely to cause the regression that you observe. There is some logic like this at InnoDB startup, inherited from MySQL 5.6.6:
The commit message by svoj mentions WL#6216, but that ticket is private. Anyway, you can see that unless you specify innodb_open_files to be larger than 10 (its default value is 0), it will be auto-adjusted to table_open_cache (tc_size), provided that that value is between 300 and open_files_limit. This feels rather arbitrary to me, especially given that each partitioned table likely counts as one entry in the table caches outside storage engines, but as several tables or files inside storage engines. In any case, this logic was not changed recently. What was changed in MariaDB Server 10.5 is the way modified pages are written back to data files. I would recommend you to check | ||||||
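As a hedged paraphrase (not the actual server source; names are simplified), the auto-sizing rule described above can be sketched as:

```python
# Hedged paraphrase of the innodb_open_files auto-sizing described in the
# comment above (inherited from MySQL 5.6.6), NOT the actual server code:
# if innodb_open_files is left near its default, it is derived from
# table_open_cache (tc_size), clamped by a floor of 300 and open_files_limit.

def auto_size_innodb_open_files(innodb_open_files, tc_size, open_files_limit):
    if innodb_open_files > 10:          # explicitly configured: keep it
        return innodb_open_files
    if 300 <= tc_size <= open_files_limit:
        return tc_size                  # follow table_open_cache
    return 300                          # otherwise fall back to the floor

print(auto_size_innodb_open_files(0, 2000, 1048576))  # default -> 2000
```

So with the defaults (innodb_open_files = 0) and, say, table_open_cache = 2000, InnoDB would end up with innodb_open_files = 2000.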
| Comment by Ludovic Monnier [ 2022-05-27 ] | ||||||
|
We have observed the same issue since MariaDB 10.5, using a CREATE TABLE Test SELECT ... FROM ... GROUP BY ... statement and even with a plain SELECT ... FROM ... GROUP BY ... statement: very poor performance with MariaDB 10.5 and later! I checked the same query with the same data on several versions of MariaDB:
#Version: 10.3.34-MariaDB-0+deb10u1-log (Debian 10)
#Version: 10.4.24-MariaDB-1:10.4.24+maria~buster-log (mariadb.org binary distribution) (the second-to-last 10.4 release)
#Version: 10.4.25-MariaDB-1:10.4.25+maria~buster-log (mariadb.org binary distribution)
With 10.5.*, 10.6.* and 10.7.*:
Since 10.5.* we can observe a large temporary .MAI file during the query. That file stayed almost empty (only a few KB) with MariaDB 10.4 and earlier. | ||||||
| Comment by Eugene [ 2022-06-08 ] | ||||||
|
It seems that the workaround for the problem is to let mariadbd keep all the files of the dataset open at the same time,
with a corresponding system limit of open files for the process. For our setup this rendered into having 2.7 million open files. I also checked what happens in MariaDB while adjusting the settings. Everything runs quickly as long as it is possible to open another table without closing an already open one. Once a limit is hit (any of the above-listed variables limiting the number of open files), a table has to be flushed and closed. This process is very slow and locks the thread; thus, while one table is being closed, no operation can be performed on another one. On 10.4 this did not happen: there were several IO writing and reading threads operating on different parts of the buffer pool. On 10.5 and newer there is just one big pool, and once the thread operating on it is locked, nothing can be done until that lock is released. Anyway: upgrading to 10.5 or newer required having all tables open at the same time. Thus startup, shutdown, and any file-related operations are slow due to the huge number of open files. It is hard to imagine how this would perform on a setup with 100M tables and intensive writing, as the first table to be flushed will freeze the daemon. Moreover, on a Galera cluster it triggers flow control, pauses replication, and can stop the whole cluster (this can easily be simulated by flushing the table cache manually, but after that you will either have a long downtime or have to kill the daemon to unlock the other nodes and perform SST to the node you experimented on). | ||||||
| Comment by Marko Mäkelä [ 2022-08-01 ] | ||||||
|
euglorg, did you try the advice that was suggested in
ludovic.monnier, I do not think that much has changed with respect to the Aria storage engine. Your observation could be explained by a change of some query plan, or of what type of statistics are collected and when. I do not know the code outside InnoDB very well, but I would think that the Aria engine may be used for temporary tables during query execution, to compute some joins or ORDER BY or GROUP BY operations. Can you please file a separate bug for that, with enough detail so that the regression can be reproduced and fixed? | ||||||
| Comment by Joris de Leeuw [ 2022-09-22 ] | ||||||
|
This issue seems to be fixed with | ||||||
| Comment by Ludovic Monnier [ 2022-09-26 ] | ||||||
|
Hi, I tried to upgrade from 10.3.34 to version 10.5.17. Version 10.3.34 is much faster than 10.5.*: 600 seconds vs. 1972 seconds. Using version 10.3.34:
Using version 10.5.17 :
| ||||||
| Comment by Joris de Leeuw [ 2022-09-26 ] | ||||||
|
@Ludovic Monnier. Do you see any positive difference between 10.5.16 and 10.5.17? Is there any way to reproduce your issue? It seems you might have a different issue. With the example Sander Hoentjen shared in | ||||||
| Comment by Ludovic Monnier [ 2022-09-26 ] | ||||||
|
Hi, no difference between 10.5.* and 10.5.17. It is difficult to reproduce the issue because the table is large (82,160,658 rows). I'll try to send you a sample database to reproduce it. | ||||||
| Comment by Roel Van de Paar [ 2022-09-29 ] | ||||||
|
I especially noticed this difference before/after. Joris, how did your | ||||||
| Comment by Eugene [ 2022-09-29 ] | ||||||
|
This was most probably one of those limits I mentioned in a comment some time ago.
Once the value of `innodb_open_files` was changed, the situation dramatically improved. While it remained at the default, adjusting table_open_cache/table_definition_cache had no positive effect. So, the number of open tables (controlled by `innodb_open_files` | ||||||
| Comment by Marko Mäkelä [ 2022-09-30 ] | ||||||
|
euglorg, if you collected more data in the style of the graphs in
You did not provide any excerpt from your server error log. Do you find messages like this there:
The "pending operations" or "pending fsync" would be initiated for the log checkpoint. You can either make the open file limits larger, or you could use a much larger innodb_log_file_size so that checkpoints are less frequent. You might also want to enable background flushing (see Would a larger innodb_log_file_size solve the problem for you? | ||||||
| Comment by Eugene [ 2022-09-30 ] | ||||||
|
Sorry for not updating this ticket, Marko. Unfortunately, we have had to downgrade to 10.4 everywhere, so at the moment it is impossible to check. However, I cannot remember lines like those you mentioned being logged by my mariadbd. | ||||||
| Comment by Marko Mäkelä [ 2022-09-30 ] | ||||||
|
euglorg, the checkpoints and page flushing work quite differently before 10.5, and crash recovery may run out of memory if the log file is more than about ⅓ of the buffer pool size. In 10.9 thanks to | ||||||
| Comment by Sergei Golubchik [ 2022-10-30 ] | ||||||
|
euglorg, as a rule we close tickets that have received no feedback for a month. But worry not, we will reopen it if there is new info, for example after your next attempt to upgrade. |