[MDEV-32027] Opening all .ibd files on InnoDB startup can be slow Created: 2023-08-28 Updated: 2024-02-02 Resolved: 2023-11-17 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.5, 10.6 |
| Fix Version/s: | 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3, 11.3.2 |
| Type: | Bug | Priority: | Critical |
| Reporter: | kennyliao | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | performance | ||
| Environment: |
Microsoft Windows |
||
| Attachments: |
|
||||||||||||
| Issue Links: |
|
||||||||||||
| Description |
|
2023-08-14 9:13:29 0 [Note] InnoDB: Uses event mutexes 2023-08-14 9:13:31 0 [Note] InnoDB: 128 rollback segments are active. MySQL starts slowly, rollback segments and Creating shared tablespace with an interval of over 4 minutes. |
| Comments |
| Comment by Marko Mäkelä [ 2023-08-28 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Can you please run a profiler during the startup, to find out where the time is being spent?
Would the server start up faster if you make innodb_buffer_pool_chunk_size equal to innodb_buffer_pool_size? In MDEV-25340, I recently tested the initialization of a 96GiB buffer pool. It was significantly slower on 10.5 than on 10.6. Would the start-up time with 10.6.15 be faster? | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-08-28 ] | ||||||||||||||||||||||||||||||||||||||||||
|
my.ini It is not yet clear if the slowness is related to the buffer pool initialization. Some profiling data would be needed to understand what is causing the slowdown. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by kennyliao [ 2023-08-28 ] | ||||||||||||||||||||||||||||||||||||||||||
|
2023-08-14 9:13:29 0 [Note] InnoDB: Initializing buffer pool, total size = 5368709120, chunk size = 134217728 Aren't these two prompts indicating that the initialization of the buffer pool was completed within 1 second? | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-08-28 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Yes, the buffer pool initialization does seem to finish that quickly. I was thinking that there could be some longer-term impact on L2 or L3 cache. Anyway, without having any sort of profiling data, it is very hard to say what could cause the slow startup. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2023-08-28 ] | ||||||||||||||||||||||||||||||||||||||||||
|
I tried a quick test with procmon (the system in question is Windows). What happens between " rollback segments are active." and "Creating shared tablespace for temporary table", is that all tablespaces are opened and closed, with this stack backtrace
On Windows, this results in 5 system calls per table. Speculating a bit - depending on the number of tables, this might be expensive. I remember seeing some systems, where QueryVolumeInformationFile() was not too fast. Anyway, as the information returned by GetFileInformationByHandle() is not going to be used, perhaps init_file_info can also be skipped. The above is about 10.5, I did not look into higher versions. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2023-08-28 ] | ||||||||||||||||||||||||||||||||||||||||||
|
kennyliao, I think it makes sense to record mysqld.exe's (mariadbd.exe's) filesystem activity with procmon during server's startup in your environment, too, to see if the idea above proves correct. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by kennyliao [ 2023-08-29 ] | ||||||||||||||||||||||||||||||||||||||||||
|
@Vladislav Vaintroub,The operating system environment is Windows,not linux. Is there any way to accelerate the start of msyqld? | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-08-29 ] | ||||||||||||||||||||||||||||||||||||||||||
|
wlad, thank you for chiming in. I have the impression that opening files is relatively fast in Linux. What is slow is reading from the files (fil_node_t::read_page0()). While I may have noticed that we open and close lots of files during startup, that is why I did not spend much effort in removing those calls. I believe that the logic could be similar in later versions of MariaDB Server. In It should be possible to revise this logic in MariaDB Server 10.6 or later. Thanks to kennyliao, how many .ibd files are there in your data directory? | ||||||||||||||||||||||||||||||||||||||||||
| Comment by kennyliao [ 2023-08-29 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Marko Mäkelä,there are 9511 .ibd files in the data directory, and the size of the database is 27.2GiB. Can upgrading to mariadb10.6 or later solve this performance issue? | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-08-29 ] | ||||||||||||||||||||||||||||||||||||||||||
|
kennyliao, I think that you should encounter the same problem in MariaDB Server 10.6 or later. I would only fix this bug starting with 10.6. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2023-08-29 ] | ||||||||||||||||||||||||||||||||||||||||||
|
kennyliao, would it be possible for you to use procmon.exe, to record file system activity of mysqld.exe during database startup, and share the procmon output here? https://download.sysinternals.com/files/ProcessMonitor.zip is the download. it is a legit tool from Microsoft, it works on Windows The cause of slow startup on your system is not yet properly analyzed, and this would help a lot, because slow startup happens for you, but it does not for others. Another helpful information would be when you noticed the slowdown, e.g after upgrading from 5.5. I did a little experiment, with 85 000 empty innodb tables. This added 6-15 seconds startup time for me, on Windows 11, on local SSD, on NTFS. It is not as drastic as your 4 minutes, but I can imagine that there are other factors that add up time, type of storage, file fragmentation etc. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by kennyliao [ 2023-08-29 ] | ||||||||||||||||||||||||||||||||||||||||||
|
I tested using procmon.exe on mariadb 10.6, but there were still cases of file opening and closing. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2023-08-29 ] | ||||||||||||||||||||||||||||||||||||||||||
|
procmon allows filtering by executable name (among other things). there is some information about it https://mariadb.com/kb/en/how-to-use-procmon-to-trace-mysqldexe-filesystem-access/ . | ||||||||||||||||||||||||||||||||||||||||||
| Comment by kennyliao [ 2023-08-29 ] | ||||||||||||||||||||||||||||||||||||||||||
|
| ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2023-08-29 ] | ||||||||||||||||||||||||||||||||||||||||||
|
maybe you can just save procmon output (File/Save), and attach the PML file? | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Hartmut Holzgraefe [ 2023-09-12 ] | ||||||||||||||||||||||||||||||||||||||||||
|
I can verify same kind of slowness with many table files, in my case 50 databases with the same two tables having ~2000 partitions each, summing up to a total of 200K .ibd partition tablespace files, on current 10.6 procmon seems to indeed identify QueryVolumeInformationFile() as the worst offender indeed. Attaching an anonymized mysqldump, raw procmon output, and a representative screenshot of a small portion of same | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2023-09-12 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Hartmut, you can just attach the PML (this is File/Save menu).There is no need to take screenshots of the procmon | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Hartmut Holzgraefe [ 2023-09-12 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Now made the PML url public | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2023-09-12 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Ok , full PML file is actually helpful. I was wondering about this stacktrace
The first occurrence in this PML file is at 9/12/2023 12:21:50.9988071 PM, the last occurrence at 9/12/2023 12:35:42.7459648 PM.
QueryInformationVolume(.ibd) seems to be access to open file metadata, QueryAllInformationFile too. They do not seem to take longer than others. The startup message is written at 9/12/2023 12:35:43.3938245 PM Afterwards, There is some GetVolumePathName() that's Innodb is loading buffer pool dump ib_buffer_pool and opening corresponding tablespaces. The culprit is opening and closing ibd and isl files, overall. dict_check_tablespaces_and_store_max_id(), as I initially suspected. It takes ~14 minutes. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-09-14 ] | ||||||||||||||||||||||||||||||||||||||||||
|
hholzgra and wlad, thank you a lot. It looks like opening a file (without reading anything from it) is a much more expensive operation on Microsoft Windows than it is on Linux. I will see if we can remove that code altogether, and just rely on fil_node_t::read_page0() to do the right thing when the files are being opened for the first actual access. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-11-13 ] | ||||||||||||||||||||||||||||||||||||||||||
|
I started fixing this by trimming down dict_check_tablespaces_and_store_max_id() so that it would basically only do SELECT MAX(space) FROM information_schema.innodb_sys_tables at READ UNCOMMITTED isolation level instead of the previous READ COMMITTED, and not bother with any tablespace lookups or opening any files. (Instead of performing even that table scan, we might just rely on the DICT_HDR_MAX_SPACE_ID, but I wanted to retain an option to remove that sequence later.) Some further adjustment is needed to ensure that the data files that were not opened during redo log processing will be opened when the table is first accessed:
| ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-11-13 ] | ||||||||||||||||||||||||||||||||||||||||||
|
It turns out that the file test/td.ibd actually contains FSP_SPACE_FLAGS that disagree with the intentionally corrupted table flags, but the fil_ibd_open() call in dict_check_tablespaces_and_store_max_id() never enabled the strict validation, which would be part of dict_load_tablespace(). Furthermore, the following call could confuse users:
On my system, it would report an EINVAL that is not actually directly related to opening the file, but to an expected error during the execution of my_realpath():
| ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-11-13 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Some tests for encryption functionality depend on opening all InnoDB data files at startup. For example, the test encryption.innodb-spatial-index would not list the tablespace for the system table mysql.transaction_registry in INFORMATION_SCHEMA.INNODB_TABLESPACES_ENCRYPTION, because it would no longer be opened as part of the server execution. (The test does not make use of any tables WITH SYSTEM VERSIONING.)
An option might be to keep opening all data files of InnoDB tables if an encryption plugin is loaded. We must keep in mind that most encryption parameters, such as innodb_encryption_threads, are settable at runtime. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2023-11-13 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Some of encryption functionality, like key rotation, is obscure, and is only supported by very few plugins, therefore it is a good question if | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-11-13 ] | ||||||||||||||||||||||||||||||||||||||||||
|
If I make dict_check_tablespaces_and_store_max_id() open all files when encryption_key_id_exists(FIL_DEFAULT_ENCRYPTION_KEY) (or !ibuf.empty), then all encryption tests except encryption.innodb-remove-encryption will pass. That still leaves some non-encryption test failures. The trickiest is the following:
This would be fixed by making buf_load() invoke another SYS_TABLES scan and open all data files for those missing tablespaces that are mentioned in the file identified by innodb_buffer_pool_filename. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-11-13 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Before I implemented the tweak to buf_load(), I had tweaked the following tests:
It turns out that my changes to these tests can be omitted, because the ib_buffer_pool file would ensure that the tablespaces of interest be loaded. If the ib_buffer_pool file were removed, these tests would fail. To make these failures deterministic, I would have created all tables with STATS_PERSISTENT=0 and run a slow shutdown (innodb_fast_shutdown=0) so that no background tasks would cause the tables to be loaded and data files to be opened. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-11-14 ] | ||||||||||||||||||||||||||||||||||||||||||
|
I can imagine that opening the files can be slow also in other environments than Microsoft Windows, such as when the data directory resides on network attached storage or NFS. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Matthias Leich [ 2023-11-14 ] | ||||||||||||||||||||||||||||||||||||||||||
|
origin/10.6- | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2023-11-14 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Even on Windows it is not slow. I could not get it as slow, as the original reporter, but 85 000 files add 6-15 seconds startup (Windows seems to caches some metadata, and it is faster second time around). However, delay scales linearly with number of files, so startup with 850000 files can take 2.5 minutes on my machine. The reporter has something that is factors of magnitude slower, and for him it is really much too much. We do not know what kind of storage reporter had, but it would be interesting to know . kennyliao, can you comment on that? |