[MDEV-30775] Performance regression in fil_space_t::try_to_close() introduced in MDEV-23855 Created: 2023-03-02 Updated: 2023-05-24 Resolved: 2023-03-10 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.5.7, 10.5.8, 10.5.17, 10.11.2 |
| Fix Version/s: | 10.11.3, 10.5.20, 10.6.13, 10.8.8, 10.9.6, 10.10.4 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Claudio Nanni | Assignee: | Vladislav Lesin |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | regression-10.5 | ||
| Attachments: |
|
||||||||||||
| Issue Links: |
|
||||||||||||
| Description |
|
Several operations got much slower starting from 10.5.8 and apparently related to file opening/closing in handler.cc when reaching innodb_open_files limit. The more tables you have in the instance the more visible, with some tens of thousands of tables it gets problematic. SHOW TABLE STATUS FROM db gets hundreds of times slower mysqldump on 600 schemas x 223 empty tables goes from 2:45 minutes to 27 minutes Increasing innodb_open_files (with all the capping limits open_files_limit, os user limits, system limits) helps but in some case means keeping hundreds of thousands of files open. The change was introduced in CS 10.5.8 To reproduce you need many thousands of tables(at least higher than innodb_open_files) and execute mysqldump or SHOW TABLE STATUS FROM until the limit is reached, testing with and without increasing innodb_open_files will show the difference. I have found this if condition in handler.cc getting slower, the body is not even executed:
That open() call goes to ha_heap.cc calling:
|
| Comments |
| Comment by Vladislav Lesin [ 2023-03-06 ] | ||||||||||||||||||||||||||||||||||||||||||
|
perf shows that the most of the time is spent on iterating over fil_system.space_list in fil_space_t::try_to_close(). The reason of this is that fil_node_open_file_low() tries to close files from the top of the list if the number of opened files is close to limit. And it closes the files from the top of the list, what makes fil_space_t::try_to_close() to iterate more and more closed files from the top of the list to find opened file to close:
I think the easiest fix would be in moving closed node to the end of fil_system.space_list in fil_space_t::try_to_close(). At the other hand, fil_system.space_list is FIFO list, and closed files will be at the top of the list sooner or later(but they will be mixed with opened nodes, what let fil_space_t::try_to_close() to do less iterations during searching opened node). So the better solution would be just in splitting this list into the lists of closed and opened nodes. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-03-07 ] | ||||||||||||||||||||||||||||||||||||||||||
|
The change looks good to me. This should be OK to push after some extra stress testing with innodb_open_files=10. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Matthias Leich [ 2023-03-10 ] | ||||||||||||||||||||||||||||||||||||||||||
|
origin/st-10.5- | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Claudio Nanni [ 2023-03-10 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Tested the change with both SHOW TABLE STATUS and mysqldump both are very fast even with innodb_open_files=10 | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Barry [ 2023-03-31 ] | ||||||||||||||||||||||||||||||||||||||||||
|
On WordPress.com we have been bitten by this regression as well after upgrading from MariaDB 10.4.28 to 10.6.12. We have been testing 10.6.13 with the patch for this issue but it doesn't seem to resolve the problem. We still have very poor performance with all of the CPU time spent in `fil_space_t::try_to_close`. We have tried various values for `innodb_open_files` from 10 to 100000 and regardless of the value, once we exceed it, performance degrades significantly. Is there some additional information that would be helpful from our environment to help fully resolve this issue? I'm not really sure how the fix was tested as it doesn't seem to perform better in our tests. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-03-31 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Barry, I am sorry to read that. As you probably are aware, MariaDB duplicates some convoluted MySQL logic for adjusting innodb_open_files<10 at startup to something larger based on some other parameters. So, the practical minimum value indeed is innodb_open_files=10. A reasonable minimum value could be much larger. There used to be a fil_system.LRU list that keeps track of most recently accessed InnoDB data files. The innodb_open_files was being enforced by closing the least recently used file. Maintaining this list required fil_system.mutex to be acquired and released once for each page I/O, which made fil_system.mutex a performance bottleneck. in The LRU "replacement" (file handle closing) policy was replaced something like a FIFO policy. claudio.nanni provided a test case with which this policy led to frequent closing and reopening of a busy file. vlad.lesin tweaked the FIFO policy a little, and the fix was extensively tested in the 10.5 branch. There were some conflicts when merging to the 10.6 branch. I think that we’d better run our test case on the 10.6 branch once more next week. If it does not reproduce the problem, we would like to get some form of executable test case from you. Write performance is important to us. With | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2023-03-31 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Barry, we tested 10.5 fix, and it showed good results with innodb_open_files=10 and ~133000 tables. Maybe we lost something during merging the fix to 10.6, I will check it. There is one more thing with could affect. Did you do queries to INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES or INFORMATION_SCHEMA.INNODB_TABLESPACES_ENCRYPTION? | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Barry [ 2023-04-03 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Thanks for looking. We also now have a crash is 10.6.13 which seems related to this patch
| ||||||||||||||||||||||||||||||||||||||||||
| Comment by Michael Widenius [ 2023-04-04 ] | ||||||||||||||||||||||||||||||||||||||||||
|
I am a bit confused about one thing in the original report. Heap will not open any files, so the number of possible open file descriptors should have no effect on this call. What am I missing? As Claudio reported, we have not been able to repeat this with 133000 tables. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Claudio Nanni [ 2023-04-04 ] | ||||||||||||||||||||||||||||||||||||||||||
|
> What am I missing? In my step-by-step debug setting various breakpoints I saw that this test condition:
Was calling:
The body of the if was not executed in my tests, only the condition was checked which does a call to open() Of course is possible I did some mistake but I tested it many times and I can set it up again if needed. The table distribution was around 600 identical schemas with 223 (empty) tables each. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Claudio Nanni [ 2023-04-04 ] | ||||||||||||||||||||||||||||||||||||||||||
|
I quickly redid a test with 10.5.17 and 10.5.20, I can confirm that 10.5.20 is exempt from the problem (see attached). | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2023-04-04 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Barry, I reproduced performance regression on 10.6, and is working now on the fix. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Barry [ 2023-04-06 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Thank you! Please let me know if you want me to test anything. I have a somewhat easy way to reproduce. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2023-04-07 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Barry, I have found the reason and fixed it. The fix is under testing now. We can provide you with custom build with the fix after the testing. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Barry [ 2023-04-07 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Thanks! We can build from a Github hash/branch so just let us know when/what to test. No need to provide us a custom build. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2023-04-11 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Barry, you can try https://github.com/MariaDB/server/tree/bb-10.6-MDEV-30775-try_to_close. It passed regression testing, but hasn't yet been tested with RQG. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Barry [ 2023-04-11 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Thanks. I tested this but it doesn't seem to fix the issue. It's easier and less disruptive for us to test on a replica vs a primary DB so I built MariaDB from your branch and upgraded a replica that was running 10.6.12. I set `innodb_open_files` to 10 so we can replicate the problem faster and started the replication IO and SQL threads. The replica can't open/close tables fast enough for replication to keep up (there aren't very many writes). I expect the same issue to happen on a primary as well. perf shows all CPU time spent in `fil_space_t::try_to_close`.
| ||||||||||||||||||||||||||||||||||||||||||
| Comment by Claudio Nanni [ 2023-04-12 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Barry, What type of operation is resulting in such profile? In my tests the fix works great, just tested SHOW TABLE STATUS from 3200 schemas with 223 tables each (around 700k tables) and each execution is extremely fast (~0.1s). real 4m22,851s Also mysqldump that was extremely slow became very fast. Could you attach SHOW GLOBAL VARIABLES; if that doesn't contain any sensitive information? | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Barry [ 2023-04-13 ] | ||||||||||||||||||||||||||||||||||||||||||
|
This is from a server running a single replication sql thread. Normal traffic, no specific test. 7090 schemas, 389072 tables in total. SHOW GLOBAL VARIABLES attached with some host-specific information removed. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-04-13 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Barry, could you also please show more detail where fil_space_t::try_to_close() is spending the time? In perf report or perf top, hit 'a' for annotating the code (showing the instructions where most time is being spent). There is also some other command for toggling source code view, which could be useful. I am not sure if it is related to this, but I have the feeling that it could be useful to have a larger fil_system.spaces.n_cells than the hard-coded 50767 when you have hundreds of thousands of InnoDB tablespaces. If you use partitioning, each partition or subpartition counts as a separate InnoDB table. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Barry [ 2023-04-13 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Annotated perf attached. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2023-04-14 ] | ||||||||||||||||||||||||||||||||||||||||||
|
We found | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2023-04-14 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Barry, From the "SHOW GLOBAL VARIABLES": version_source_revision 4c79e15cc3716f69c044d4287ad2160da8101cdc. This corresponds to mariadb-10.6.12. The latest commit of bb-10.6- | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-04-14 ] | ||||||||||||||||||||||||||||||||||||||||||
|
In my experience, version_source_revision may not always (or at all) be updated by incremental builds. That might explain the finding. I did not try to match the disassembly of the annotated fil_space_t::try_to_close() with any source code. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2023-04-14 ] | ||||||||||||||||||||||||||||||||||||||||||
|
According to the unfolded fil_space_t::try_to_close() in annotated-perf.txt
This corresponds to FIL_TYPE_TEMPORARY case in the following switch:
The following snippet:
Corresponds to:
| ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2023-04-14 ] | ||||||||||||||||||||||||||||||||||||||||||
|
The fix was based on the case when fil_system.space_list contained a lot of closed spaces on its top, and fil_space_t::try_to_close() had to iterate them to find first opened space. The above comment means that not only this can be the cause of slow fil_space_t::try_to_close() work, but also a big amount of temporary tables at the top of fil_system.space_list. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-04-14 ] | ||||||||||||||||||||||||||||||||||||||||||
|
For the record, there should be only one fil_system.temp_space, but it may contain multiple files if innodb_temp_data_file_path specifies so. However, the fil_system.space_list should contain fil_system.temp_space only once. It might make sense to revise the data structures to have only FIL_TYPE_TABLESPACE in this list. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2023-04-17 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Barry, could you please confirm you are testing 7c0af5c56994f37c45d40e45fa4e15743acaeb06 commit? BTW, the changes were pushed to 10.6, so you can use the latest 10.6 from repository to test. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Barry [ 2023-04-19 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Sorry for the delay and confusion - yes the tests were done on the correct version but since it didn't resolve the issue I switched back to 10.6.12 before I provided the output of SHOW GLOBAL VARIABLES. I can test again using the 10.6 branch. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Lesin [ 2023-04-21 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Barry, could you please build latest 10.6(it must contain 0cca8166f3111901019dcd33747a1a1dfd9e66d1 commit) with -DCMAKE_BUILD_TYPE=RelWithDebInfo option, then test it, and, if it still has low performance: 1. attach to the process with gdb, i.e. "gdb -p mariadbd_pid", 2. set gdb logging to file and turn off gdb pagination, i.e.
3. load gdb space_list.gdb
4. launch function from the above gdb file:
5. quit from gdb(Ctrl+D), and share 10.6-MDEV-30775-gdb.log with us. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Barry [ 2023-05-05 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Sorry for the delayed reply. I can confirm latest 10.6 (fe89df42686fd41e986dc775e12ad6f3594d5bca) works fine for me. Thanks for all of the help resolving this issue! | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-05-08 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Barry, thank you for bringing up the issue well in time for the fix to be included in the quarterly releases. |