[MDEV-16283] ALTER TABLE...DISCARD TABLESPACE still takes long on a large buffer pool Created: 2018-05-24 Updated: 2020-08-25 Resolved: 2018-05-29 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB, Storage Engine - XtraDB |
| Affects Version/s: | 10.1.29, 10.2.11, 10.3.3 |
| Fix Version/s: | 10.1.34, 10.2.16, 10.3.8 |
| Type: | Bug | Priority: | Major |
| Reporter: | Marko Mäkelä | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | ddl, performance, upstream | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||
| Description |
|
ccalender noted in I tried to reproduce the issue with the following:
I believe that the trick is to fill up the buffer pool with many dirty pages, so that the buf_pool->LRU and buf_pool->flush_list will be long. The reason for innodb_page_size=4k is to get as many pages as possible into a smaller buffer pool. The other parameters are mainly there to speed up the insert (do as little file I/O as possible). I ran the test in /dev/shm. The DISCARD TABLESPACE were rather fast, maybe 0.1 seconds or less. While they were being run, sudo perf top showed the following:
Yes, we still have to scan the buf_pool->flush_list and remove the pages of any discarded tables, so that the discarded data will not be written, or that we will not get flooded by error messages that there are unwritten changes to missing files. I tried shortening the flush_list by adding the following before the CREATE TABLE loop:
Computing the page checksums for the flushing takes quite a bit of time, but to my surprise, the DISCARD TABLESPACE remained roughly as slow as before. I also tried skip_innodb_adaptive_hash_index (even though already with the previous fix, we would not touch the adaptive hash index). No change. Here is the culprit:
We are actually passing drop_ahi=true in DISCARD TABLESPACE, and calling the expensive function buf_LRU_drop_page_hash_for_tablespace() for no good reason. The adaptive hash index entries should do no harm here; they will have to be dropped on IMPORT TABLESPACE or DROP TABLE, but not earlier. It is also curious that we are searching for adaptive hash index entries to be dropped, even though skip_innodb_adaptive_hash_index is set. We should check if any AHI entries exist for any table (or even better, for any indexes of our table or partition), and only then invoke buf_LRU_drop_page_hash_for_tablespace(). |
| Comments |
| Comment by Marko Mäkelä [ 2018-05-29 ] | |
|
I don’t think that we need to drop the adaptive hash index at all on DISCARD TABLESPACE. However, we must fix DROP TABLE so that it will drop the entries, even if the tablespace has been discarded. On IMPORT TABLESPACE, since While developing the fix, I came across the code that emits messages like this:
I do not see any mechanism that would actively drop the adaptive hash index in this case. Because we are keeping the dictionary locked while waiting, the entire server could hang, and the InnoDB buffer pool could become idle, so that the entries are never dropped. I think that it is better to actively drop the adaptive hash index entries in this case. | |
| Comment by Marko Mäkelä [ 2018-05-30 ] | |
|
The original fix ( | |
| Comment by Tim He [ 2020-03-13 ] | |
|
--innodb-buffer-pool-size=16G ------------------------------------ 71.09 sec --innodb-buffer-pool-size=8G ------------------------------------ let $n=1000; while ($n) { eval alter table t$n discard tablespace; dec $n; } ------------------------------------ --innodb-buffer-pool-size=2G ------------------------------------ 16.24 sec --innodb-buffer-pool-size=256M ------------------------------------ let $n=1000; while ($n) { eval alter table t$n discard tablespace; dec $n; } ------------------------------------ --innodb-buffer-pool-size=128M ------------------------------------ | |
| Comment by Marko Mäkelä [ 2020-03-13 ] | |
|
Tim He, with which version of MariaDB is that? Can you post some analysis based on perf record and perf report? |