MDEV-13328: ALTER TABLE ... DISCARD TABLESPACE takes a lot of time with a large buffer pool (>128G)

Details

    Affects Version/s: 10.1.29

    Description

      ALTER TABLE ... DISCARD TABLESPACE takes a lot of time with a large buffer pool (>128G).

      Steps to reproduce:

      1. Drop and re-create an InnoDB table.
      2. Discard its tablespace.
      3. Increase the buffer pool size.
      4. Once the buffer pool is 128G or larger, re-run steps 1 and 2.

      The discard time increases with the buffer pool size; a minimal SQL sketch follows.
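      A minimal SQL sketch of these steps (the table name t and the 256G value are illustrative, not from this report; the buffer pool can only be resized online in MariaDB 10.2+, so on 10.1 restart the server with a larger innodb_buffer_pool_size instead):

        DROP TABLE IF EXISTS t;
        CREATE TABLE t (id INT) ENGINE=InnoDB;
        ALTER TABLE t DISCARD TABLESPACE;  -- time this statement

        -- grow the buffer pool, then repeat and compare the timing
        SET GLOBAL innodb_buffer_pool_size = 274877906944;  -- 256G (10.2+ online)
        DROP TABLE IF EXISTS t;
        CREATE TABLE t (id INT) ENGINE=InnoDB;
        ALTER TABLE t DISCARD TABLESPACE;  -- noticeably slower before the fix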


          Activity

            Marko Mäkelä added a comment:

            If DISCARD TABLESPACE does not evict the pages from the buffer pool, then IMPORT TABLESPACE must do it. Simply moving the eviction call to the IMPORT TABLESPACE step would not be a useful fix.

            I tried simply removing the eviction in DISCARD TABLESPACE. If the DISCARD is soon followed by IMPORT, the IMPORT will adjust the imported file to the pre-existing tablespace ID. If the old pages were not evicted from the buffer pool at DISCARD, then after IMPORT we could incorrectly read the old pages from the buffer pool instead of the pages that exist in the imported file.

            IMPORT initially bypasses the buffer pool when reading pages for the adjustment phase. A possible fix is that in the adjustment phase, if a page already exists in the buffer pool, we replace it with the page from the imported file. This would allow a quick DISCARD TABLESPACE even for large tables, at the cost of a slightly slower IMPORT TABLESPACE (the slowdown being proportional to the file size).
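            For context, a sketch of the DISCARD/IMPORT sequence being discussed (the table name t and the file-copy step are illustrative):

              ALTER TABLE t DISCARD TABLESPACE;
              -- copy t.ibd (and the optional t.cfg metadata file) from the
              -- source server into this server's datadir
              ALTER TABLE t IMPORT TABLESPACE;
              SELECT * FROM t;  -- without eviction at DISCARD (or page
                                -- replacement at IMPORT), this could read
                                -- stale pages from the buffer pool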


            Marko Mäkelä added a comment:

            I have pushed this fix to 10.0 and merged it to 10.1 so far. I spent some time merging it to 10.2.

            In 10.2, there is a merge conflict in the TRUNCATE TABLE code, as anticipated. The solution ought to be good news for TRUNCATE performance: there should be no need to evict adaptive hash index entries or old pages from the buffer pool during TRUNCATE. I believe it suffices to ensure that buf_page_create() or an equivalent is called when initializing new pages after TRUNCATE, and that TRUNCATE edits the dict_index_t objects in place.
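            If that holds, one observable consequence (an illustrative check, not from this ticket; big_t is a hypothetical table) is that TRUNCATE time should no longer grow with the buffer pool size:

              -- populate a throwaway table (uses MariaDB's SEQUENCE engine)
              CREATE TABLE big_t (id INT PRIMARY KEY) ENGINE=InnoDB;
              INSERT INTO big_t SELECT seq FROM seq_1_to_1000000;
              -- time this with a small and a large innodb_buffer_pool_size;
              -- without a buffer pool scan the elapsed time should stay
              -- roughly constant
              TRUNCATE TABLE big_t;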


            Chris Calender added a comment:

            I need to re-open this one.

            After testing with 10.1.29, we are still seeing slowness with a 512G buffer pool:

            MariaDB [test]> drop table if exists t;
            Query OK, 0 rows affected, 1 warning (0.00 sec)
             
            MariaDB [test]> create table t(id int) engine=InnoDB;
            Query OK, 0 rows affected (0.00 sec)
             
            MariaDB [test]> alter table t discard tablespace;
            Query OK, 0 rows affected (7.34 sec)
             
            MariaDB [test]> drop table if exists t;
            Query OK, 0 rows affected (0.22 sec)
             
            MariaDB [test]> select @@version;
            +-----------------+
            | @@version       |
            +-----------------+
            | 10.1.29-MariaDB |
            +-----------------+
            1 row in set (0.01 sec)
             
            MariaDB [test]> select @@innodb_buffer_pool_size;
            +---------------------------+
            | @@innodb_buffer_pool_size |
            +---------------------------+
            |              549755813888 |
            +---------------------------+
            1 row in set (0.00 sec)
            

            Is there anything further that can be done here?


            Marko Mäkelä added a comment:

            ccalender, I filed MDEV-16283 for what I believe to be the remaining issue. With the smaller buffer pool that I used, DISCARD TABLESPACE was still rather fast.

            Let us track the progress in MDEV-16283.


            Marko Mäkelä added a comment:

            The original fix (MDEV-13328) was that we would not evict the pages of the discarded tablespace from the buffer pool; we would do that only on a subsequent IMPORT TABLESPACE. However, we would still scan the buffer pool for adaptive hash index entries in order to drop them. It actually suffices to drop the adaptive hash index entries on IMPORT TABLESPACE or on DROP TABLE or DROP INDEX, and there is no need to scan the buffer pool at all if index->search_info->ref_count==0 for all indexes of the table. MDEV-16283 implemented these fixes.
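            To observe the adaptive hash index in practice (my addition, not from the ticket): its current size and usage are reported in the "INSERT BUFFER AND ADAPTIVE HASH INDEX" section of the InnoDB status output, and it can also be disabled so that no AHI entries are built in the first place:

              SHOW ENGINE INNODB STATUS\G
              -- the AHI can be switched off at runtime:
              SET GLOBAL innodb_adaptive_hash_index = OFF;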


            People

              Assignee: Marko Mäkelä
              Reporter: Chris Calender
              Votes: 2
              Watchers: 6

