[MDEV-31492] Provide a way to "pin" InnoDB table in memory - Jira

Details

Type: New Feature
Status: Open (View Workflow)
Priority: Major
Resolution: Unresolved
Fix Version/s: None
Component/s: Storage Engine - InnoDB
Labels:
None

Description

Some systems with a lot of data (data size on disk is notably larger than the amount of RAM available) need a way to make sure that specific table remains in the buffer pool even when some other, bigger nut less important tables are accessed at times. I see two ways to implement this kind of feature:

1. Provide a way to create several named buffer pools of different sizes and then specify in CREATE or ALTER TABLE what pool to use (with default unnamed one used by default).

2. Provide a way to "pin" table to memory somehow via DDL, for example with a clause like STORAGE (BUFFER_POOL_KEEP) in Oracle RDBMS and some way to define what share of the buffer pool is to use for KEEP pool.

No matter how is this implemented, we need some table (INFORMATION_SCHEMA.TABLES or some other) where we can find out if the table is "pinned" to memory or specific buffer pool.

See also https://bugs.mysql.com/bug.php?id=68544 for a similar feature request.

Attachments

Issue Links

relates to

MDEV-11378 AliSQL: [Perf] Issue#23 MERGE INNODB AIO REQUEST

Open

MDEV-32067 InnoDB linear read ahead had better be logical

Confirmed

Activity

Ascending order - Click to sort in descending order

View 4 older comments

Marko Mäkelä added a comment - 2023-10-16 13:27

Another idea: Could the desired effect be achieved by having a SET SESSION parameter that would specify an interface like nice(1) or ionice(1) for a rarely executed low-priority query that would tend to pollute the buffer pool? The impact of that would be that the “recently used” status of pages accessed would not change, and any pages that had to be loaded into the buffer pool would count as “not accessed” and not trigger any read-ahead. This could of course be implemented independently of the “pinning” feature.

Marko Mäkelä added a comment - 2023-10-16 13:27 Another idea: Could the desired effect be achieved by having a SET SESSION parameter that would specify an interface like nice(1) or ionice(1) for a rarely executed low-priority query that would tend to pollute the buffer pool? The impact of that would be that the “recently used” status of pages accessed would not change, and any pages that had to be loaded into the buffer pool would count as “not accessed” and not trigger any read-ahead. This could of course be implemented independently of the “pinning” feature.

Larry Adams added a comment - 2023-10-16 21:59 - edited

Certainly status reporting would be important like total size, percent utilized and percent used by each table, maybe an info schema table for pinned table details. Then a simple way to dynamically add or remove tables from the pinned pool.

Larry Adams added a comment - 2023-10-16 21:59 - edited Certainly status reporting would be important like total size, percent utilized and percent used by each table, maybe an info schema table for pinned table details. Then a simple way to dynamically add or remove tables from the pinned pool.

Larry Adams added a comment - 2023-10-16 22:02

Movement to and from the pinned pool should be dynamic and in background as a result of an alter (assuming it's an alter).

Larry Adams added a comment - 2023-10-16 22:02 Movement to and from the pinned pool should be dynamic and in background as a result of an alter (assuming it's an alter).

markus makela added a comment - 2024-08-06 04:58

This would also make it possible for proxies like MaxScale to provide a QoS type of feature where some queries are prioritized lower than others by prepending something like SET STATEMENT innodb_skip_lru=ON to the SQL string. If MDEV-33724 was to be implemeted that doesn't add too much overhead, I'd imagine that queries which "pollute" the buffer pool could be automatically detected and then later on prepended with the SQL that skips putting them at the head of the LRU list.

markus makela added a comment - 2024-08-06 04:58 This would also make it possible for proxies like MaxScale to provide a QoS type of feature where some queries are prioritized lower than others by prepending something like SET STATEMENT innodb_skip_lru=ON to the SQL string. If MDEV-33724 was to be implemeted that doesn't add too much overhead, I'd imagine that queries which "pollute" the buffer pool could be automatically detected and then later on prepended with the SQL that skips putting them at the head of the LRU list.

Marko Mäkelä added a comment - 2025-04-17 10:51 - edited

One more idea that I just touched in a discussion with monty is that when executing a low-priority table scan, we might skip any buffer pool allocation and simply read the data pages into a dedicated buffer that would be reused over and over, without involving any buf_pool.mutex. I think that we must still consult buf_pool.page_hash to see if the page of interest resides in the buffer pool already. If not, we will be able to find the most recent data in the file system. This is also somewhat related to MDEV-11378 and MDEV-32067.

I think that anything that involves modifying data needs to keep using the normal buffer pool and buf_flush_page_cleaner() for the write-back.

Once again: prodiving a way to distinguish low-priority or bulk queries is an alternative way to improve the chances that high-priority smaller tables will remain in the buffer pool.

Marko Mäkelä added a comment - 2025-04-17 10:51 - edited One more idea that I just touched in a discussion with monty is that when executing a low-priority table scan, we might skip any buffer pool allocation and simply read the data pages into a dedicated buffer that would be reused over and over, without involving any buf_pool.mutex . I think that we must still consult buf_pool.page_hash to see if the page of interest resides in the buffer pool already. If not, we will be able to find the most recent data in the file system. This is also somewhat related to MDEV-11378 and MDEV-32067 . I think that anything that involves modifying data needs to keep using the normal buffer pool and buf_flush_page_cleaner() for the write-back. Once again: prodiving a way to distinguish low-priority or bulk queries is an alternative way to improve the chances that high-priority smaller tables will remain in the buffer pool.

MariaDB Server

Provide a way to "pin" InnoDB table in memory

Details

Description

Attachments

Issue Links

Activity

People

Dates

Git Integration