[MDEV-31492] Provide a way to "pin" InnoDB table in memory Created: 2023-06-17 Updated: 2023-11-30 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major |
| Reporter: | Valerii Kravchuk | Assignee: | Marko Mäkelä |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Description |
|
Some systems with a lot of data (data size on disk is notably larger than the amount of RAM available) need a way to make sure that specific table remains in the buffer pool even when some other, bigger nut less important tables are accessed at times. I see two ways to implement this kind of feature: 1. Provide a way to create several named buffer pools of different sizes and then specify in CREATE or ALTER TABLE what pool to use (with default unnamed one used by default). 2. Provide a way to "pin" table to memory somehow via DDL, for example with a clause like STORAGE (BUFFER_POOL_KEEP) in Oracle RDBMS and some way to define what share of the buffer pool is to use for KEEP pool. No matter how is this implemented, we need some table (INFORMATION_SCHEMA.TABLES or some other) where we can find out if the table is "pinned" to memory or specific buffer pool. See also https://bugs.mysql.com/bug.php?id=68544 for a similar feature request. |
| Comments |
| Comment by Larry Adams [ 2023-06-20 ] |
|
Just need a way to keep some tables from being able to be swapped out of memory once there. We have high update rates that cause the tables to be pushed from memory and by doing so, impacts performance. |
| Comment by Marko Mäkelä [ 2023-06-26 ] |
|
I believe that the named buffer pools should merely be logical names, and at the low level there should be only a single buffer pool. At the data structure level, the buf_pool.LRU list could be partly replaced by one list per named buffer pool, so that we can easily evict all tables of a named pool, or move them to the traditional buf_pool.LRU list. How would impossible constraints be dealt with?
It might be simplest to not completely prevent eviction of pages, but merely assign some priorities for eviction. Ideas for this would be welcome. |
| Comment by Larry Adams [ 2023-06-26 ] |
|
Certainly an 1114 type errors would not be wanted, but as with MAX_HEAP_TABLE_SIZE, there should be some way to measure how much (via show status and show variables), and a warning that goes into the MariaDB log or the SQL response to say that said pool is full and a table `ABC` had to be moved into the lower priority pool to maintain integrity would not be so bad. Or even a threshold being breached, say 80% full, generating warnings somewhere. As an aside, one of the things that I have loved about MariaDB/MySQL was the whole concept of the storage engine. When I met the PostgreSQL guys at Linux world years ago and asked why they couldn't simply use the ENGINE tag to create a memory table, they quickly turned to creating memory file partitions and placing the tables in that storage partition. Was so sad. Anyway, a decade and a half ago now (2007). I would hope that you could use a MariaDB'ism similar to ROW_FORMAT to bias the table towards the pinned pool or something. Just thinking out loud here... I would say just practice Keep It Simple Silly (KISS) adage, but I understand that it may not be in the end. |
| Comment by Marko Mäkelä [ 2023-10-16 ] |
|
Right, we would probably not want any errors like ER_RECORD_FILE_FULL (1114) to be returned. Would the following work?
The InnoDB implementation might redefine buf_page_t::state() so that pinned pages can be identified. I do not have a good idea for an efficient data structure or algorithm for keeping track of the number of share of pinned pages. A worst-case-inefficient algorithm could be to run a second scan of buf_pool.LRU when all eligible victims in the first scan are pinned. |
| Comment by Marko Mäkelä [ 2023-10-16 ] |
|
Another idea: Could the desired effect be achieved by having a SET SESSION parameter that would specify an interface like nice(1) or ionice(1) for a rarely executed low-priority query that would tend to pollute the buffer pool? The impact of that would be that the “recently used” status of pages accessed would not change, and any pages that had to be loaded into the buffer pool would count as “not accessed” and not trigger any read-ahead. This could of course be implemented independently of the “pinning” feature. |
| Comment by Larry Adams [ 2023-10-16 ] |
|
Certainly status reporting would be important like total size, percent utilized and percent used by each table, maybe an info schema table for pinned table details. Then a simple way to dynamically add or remove tables from the pinned pool. |
| Comment by Larry Adams [ 2023-10-16 ] |
|
Movement to and from the pinned pool should be dynamic and in background as a result of an alter (assuming it's an alter). |