[MDEV-31956] SSD based InnoDB buffer pool extension Created: 2023-02-01 Updated: 2023-12-24 |
|
| Status: | Stalled |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Fix Version/s: | 11.5 |
| Type: | Task | Priority: | Critical |
| Reporter: | Marko Mäkelä | Assignee: | Vladislav Lesin |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | performance | ||
| Issue Links: |
|
||||||||
| Description |
|
In one practical cloud MariaDB setup, a server node accesses its datadir over the network, but also has fast local SSD storage for temporary data. The contents of this temporary storage are lost when the server container is destroyed. It could make sense to use this ephemeral fast local storage (SSD) as an extension of the portion of the InnoDB buffer pool (DRAM) that caches persistent data pages. This cache would be separate from the persistent storage that holds the data files and ib_logfile0. Such a local cache would avoid page reads and writes on slow network or HDD storage. On the persistent store, only ib_logfile0 would be written until a write-back is needed for other reasons. We can simply treat the combination of the buffer pool and the local disk as one unit. Persistent pages could be freely moved between the local disk and the buffer pool. We must keep track of which pages in this combined "virtual buffer pool" are dirty, that is, which pages will have to be written back to the persistent storage. Any durability guarantees would apply only to the persistent storage. |
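The "virtual buffer pool" idea described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not InnoDB code: the class `TieredBufferPool`, its method names, the LRU policy for both tiers, and the use of plain dictionaries for the SSD and the network storage are all hypothetical, and the redo log (ib_logfile0) is left out entirely. It shows pages moving freely between the DRAM tier and the SSD tier, with one dirty set covering both, and a write to the persistent store happening only when a dirty page leaves the combined pool or a checkpoint is requested.

```python
from collections import OrderedDict


class TieredBufferPool:
    """Toy model: a DRAM buffer pool extended by an ephemeral local SSD cache."""

    def __init__(self, dram_capacity, ssd_capacity, persistent):
        self.dram = OrderedDict()     # page_id -> bytes, least recently used first
        self.ssd = OrderedDict()      # page_id -> bytes, stands in for the local SSD
        self.dirty = set()            # pages newer than their persistent copy
        self.dram_capacity = dram_capacity
        self.ssd_capacity = ssd_capacity
        self.persistent = persistent  # dict standing in for slow network storage
        self.persistent_reads = 0
        self.persistent_writes = 0

    def _write_back(self, page_id, data):
        # The only place where a page reaches the persistent store.
        self.persistent[page_id] = data
        self.persistent_writes += 1
        self.dirty.discard(page_id)

    def _demote(self):
        # Move the least recently used DRAM page down to the SSD tier.
        page_id, data = self.dram.popitem(last=False)
        self.ssd[page_id] = data
        self.ssd.move_to_end(page_id)
        if len(self.ssd) > self.ssd_capacity:
            victim, victim_data = self.ssd.popitem(last=False)
            if victim in self.dirty:
                # A dirty page is leaving the combined pool: write it back.
                self._write_back(victim, victim_data)

    def _install(self, page_id, data):
        self.dram[page_id] = data
        self.dram.move_to_end(page_id)
        while len(self.dram) > self.dram_capacity:
            self._demote()

    def read(self, page_id):
        if page_id in self.dram:
            self.dram.move_to_end(page_id)
            return self.dram[page_id]
        if page_id in self.ssd:
            data = self.ssd.pop(page_id)      # promoted without a network read
        else:
            data = self.persistent[page_id]   # slow path
            self.persistent_reads += 1
        self._install(page_id, data)
        return data

    def write(self, page_id, data):
        self.ssd.pop(page_id, None)           # drop any stale SSD copy
        self.dirty.add(page_id)
        self._install(page_id, data)

    def checkpoint(self):
        # Write back every dirty page, wherever in the combined pool it lives.
        for page_id in sorted(self.dirty):
            data = self.dram[page_id] if page_id in self.dram else self.ssd[page_id]
            self._write_back(page_id, data)
```

In this model a modified page can be demoted to the SSD tier and promoted back without touching the persistent store. Because the SSD contents are ephemeral, a real design would still have to rely on the redo log on persistent storage for durability, as the description states.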
| Comments |
| Comment by Max Mether [ 2023-02-01 ] |
|
Wouldn't the benefit of this local SSD only be to enlarge the size of the buffer pool? Another idea would be to have the REDO log on local SSD storage for fast writes, but keep the actual data files on remote storage? |
| Comment by Jags (Inactive) [ 2023-02-01 ] |
|
maxmether, you still cannot treat the local SSD as persistent storage in this "cloud" scenario, where containers are independently moved or stopped. All log writes need to be flushed to the network storage device. |
| Comment by Marko Mäkelä [ 2023-02-01 ] |
|
Right, it is the write-ahead log that determines what is durable. If we archived the entire log from database creation to the current time, the data files could be reconstructed as of any past point in time. |
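The point about reconstructing data files from an archived log can be sketched as follows. This is a deliberately simplified illustration, not the InnoDB redo log format: the function `rebuild`, the record layout `(lsn, page_id, data)`, and the assumption that each record carries a full page image in LSN order are all hypothetical.

```python
def rebuild(log, up_to_lsn):
    """Replay log records with lsn <= up_to_lsn onto initially empty data files.

    `log` is assumed to be a list of (lsn, page_id, data) tuples sorted by lsn,
    where `data` is the full page image after the change.
    """
    pages = {}
    for lsn, page_id, data in log:
        if lsn > up_to_lsn:
            break                 # stop at the requested point in time
        pages[page_id] = data     # later records overwrite earlier ones
    return pages


archived_log = [(1, "a", b"1"), (2, "b", b"2"), (3, "a", b"3")]
state_at_lsn_2 = rebuild(archived_log, 2)
```

Choosing a different `up_to_lsn` yields the state of the pages as of that log position, which is why the log, rather than any cached copy of the pages, determines what is durable.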
| Comment by Max Mether [ 2023-02-01 ] |
|
My thinking was that by putting the WAL on the SSD we would significantly increase the write throughput (as the log entry of every commit has to be written to disk for the transaction to be durable). Of course, if we cannot guarantee that the disk is durable, then that doesn't work. |