[MDEV-31956] SSD based InnoDB buffer pool extension Created: 2023-02-01  Updated: 2023-12-24

Status: Stalled
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Fix Version/s: 11.5

Type: Task
Priority: Critical
Reporter: Marko Mäkelä
Assignee: Vladislav Lesin
Resolution: Unresolved
Votes: 0
Labels: performance

Issue Links:
Relates
relates to MDEV-26055 Adaptive flushing is still not gettin... Closed

 Description   

In one practical cloud deployment of MariaDB, a server node accesses its datadir over the network, but also has fast local SSD storage for temporary data. The contents of this temporary storage are lost when the server container is destroyed.

It could make sense to use this ephemeral fast local storage (SSD) as an extension of the portion of the InnoDB buffer pool (DRAM) that caches persistent data pages. This cache would be separate from the persistent storage of the data files and ib_logfile0.

Such a local cache would avoid page reads and writes on slow network or HDD storage. On the persistent store, only ib_logfile0 would be written, until a write-back is needed for other reasons.

We can simply treat the combination of the buffer pool and the local disk as one unit. Persistent pages could be freely moved between the local disk and the buffer pool. We must keep track of which pages in this combined "virtual buffer pool" are dirty, that is, will have to be written back to the persistent storage. Any durability guarantees would only apply to the persistent storage.
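
As a rough illustration of the bookkeeping described above, here is a minimal Python sketch. It is not InnoDB code: the class name VirtualBufferPool, its methods, and the use of plain dictionaries to stand in for DRAM, the local SSD and the persistent store are all assumptions made for the example. It also assumes a DRAM capacity of at least one page and that every accessed page already exists in the persistent store.

```python
from collections import OrderedDict


class VirtualBufferPool:
    """Toy model: a small DRAM tier backed by a larger local-SSD tier.

    All names here are hypothetical; dictionaries stand in for real storage.
    """

    def __init__(self, dram_capacity, persistent):
        self.dram_capacity = dram_capacity  # assumed to be >= 1
        self.dram = OrderedDict()     # page_id -> data, kept in LRU order
        self.ssd = {}                 # page_id -> data (ephemeral local disk)
        self.dirty = set()            # pages newer than their persistent copy
        self.persistent = persistent  # page_id -> data (slow, durable store)

    def _load(self, page_id):
        """Bring a page into DRAM, preferring the local SSD over the slow store."""
        if page_id in self.dram:
            self.dram.move_to_end(page_id)
            return
        if page_id in self.ssd:
            data = self.ssd.pop(page_id)
        else:
            data = self.persistent[page_id]
        self.dram[page_id] = data
        while len(self.dram) > self.dram_capacity:
            # Evict the least recently used page to SSD; its dirty flag is kept.
            victim, victim_data = self.dram.popitem(last=False)
            self.ssd[victim] = victim_data

    def read(self, page_id):
        self._load(page_id)
        return self.dram[page_id]

    def write(self, page_id, data):
        self._load(page_id)
        self.dram[page_id] = data
        self.dirty.add(page_id)

    def write_back(self):
        """Copy every dirty page to the persistent store, wherever it lives."""
        for page_id in sorted(self.dirty):
            tier = self.dram if page_id in self.dram else self.ssd
            self.persistent[page_id] = tier[page_id]
        self.dirty.clear()
```

The point of the sketch is that the dirty set is keyed by page and is independent of the tier that currently holds the page, so moving a page between DRAM and the local disk never requires a write to the persistent store.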



 Comments   
Comment by Max Mether [ 2023-02-01 ]

Wouldn't the benefit of this local SSD only be to enlarge the size of the buffer pool?

Another idea would be to have the REDO log on local SSD storage for fast writes but the actual data files on remote storage... ?

Comment by Jags (Inactive) [ 2023-02-01 ]

maxmether, you still cannot treat the local SSD as persistent storage in this "cloud" scenario, where containers are independently moved or stopped. All log writes need to be flushed to the network storage device.

Comment by Marko Mäkelä [ 2023-02-01 ]

Right, it is the write-ahead log that determines what is durable. If we archived all of the log from database creation to the current time, the data files could be reconstructed for any past point in time.
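
To illustrate the idea of reconstructing data from an archived log, here is a deliberately simplified Python sketch. It assumes each log record is a hypothetical (lsn, page_id, data) tuple carrying a full page image; real InnoDB redo log records describe changes to pages rather than whole pages, so this only shows the principle of replaying records up to a chosen log sequence number (LSN).

```python
def reconstruct(log, up_to_lsn):
    """Replay archived log records in LSN order, up to and including up_to_lsn.

    `log` is an iterable of hypothetical (lsn, page_id, data) tuples.
    Returns a dict mapping page_id to the page contents at that point.
    """
    pages = {}
    for lsn, page_id, data in sorted(log, key=lambda record: record[0]):
        if lsn > up_to_lsn:
            break
        pages[page_id] = data  # a later record for the same page overwrites it
    return pages
```

For example, with records at LSN 10, 20 and 30, calling reconstruct(log, 20) would apply only the first two records and yield the state of the pages as of LSN 20.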

Comment by Max Mether [ 2023-02-01 ]

My thinking was that by putting the WAL on the SSD we would significantly increase the write throughput (as the log entry of every commit has to be written to disk for the transaction to be durable). Of course, if we cannot guarantee that the disk is durable, then that doesn't work.

Generated at Thu Feb 08 10:27:44 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.