MariaDB Server
MDEV-24676

Concurrent multi-reader, multi-writer buffer for IO_CACHE

Details

    Description

      IO_CACHE has essentially three read/write modes: read-only, write-only, and a sequential read/write FIFO mode, SEQ_READ_APPEND.

      Some performance-sensitive places, such as the replication slave thread, use SEQ_READ_APPEND, and it can become a bottleneck, since reads and writes are sequential and co-sequential (i.e. reads and writes block each other).

      The task is to implement a non-blocking mode for the multi-reader, multi-writer use case through a concurrent ring buffer implementation.

      Possible approaches

      Lock-free n-consumer, m-producer ring buffer

      This implementation requires limiting the number of simultaneous accessors and reserving slots for them.
      Lock-free implementations may contain busy waits, but no locks, except when the number of consumers or producers is exceeded; this can be controlled by a semaphore with a capacity equal to the number of cores.
      This is the ideal approach, but it may be overkill because of the complicated busy loops and slot management.
      It is also hard because a single write can be bigger than the buffer. See buffer excess below.
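      The slot-reservation idea above can be sketched with a Vyukov-style bounded MPMC queue. This is not the IO_CACHE code; it is a minimal illustration of how producers and consumers reserve a slot with a CAS on the global position and then perform the copy without any lock, with per-slot sequence numbers telling them whether a slot is free or filled.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Vyukov-style bounded MPMC queue: each slot carries a sequence number that
// says whether the slot is free (seq == pos) or filled (seq == pos + 1).
// A CAS on head_/tail_ reserves the slot; the copy runs outside any lock.
template <typename T>
class BoundedMpmcQueue {
  struct Slot { std::atomic<size_t> seq; T value; };
  std::vector<Slot> slots_;
  size_t mask_;
  std::atomic<size_t> head_{0}, tail_{0};
public:
  explicit BoundedMpmcQueue(size_t capacity_pow2)   // capacity must be 2^k
      : slots_(capacity_pow2), mask_(capacity_pow2 - 1) {
    for (size_t i = 0; i < capacity_pow2; ++i) slots_[i].seq.store(i);
  }
  bool try_push(const T &v) {
    size_t pos = tail_.load(std::memory_order_relaxed);
    for (;;) {
      Slot &s = slots_[pos & mask_];
      intptr_t diff = (intptr_t)s.seq.load(std::memory_order_acquire)
                      - (intptr_t)pos;
      if (diff == 0) {                       // slot free: try to reserve it
        if (tail_.compare_exchange_weak(pos, pos + 1,
                                        std::memory_order_relaxed)) {
          s.value = v;                       // copy outside any lock
          s.seq.store(pos + 1, std::memory_order_release);
          return true;
        }                                    // CAS failure updated pos; retry
      } else if (diff < 0) {
        return false;                        // queue full
      } else {
        pos = tail_.load(std::memory_order_relaxed);
      }
    }
  }
  bool try_pop(T &out) {
    size_t pos = head_.load(std::memory_order_relaxed);
    for (;;) {
      Slot &s = slots_[pos & mask_];
      intptr_t diff = (intptr_t)s.seq.load(std::memory_order_acquire)
                      - (intptr_t)(pos + 1);
      if (diff == 0) {                       // slot filled: try to claim it
        if (head_.compare_exchange_weak(pos, pos + 1,
                                        std::memory_order_relaxed)) {
          out = s.value;
          s.seq.store(pos + mask_ + 1, std::memory_order_release);
          return true;
        }
      } else if (diff < 0) {
        return false;                        // queue empty
      } else {
        pos = head_.load(std::memory_order_relaxed);
      }
    }
  }
};
```

      Note how the "limiting the number of accessors" concern shows up even here: the queue only works because a slot, once reserved, belongs to exactly one thread until its sequence number is published.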

      Simple rwlock-based non-blocking approach

      The bottleneck essentially occurs because SEQ_READ_APPEND holds its lock for the whole duration of the buffer copy.
      We can avoid that by moving the pointers first, thereby reserving a region for the copy, and then copying from/to the buffer without holding the lock.
      An rwlock will be used to access the pointers, i.e. readers access IO_CACHE::end_of_file under the read lock to check the boundaries, while writers access it under the write lock.

      Buffer excess

      Buffer excess makes operation sequential again.
      When the buffer is full, a separate write buffer is created. When the write buffer is full, a flush happens.
      A flush first waits for all writers to finish, then locks the write buffer for flushing.
      The read buffer can be flushed in a more relaxed way: there is no need to lock for the flush itself, but we do have to lock for buffer allocation and wait for all writers.
      Waiting for writers can be done with another rwlock.
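      The "wait for all writers with another rwlock" trick can be illustrated as follows (hypothetical names, not the real implementation): each writer holds the rwlock in shared mode for the duration of its write, so a flusher that acquires it exclusively blocks until every in-flight write has finished.

```cpp
#include <mutex>
#include <shared_mutex>
#include <string>
#include <vector>

// Writers hold writers_lock_ shared while "in flight"; the flusher takes it
// exclusively, which waits for all current writers before flushing.
class FlushGate {
  std::shared_mutex writers_lock_;
  std::mutex pending_mtx_;
  std::vector<std::string> pending_;   // stands in for the write buffer
  std::string file_;                   // stands in for the backing file
public:
  void write(const std::string &chunk) {
    std::shared_lock<std::shared_mutex> w(writers_lock_);  // writer in flight
    std::lock_guard<std::mutex> g(pending_mtx_);
    pending_.push_back(chunk);
  }
  void flush() {
    std::unique_lock<std::shared_mutex> x(writers_lock_);  // waits for writers
    for (const std::string &c : pending_) file_ += c;      // "write to file"
    pending_.clear();
  }
  const std::string &contents() const { return file_; }
};
```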

      Single-readerness

      The real-world cases are mostly single-consumer, and that is essential for IO_CACHE: its data is variable-length and has no underlying record format, so the reader always has to make at least two sequential reads (one to read the size and another to read the body).

      Single-reader considerations can relax some conditions and ease the implementation.
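      The two dependent reads can be made concrete with a small sketch (hypothetical helper, not IO_CACHE code): the body read cannot start, and cannot be handed to another thread, until the size read has completed, which is why the consumer side is inherently sequential.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Length-prefixed records: the reader must first read a 4-byte size, then
// the body. The second read depends on the first, so reads are sequential.
struct RecordStream {
  std::vector<unsigned char> bytes;
  size_t pos = 0;

  bool read_exact(void *dst, size_t n) {
    if (pos + n > bytes.size()) return false;
    std::memcpy(dst, bytes.data() + pos, n);
    pos += n;
    return true;
  }

  // Two dependent reads: size first, body second.
  bool read_record(std::string &out) {
    uint32_t len;
    if (!read_exact(&len, sizeof len)) return false;
    out.resize(len);
    return len == 0 || read_exact(&out[0], len);
  }
};
```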

      io_cache_reserve api

      We can add a function that reserves space for writing, for the case of writing big objects (both those bigger than the write cache and those smaller than it but big enough not to fit into the external buffer), e.g. for copying one cache into another.

      The function should return a future-like object, since IO_CACHE has to be notified back when the writing is finished (to perform a flush, for example).
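      One possible shape for such an API, with invented names (the real io_cache_reserve signature is not defined yet): reserve() hands out a region plus a completion flag, and a flush may only advance past the prefix of reservations whose writers have signalled completion. The completion flag plays the role of the future-like object.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reserve-style API: reserve() returns a ticket; complete()
// resolves it; flushable_end() is how far a flush may safely proceed.
class ReservingCache {
  struct Region { size_t off, len; bool done; };
  std::vector<unsigned char> buf_;
  std::vector<Region> regions_;
  size_t reserved_ = 0;
public:
  explicit ReservingCache(size_t n) : buf_(n) {}

  // Returns the index of the reservation, or -1 if it does not fit.
  ptrdiff_t reserve(size_t len) {
    if (reserved_ + len > buf_.size()) return -1;
    regions_.push_back({reserved_, len, false});
    reserved_ += len;
    return (ptrdiff_t)regions_.size() - 1;
  }

  unsigned char *region_ptr(size_t idx) { return buf_.data() + regions_[idx].off; }
  void complete(size_t idx) { regions_[idx].done = true; }  // resolve the "future"

  // The prefix of completed reservations is safe to flush.
  size_t flushable_end() const {
    size_t end = 0;
    for (const Region &r : regions_) {
      if (!r.done) break;
      end = r.off + r.len;
    }
    return end;
  }
};
```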

Activity

            Rupeekshan Maheswaran added a comment:

            Hi, I'm Rupeekshan Maheswaran, currently pursuing a B.Sc. Honours degree in Computer Science at the University of Jaffna, Sri Lanka. I'm particularly interested in completing this issue as my GSoC 2021 project. I would like to request guidance regarding this issue. Thanks.

            Nikita Malyavin added a comment:

            Hello, Rupeekshan! Sorry to tell you, but this task is already taken by Vladislav Kakurin, another B.Sc. student. Hope we'll find another task you'll be interested in for this summer!

            Rupeekshan Maheswaran added a comment:

            Thank you so much for the response!

            Nikita Malyavin added a comment (edited):

            The 2021 GSoC has now finished.
            The latest commit as of this time is: https://github.com/MagHErmit/server/commit/c011f29ac9e835e50716242aab5bc89b952ee712
            Author: Vladislav Kakurin.

            We have reached the fine-grained Single-reader, Multiple-writers milestone. Multiple readers is still a TODO.

            One important design flaw was found by Vladislav: we cannot acquire _buffer_lock in _slot_release() while holding flush_rw_lock. Therefore, flush_rw_lock should be released before acquiring _buffer_lock.
            The meaning of flush_rw_lock becomes "controlling buffer copy access and file access", but not slot access control.

            With this change, two problems pop up:

            1. When the buffer is full and a flush happens, the buffer is invalidated, and new data can be written to addresses overlapping with slots that have not yet been released. This problem should be worked out carefully: most probably by introducing buffer versions that increase after each flush, or by more careful helping in this case, if possible (e.g. checking our own slot's finished flag, which would mean we can help with the next version's slots).

            2. Locking _buffer_lock in _slot_release() may result in waiting for the flush to end. This should also be handled somehow, for example by having the flusher unlock _buffer_lock earlier.

            Further optimizations can be made, as we see them now:

            • When flushing, reorganize the data so that it can be dumped to the file with a single write.
            • When reading, truncate the file when it's empty. Alternatively, use native pipes.
            • Reorganize the slots into a queue.
            • Use spinlock-based locks, which may be faster.
            • We can get rid of the semaphore in cases where the number of accessing threads is limited by the number of allocated slots, and therefore can never exceed it. In that case the semaphore can be moved to a trait.

            The API is still minimalistic. We should at least add size() and the ability to contiguously (atomically) write from some reentrant source (a generator in Python terms), such as another file, to make the solution work for us.


            People

              nikitamalyavin Nikita Malyavin
