[MDEV-20049] Create parameter to limit how many Page Store files can be created Created: 2019-07-11  Updated: 2023-04-06  Resolved: 2023-03-22

Status: Closed
Project: MariaDB Server
Component/s: Galera, Galera SST, wsrep
Fix Version/s: N/A

Type: Task Priority: Minor
Reporter: Manjot Singh (Inactive) Assignee: Manjot Singh (Inactive)
Resolution: Incomplete Votes: 1
Labels: None

Issue Links:
Relates

 Description   

Galera has two types of on-disk files for managing write-sets: a ring-buffer file and an on-demand page store. The ring buffer's size is controlled by gcache.size, but when a transaction's write-set is too large to fit in the ring buffer, independent page files (sized by gcache.page_size, default 128M) are allocated to cache the write-sets.
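As a concrete illustration (the values below are examples, not recommendations), both sizes are set through wsrep_provider_options in the server configuration; pages created beyond the ring buffer appear in the data directory as files named gcache.page.NNNNNN:

```ini
# my.cnf -- example values only
[mysqld]
wsrep_provider_options="gcache.size=1G;gcache.page_size=128M;gcache.keep_pages_size=0"
```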

An unusually large accumulation of on-demand page files can occur when large transactions (for example, from a data import or batch job) run on the cluster at the time of an SST, or when joiners are slow to process transactions.

When that happens, the disk fills up with gcache pages and the SST hangs indefinitely until someone intervenes manually.
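Until such a limit exists, the disk usage of the page store can at least be watched externally. A minimal sketch of a hypothetical monitoring helper (not part of Galera; it assumes the default page-file naming, gcache.page.NNNNNN, in the data directory):

```python
import glob
import os


def gcache_pages_bytes(datadir):
    """Sum the sizes of Galera on-demand page files (gcache.page.*) in datadir."""
    total = 0
    for path in glob.glob(os.path.join(datadir, "gcache.page.*")):
        try:
            total += os.path.getsize(path)
        except OSError:
            pass  # a page may be deleted between listing and stat
    return total


if __name__ == "__main__":
    # /var/lib/mysql is the conventional datadir; adjust for your installation.
    used = gcache_pages_bytes("/var/lib/mysql")
    print(f"gcache pages: {used / 2**20:.1f} MiB")
```

A cron job or monitoring agent could alert when this value crosses a threshold, which is effectively a manual version of the limit this task requests.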

A new variable that imposes a global limit on the total size of page files written to disk would prevent the disk from filling up in this case.



 Comments   
Comment by Geoff Montee (Inactive) [ 2019-07-11 ]

The default value of gcache.keep_pages_size is 0. This means that pages are deleted immediately after they are no longer needed. This seems like a great default value.

In the case that you mentioned, none of the write sets in the pages could be applied, since the node was still receiving its SST. Therefore, none of the pages could be deleted. This is why they kept getting created. I don't think changing gcache.keep_pages_size would help at all. If this value were increased, then it would just make Galera keep pages around after the write sets have been applied. That doesn't sound like it would help in the case that you mentioned.

For that specific case, it would probably make more sense to increase the value of gcache.size.

I've created an upstream FR for that here:

https://github.com/codership/galera/issues/543

Comment by Geoff Montee (Inactive) [ 2019-07-11 ]

One thing that can save the disk from filling up in this case is to have a new variable that creates a global limit on size that can be written.

The upstream FR for this is here: https://github.com/codership/galera/issues/544

Comment by Alexey [ 2020-11-04 ]

gcache.keep_pages_size is not a solution here; it only controls how much of the cache to keep in pages. For example, if you use gcache encryption, the entire cache lives in pages, and gcache.keep_pages_size then plays the same role as gcache.size.

To avoid filling the disk, I can see two options:
1. gcs.recv_queue_hard_limit - it limits the total size of the slave queue, which is usually a good approximation of how much disk space GCache uses. With it, any positive value of gcs.max_throttle will cause the node to abort when the limit is exceeded; a value of 0.0 will instead stop replication until the slave queue shrinks.

2. Create a dedicated GCache variable that would limit the total size occupied on disk. But then the question is what the node should do if that size is exceeded?
a. Abort right away
b. Pause replication until GCache volume decreases
c. Ignore all subsequent write-sets, but keep the current queue, attempt to apply it, and abort afterwards

Notice that anything but pausing replication cluster-wide will not guarantee that a node will ever be able to join the cluster.
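Option 1 above can be expressed in the server configuration today. A hedged sketch (the limit and throttle values are illustrative examples, not tuning advice):

```ini
# my.cnf -- illustrative values, tune for your workload
[mysqld]
# Cap the slave (recv) queue; with gcs.max_throttle=0.0 the node pauses
# replication when the limit is reached instead of aborting, which matches
# behavior (b) discussed above.
wsrep_provider_options="gcs.recv_queue_hard_limit=8G;gcs.max_throttle=0.0"
```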

Comment by Manjot Singh (Inactive) [ 2020-11-04 ]

1. Good suggestions, but they don't fully solve the problem as a hard limit would.

2. Definitely (b), to match other Galera cluster behaviors.

Comment by Manjot Singh (Inactive) [ 2022-01-05 ]

Yes, this is a good idea, Yurchenko, ralf.gebhardt@mariadb.com.

Generated at Thu Feb 08 08:56:20 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.