[MDEV-6822] gcache.page files are not deleted Created: 2014-10-01  Updated: 2023-08-14  Resolved: 2015-03-12

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.0.12-galera
Fix Version/s: 10.0.17-galera

Type: Bug
Priority: Major
Reporter: Philipp Kraut
Assignee: Nirbhay Choubey (Inactive)
Resolution: Fixed
Votes: 2
Labels: galera
Environment:

Server: Dell 720xd, SSD Raid 10, 128 GB RAM, Gigabit link
OS: Debian 7
Nodes: Three
DB Version: MariaDB 10.0.12 Galera Bundle
wsrep provider: 25.3.5-wheezy(rXXXX)


Attachments: node1.cnf, node1.errors.log_8308.20141113_2.txt.gz, node1.log, node2.cnf, node2.errors.log_8308.20141113_2.txt.gz, node2.log, node3.cnf, node3.log
Issue Links:
Relates: relates to MDEV-31843 "gcache page files are not deleted" (Open)

 Description   

Hello all,

We are using MariaDB Galera Cluster 10.0.12 and have (again) run into the problem that many gcache.page files are created but never deleted.
I am wondering why we have gcache.page files at all, because we set gcache.size to 50G! In any case, this problem eats up all disk space within a few days.

Node 1:

  # ls -lh gcache*
    -rw------- 1 mysql mysql 256M Sep 24 08:00 gcache.page.000080

Node 2:

  # ls -lh gcache*
    -rw------- 1 mysql mysql 256M Sep 21 12:22 gcache.page.000000
    -rw------- 1 mysql mysql 256M Sep 21 13:25 gcache.page.000001
    -rw------- 1 mysql mysql 256M Sep 21 14:39 gcache.page.000002
    -rw------- 1 mysql mysql 256M Sep 21 15:55 gcache.page.000003
    -rw------- 1 mysql mysql 256M Sep 21 16:59 gcache.page.000004
    -rw------- 1 mysql mysql 256M Sep 21 18:03 gcache.page.000005
    -rw------- 1 mysql mysql 256M Sep 21 19:07 gcache.page.000006
    -rw------- 1 mysql mysql 256M Sep 21 20:03 gcache.page.000007
    -rw------- 1 mysql mysql 256M Sep 21 21:01 gcache.page.000008
    -rw------- 1 mysql mysql 256M Sep 21 22:04 gcache.page.000009
    -rw------- 1 mysql mysql 256M Sep 21 23:23 gcache.page.000010
    -rw------- 1 mysql mysql 256M Sep 22 00:43 gcache.page.000011
    -rw------- 1 mysql mysql 256M Sep 22 02:22 gcache.page.000012
    -rw------- 1 mysql mysql 256M Sep 22 04:19 gcache.page.000013
    -rw------- 1 mysql mysql 256M Sep 22 06:14 gcache.page.000014
    -rw------- 1 mysql mysql 256M Sep 22 07:29 gcache.page.000015
    -rw------- 1 mysql mysql 256M Sep 22 08:44 gcache.page.000016
    -rw------- 1 mysql mysql 256M Sep 22 09:55 gcache.page.000017
    -rw------- 1 mysql mysql 256M Sep 22 10:56 gcache.page.000018
    -rw------- 1 mysql mysql 256M Sep 22 11:57 gcache.page.000019
    -rw------- 1 mysql mysql 256M Sep 22 12:54 gcache.page.000020
    -rw------- 1 mysql mysql 256M Sep 22 13:56 gcache.page.000021
    -rw------- 1 mysql mysql 256M Sep 22 14:59 gcache.page.000022
    -rw------- 1 mysql mysql 256M Sep 22 16:02 gcache.page.000023
    -rw------- 1 mysql mysql 256M Sep 22 17:07 gcache.page.000024
    -rw------- 1 mysql mysql 256M Sep 22 18:04 gcache.page.000025
    -rw------- 1 mysql mysql 256M Sep 22 19:15 gcache.page.000026
    -rw------- 1 mysql mysql 256M Sep 22 20:26 gcache.page.000027
    -rw------- 1 mysql mysql 256M Sep 22 21:36 gcache.page.000028
    -rw------- 1 mysql mysql 256M Sep 22 22:51 gcache.page.000029
    -rw------- 1 mysql mysql 256M Sep 23 00:38 gcache.page.000030
    -rw------- 1 mysql mysql 256M Sep 23 02:32 gcache.page.000031
    -rw------- 1 mysql mysql 256M Sep 23 04:54 gcache.page.000032
    -rw------- 1 mysql mysql 256M Sep 23 06:52 gcache.page.000033
    -rw------- 1 mysql mysql 256M Sep 23 08:00 gcache.page.000034
    -rw------- 1 mysql mysql 256M Sep 23 09:14 gcache.page.000035
    -rw------- 1 mysql mysql 256M Sep 23 10:19 gcache.page.000036
    -rw------- 1 mysql mysql 256M Sep 23 11:29 gcache.page.000037
    -rw------- 1 mysql mysql 256M Sep 23 12:34 gcache.page.000038
    -rw------- 1 mysql mysql 256M Sep 23 13:43 gcache.page.000039
    -rw------- 1 mysql mysql 256M Sep 23 14:48 gcache.page.000040
    -rw------- 1 mysql mysql 256M Sep 23 15:49 gcache.page.000041
    -rw------- 1 mysql mysql 256M Sep 23 16:53 gcache.page.000042
    -rw------- 1 mysql mysql 256M Sep 23 17:56 gcache.page.000043
    -rw------- 1 mysql mysql 256M Sep 23 19:04 gcache.page.000044
    -rw------- 1 mysql mysql 256M Sep 23 20:08 gcache.page.000045
    -rw------- 1 mysql mysql 256M Sep 23 21:11 gcache.page.000046
    -rw------- 1 mysql mysql 256M Sep 23 22:26 gcache.page.000047
    -rw------- 1 mysql mysql 256M Sep 23 23:30 gcache.page.000048
    -rw------- 1 mysql mysql 256M Sep 24 00:53 gcache.page.000049
    -rw------- 1 mysql mysql 256M Sep 24 02:34 gcache.page.000050
    -rw------- 1 mysql mysql 256M Sep 24 04:29 gcache.page.000051
    -rw------- 1 mysql mysql 256M Sep 24 06:14 gcache.page.000052
    -rw------- 1 mysql mysql 256M Sep 24 07:28 gcache.page.000053
    -rw------- 1 mysql mysql 256M Sep 24 07:57 gcache.page.000054

Node 3:

  # ls -ls gcache*
    262148 -rw------- 1 mysql mysql 256M Sep 24 01:29 gcache.page.000000
    262120 -rw------- 1 mysql mysql 256M Sep 24 03:22 gcache.page.000001
    262148 -rw------- 1 mysql mysql 256M Sep 24 05:14 gcache.page.000002
    262148 -rw------- 1 mysql mysql 256M Sep 24 06:49 gcache.page.000003
    262148 -rw------- 1 mysql mysql 256M Sep 24 07:53 gcache.page.000004
    262148 -rw------- 1 mysql mysql 256M Sep 24 08:01 gcache.page.000005

All of this started suddenly on Sep 21. We did not change anything that day, and there is nothing unusual in the error logs (attached).
This happened once before; we "fixed" the problem by stopping the servers one after another and deleting the gcache.page files by hand. That worked for about two weeks, and now the problem is back.
Even more confusing: in contrast to Nodes 2 and 3, Node 1 creates gcache.page files (for whatever reason) but deletes each one again after filling it up.

All of this happens even during periods of very low load (at night).
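
To track the growth shown in the listings above, something like the following could be run periodically on each node. This is only a minimal sketch: the function name gcache_page_usage is made up for illustration, and the data directory passed to it is an assumption that depends on the local setup. It prints the number of gcache.page.* files and their total apparent size in bytes.

```shell
#!/bin/sh
# Hypothetical helper (not part of MariaDB or Galera): count the
# gcache.page.* files in a directory and sum their sizes in bytes.
gcache_page_usage() {
    dir="$1"
    count=0
    bytes=0
    for f in "$dir"/gcache.page.*; do
        # When no page files exist the pattern stays unexpanded; skip it.
        [ -f "$f" ] || continue
        # wc -c reports the apparent file size, not the allocated blocks.
        size=$(wc -c < "$f")
        count=$((count + 1))
        bytes=$((bytes + size))
    done
    echo "$count $bytes"
}
```

For example, `gcache_page_usage /var/lib/mysql` (the path is an assumption) prints two numbers: the file count and the byte total. A steadily rising count on one node but not on the others would match the pattern reported here.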

I have also attached the my.cnf we use.
I reported this problem to Codership too, but I have not received a response so far (https://groups.google.com/forum/?hl=de#!topic/codership-team/1OKXmpzfmwc).

I would be very thankful for any help regarding this issue.

Thank you and best wishes
Philipp



 Comments   
Comment by Philipp Kraut [ 2014-10-14 ]

For your information, when stopping MariaDB I get this error message:
141014 6:23:36 [ERROR] WSREP: Could not delete 373 page files: some buffers are still "mmapped".
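
A small sketch for pulling the page-file count out of error-log lines like the one above, so it can be compared across shutdowns. The function name stuck_page_counts is hypothetical, and the pattern assumes the message wording stays exactly as quoted here.

```shell
#!/bin/sh
# Hypothetical helper: read error-log text on stdin and print the number
# of page files from every "Could not delete N page files" message.
stuck_page_counts() {
    sed -n 's/.*WSREP: Could not delete \([0-9][0-9]*\) page files.*/\1/p'
}
```

Usage would be along the lines of `stuck_page_counts < node1.log`, where the log file name is taken from the attachments on this issue. Lines that do not contain the message are ignored.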

Comment by Nirbhay Choubey (Inactive) [ 2014-10-21 ]

p.kraut, can you share the full server log for all the nodes?

Comment by Philipp Kraut [ 2014-10-22 ]

Sure, no problem. I have attached the full logs and the current my.cnf files.
In the meantime we bootstrapped the cluster again and removed the gcache.page files. The problem occurred again on all the nodes, as you can see in the logs.
Please note:

  • We used node1 to bootstrap the cluster. node1 is also donor for node2 and node3.
  • We have a lot of deadlocks on node1 because we run some (sometimes heavy) maintenance on that node. We always had these scripts running and we never had a problem until now.
  • We have reduced gcache.size to 5 GB since I reported this issue (with no effect).

Comment by Stoykov (Inactive) [ 2014-11-14 ]

I am reporting the same issue with MariaDB Server, wsrep_25.10.r4014, Galera 25.3.5(r178). The cluster has two real nodes and one garbd instance; writes go to the first node only.
Platform: Red Hat Enterprise Linux

We noticed that node 2 held more data on disk than node 1.

After some investigation, we found that the difference was due to the gcache.

It seems that the page files could no longer be deleted on the first node as of yesterday, and on the second node as of today.
 
NODE 2:
home/databases/mysql/mysql-1/data]# ll |grep gcache.page |wc -l
75
ll -rth |grep gcache.page |tail -n2
-rw------- 1 mysql-1 mysql-1 128M Nov 13 14:34 gcache.page.000073
-rw------- 1 mysql-1 mysql-1 128M Nov 13 14:35 gcache.page.000074
 
NODE 1:
:/home/databases/mysql/mysql-1/data]# ll -rth |grep gcache.page |wc -l
37
:/home/databases/mysql/mysql-1/data]# ll -rth |grep gcache.page |tail -n2
-rw------- 1 mysql-1 mysql-1 128M Nov 12 16:15 gcache.page.000035
-rw------- 1 mysql-1 mysql-1 128M Nov 12 19:22 gcache.page.000036
 
We also found the following error on node 2 earlier, when we tried to stop MySQL:
41113 14:35:15 [ERROR] WSREP: Could not delete 75 page files: some buffers are still "mmapped".

Comment by Stoykov (Inactive) [ 2014-11-14 ]

Log files from the two nodes reported at #comment-65354

Comment by Bjoern Boschman [ 2015-02-12 ]

This does not seem to be specific to MariaDB.
The issue also occurs with percona-xtradb-cluster:

percona-xtrabackup 2.2.8-5059-1.wheezy
percona-xtradb-cluster-full-56 5.6.21-25.8-938.wheezy
percona-xtradb-cluster-galera-3 3.8.3390.wheezy

So it seems to be a common Galera-related issue.
It only happens on my "primary" write node.

Comment by Bjoern Boschman [ 2015-02-12 ]

By the way, has anyone tried adjusting gcache.keep_pages_size, as suggested here:
http://www.percona.com/forums/questions-discussions/percona-xtradb-cluster/18776-gcache-page-some_number-full-disk-problems

Comment by Philipp Kraut [ 2015-02-12 ]

Hello Bjoern,

I did not know about the gcache.keep_pages_size option, and it is not set in our config, but "show variables like 'wsrep_provider_options';" reveals that it is set to 0 by default.
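
Since wsrep_provider_options comes back as one long semicolon-separated string, a filter like the following might make the gcache settings easier to read. This is only a sketch: the function name gcache_options is made up, and it assumes the usual "key = value; key = value" layout of that string.

```shell
#!/bin/sh
# Hypothetical helper: read a wsrep_provider_options string on stdin and
# print only the gcache.* settings, one "key = value" pair per line.
gcache_options() {
    tr ';' '\n' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
        grep '^gcache\.'
}
```

One way to feed it, assuming the mysql client can connect with the local credentials, would be `mysql -N -B -e "SELECT @@wsrep_provider_options" | gcache_options`.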

We had to restart the complete cluster this morning because we ran out of space again.

Regards,
Philipp

Comment by Nirbhay Choubey (Inactive) [ 2015-03-12 ]

https://github.com/codership/galera/issues/229

Comment by Philipp Kraut [ 2015-05-13 ]

I can confirm that this issue is fixed in 10.0.17 (wsrep 3.9).

Comment by COUNOTTE CEDRIC [ 2022-08-01 ]

I can confirm that the same symptoms occur on 10.6.7!

485GB of gcache.page files are present on one node, 128MB on the others!

Comment by Roel van Meer [ 2023-06-26 ]

Although this is an old issue, we are experiencing the same symptoms on Debian 11 with mariadb-server 10.5.19-0+deb11u2 and galera-4 26.4.11-0+deb11u1, on the first node of a 3-node cluster.

Comment by Khai Ping [ 2023-08-04 ]

Can we reopen this issue? It seems that gcache.page files are still not being deleted. I created a new Jira issue here: https://jira.mariadb.org/browse/MDEV-31843
