[MDEV-29147] full text index causes memory leak Created: 2022-07-21  Updated: 2023-02-15  Resolved: 2023-01-03

Status: Closed
Project: MariaDB Server
Component/s: Galera, Storage Engine - InnoDB
Affects Version/s: 10.6.8
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: Richard Stracke Assignee: Denis Protivensky
Resolution: Cannot Reproduce Votes: 1
Labels: Memory_leak

Attachments: PNG File Mem_usage_1.png     PNG File Screenshot from 2022-09-16 14-12-09.png     File galera_mysqlslap.test     File mem.info     PNG File memory_usage.png    

 Description   

Complex queries with a fulltext index cause a memory leak over time.

With innodb_prefix_index_cluster_optimization=ON, the memory consumption/leak grows more slowly, so it may be related to MDEV-28730.



 Comments   
Comment by Elena Stepanova [ 2022-07-25 ]

We'll need something reproducible for this.

Comment by Richard Stracke [ 2022-08-05 ]

elenst

Finally I can reproduce it.

1. Create a galera cluster

2. Create a table with fulltext index

3. Fire INSERTs and a SELECT with ORDER BY on the fulltext-indexed column at the same time:

mysqlslap -uroot -psecret -h127.0.0.1 --delimiter=";" --create="CREATE TABLE t1 (v1 VARCHAR(1024),FULLTEXT INDEX f1 (v1) );" --query="SELECT * from t1  WHERE MATCH (v1) AGAINST ('\"Lybf7W\"' IN BOOLEAN MODE) order by v1 desc;INSERT INTO t1 VALUES (LEFT(MD5(RAND()),30));" --concurrency=100 --iterations=5000

Comment by Elena Stepanova [ 2022-08-05 ]

ramesh, could you please verify and assign as needed?

Comment by Ramesh Sivaraman [ 2022-08-06 ]

Reproduced the issue using the given test case.
Memory usage info during transaction.

vagrant@node1:~$ for i in `seq 1 100`; do
> ps -o pid,user,%mem,command ax | grep mariadb | grep -v grep
> sleep 5 
> done
1329323 mysql    55.0 /usr/sbin/mariadbd --wsrep-new-cluster --wsrep_start_position=18cd447b-1569-11ed-b7cd-629408725dcd:903346,7-8-903325
1329323 mysql    57.4 /usr/sbin/mariadbd --wsrep-new-cluster --wsrep_start_position=18cd447b-1569-11ed-b7cd-629408725dcd:903346,7-8-903325
[..]
1329323 mysql    64.1 /usr/sbin/mariadbd --wsrep-new-cluster --wsrep_start_position=18cd447b-1569-11ed-b7cd-629408725dcd:903346,7-8-903325
1329323 mysql    64.4 /usr/sbin/mariadbd --wsrep-new-cluster --wsrep_start_position=18cd447b-1569-11ed-b7cd-629408725dcd:903346,7-8-903325
[..]
1329323 mysql    81.7 /usr/sbin/mariadbd --wsrep-new-cluster --wsrep_start_position=18cd447b-1569-11ed-b7cd-629408725dcd:903346,7-8-903325
1329323 mysql    81.7 /usr/sbin/mariadbd --wsrep-new-cluster --wsrep_start_position=18cd447b-1569-11ed-b7cd-629408725dcd:903346,7-8-903325
vagrant@node1:~$ 
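
A slightly more targeted variant of the sampling loop above, reading VmRSS directly from /proc (a Linux-specific sketch; PID discovery via pidof is an assumption, adjust for your setup):

```shell
# Sample the resident set size of a process from /proc (Linux-specific).
sample_rss() {
    pid=$1
    awk '/^VmRSS:/ {print $2, $3}' "/proc/$pid/status"
}

# Demo on the current shell's own PID so the function is exercisable:
sample_rss $$

# On the server host one would loop instead, e.g.:
# while sleep 5; do sample_rss "$(pidof mariadbd)"; done
```

This avoids the `grep mariadb | grep -v grep` pattern and reports only the resident memory figure, which is the number that matters for a suspected leak.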

Comment by Jan Lindström (Inactive) [ 2022-09-13 ]

After a few rounds of mtr with the attached test case, I could not get any memory leak reports from Valgrind. Furthermore, based on code review, there should not be any special wsrep handling for FTS (except during SST). Therefore, if there is a memory leak, it should be repeatable without Galera.

Comment by Ramesh Sivaraman [ 2022-09-13 ]

jplindst When testing Galera I had not noticed the buffer_pool configuration on my local VM box; the increased memory usage was not due to INSERT/SELECT on FTS but due to the higher buffer_pool size. A non-Galera server also shows increased memory usage with a higher buffer_pool size. Richard, can you please check whether memory usage exceeds the buffer_pool size when running INSERT/SELECT on FTS on your local box?

Comment by Marko Mäkelä [ 2022-09-14 ]

jplindst, can you please provide a way to reproduce this bug without using Galera?

Comment by Marko Mäkelä [ 2022-09-14 ]

Side note: In MDEV-28540 (10.10), a bug in innodb_prefix_index_cluster_optimization (which was added in MDEV-6929) was fixed. That fix was not applied to earlier versions yet.

Comment by Marko Mäkelä [ 2022-09-14 ]

Richard, I removed the link to MDEV-28370 because that bug is about a leak of some data dictionary metadata on DDL crash recovery. The scenario of this bug does not appear to involve any recovery.

Comment by Marko Mäkelä [ 2022-09-14 ]

Richard, which type of server build did you use? In galera_mysqlslap.test there are no configuration parameters mentioned.

Sometimes memory leaks are confused with internal memory fragmentation in the allocator. I would suggest trying a different allocator, or configuring the allocator in the C runtime library. For GNU/Linux, see man mallopt, or try replacing the GNU libc malloc() with jemalloc or tcmalloc; at least the latter includes a heap profiler.
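
A hedged sketch of what such an allocator experiment might look like on a bare GNU/Linux host (the library path and server binary location are assumptions that vary by distribution):

```shell
# Assumed path to the jemalloc shared library; adjust for your distribution.
JEMALLOC_SO=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2

# Option 1: preload jemalloc for a single server run.
# LD_PRELOAD="$JEMALLOC_SO" /usr/sbin/mariadbd --user=mysql &

# Option 2: stay on glibc malloc but cap the arena count to reduce
# fragmentation (a glibc tunable; see man mallopt).
MALLOC_ARENA_MAX=2
export MALLOC_ARENA_MAX
# /usr/sbin/mariadbd --user=mysql &

echo "glibc arenas capped at: $MALLOC_ARENA_MAX"
```

If the apparent leak shrinks or disappears under a different allocator or with fewer arenas, fragmentation rather than a true leak is the likely culprit.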

I would also suggest trying this without Galera, to confirm whether the observed memory usage is Galera-related.
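
A minimal sketch of the reporter's workload replayed against a standalone (non-Galera) server; the host and credentials below are placeholder assumptions:

```shell
# The same table and query mix as in the original report, parameterized
# so the statements can be inspected before running.
CREATE_SQL="CREATE TABLE t1 (v1 VARCHAR(1024), FULLTEXT INDEX f1 (v1));"
QUERY_SQL="SELECT * FROM t1 WHERE MATCH (v1) AGAINST ('\"Lybf7W\"' IN BOOLEAN MODE) ORDER BY v1 DESC;INSERT INTO t1 VALUES (LEFT(MD5(RAND()),30));"

# Uncomment to run against a live non-Galera server:
# mysqlslap -uroot -psecret -h127.0.0.1 --delimiter=";" \
#   --create="$CREATE_SQL" --query="$QUERY_SQL" \
#   --concurrency=100 --iterations=5000

echo "query under test: $QUERY_SQL"
```

If memory grows the same way here, Galera can be ruled out as the cause.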

Comment by Daniel Black [ 2022-09-16 ]

The bitnami image doesn't include jemalloc; however, the Docker Library images do. So either:

  • Base an image off bitnami, RUN apt-get update && apt-get install libjemalloc2
  • Use Docker Library image with paths changed

Note that without bitnami symbols, symbol resolution might be hard: https://github.com/bitnami/containers/issues/6290

The general form for running jemalloc in containers is to use the LD_PRELOAD and MALLOC_CONF environment variables, since mysqld_safe isn't used:

$ podman run -ti --env MARIADB_ROOT_PASSWORD=maria2016 \
   -v /tmp/d:/docker-entrypoint-initdb.d:z --name m106 \
  --env LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 \
  --env MALLOC_CONF=prof:true,prof_leak:true,lg_prof_sample:19,prof_final:true,stats_print:true -d --rm quay.io/mariadb-foundation/mariadb-devel:10.6 --innodb-buffer-pool-size=10M
1749de638430baf620bee3c6fd200ed7b1d86a372611986b63954324d2b57c67

I'm not an expert in MALLOC_CONF; I just copied it from https://access.redhat.com/articles/6817071. A more detailed look could achieve less noisy results.

Comment by Denis Protivensky [ 2022-12-19 ]

Test setup: upstream 10.6 branch (dd5f4b3625d); ASAN-enabled build; default Galera settings; test scenario with mysqlslap.

Conclusion: the memory leak is not confirmed with Galera, although memory grows faster up to a certain point in time, and memory usage takes longer to stabilize compared to a non-Galera server run.

Explanation: the `galera.cache` ring-buffer cache file (128M by default) is mmap-ed into memory, and while it is being filled with data, more physical pages are allocated by the OS, resulting in notable memory growth. After some time, when the cache file has filled up, the server's memory usage stabilizes and its behavior becomes identical to that of a server running without Galera.
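
One way to check this explanation on a running node is to watch the resident share of the mmap-ed cache grow toward its configured size (the process name and pmap usage below are assumptions; 128M is the default gcache.size):

```shell
# Default gcache.size is 128M; the expected full mapping size in bytes:
GCACHE_BYTES=$((128 * 1024 * 1024))
echo "expected galera.cache mapping: $GCACHE_BYTES bytes"

# On a live Galera node, inspect the mapping's RSS as it fills up
# (requires permission to inspect the mariadbd process):
# pmap -x "$(pidof mariadbd)" | grep galera.cache
```

When the RSS of the galera.cache mapping stops growing at roughly that size, the "leak" should plateau, matching the behavior described above.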

Generated at Thu Feb 08 10:06:16 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.