[MDEV-466] Galera: Excessive memory use on sizable transactions Created: 2012-08-20  Updated: 2012-08-20  Resolved: 2012-08-20

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.5.25-galera

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Seppo Jaakola
Resolution: Not a Bug Votes: 0
Labels: None

Issue Links:
Relates

 Description   

I execute the following set of statements 3 times in a row:

  CREATE TABLE t1 ( a INT NOT NULL AUTO_INCREMENT PRIMARY KEY, b INT NOT NULL) ;
  INSERT INTO t1 (b) VALUES (1),(2),(3),(4),(5),(6),(7),(8);
  INSERT INTO t1 (b) SELECT t1a.b FROM t1 t1a, t1 t1b, t1 t1c, t1 t1d, t1 t1e, t1 t1f LIMIT 131064;
  SELECT COUNT(*) FROM t1;
  DROP TABLE t1;

wait for innodb to settle (for checkpoint age to become 0), and check memory consumption after each round.

Here are results for a server started in Galera context (as a new node):

Server options:

--innodb --default-storage-engine=innodb --innodb_autoinc_lock_mode=2 --innodb_locks_unsafe_for_binlog=1 --innodb_flush_log_at_trx_commit=0 --innodb_doublewrite=0 --binlog-format=row --log-bin=master-bin --wsrep_provider=/home/elenst/galera-23.2.1-src/libgalera_smm.so --wsrep_cluster_address=gcomm://?'gmcast.listen_addr=tcp://127.0.0.1:4567'

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
# Start
17380 elenst    20   0  863m 345m 9.8m S    0  4.3   0:00.47 mysqld
# After the 1st round
17380 elenst    20   0 1060m 603m  14m S    0  7.5   0:35.33 mysqld
# After the 2nd round
17380 elenst    20   0 1124m 648m  18m S    0  8.1   1:05.34 mysqld
# After the 3rd round
17380 elenst    20   0 1126m 659m  21m S    1  8.2   1:35.49 mysqld

So, we started and ended at the same point (dropped the table that we previously created and populated), however memory consumption has grown ~260 Mb

Nothing like that happens if the server is run in a standalone mode (with otherwise the same parameters):

Server options:

--innodb --default-storage-engine=innodb --innodb_autoinc_lock_mode=2 --innodb_locks_unsafe_for_binlog=1 --innodb_flush_log_at_trx_commit=0 --innodb_doublewrite=0 --binlog-format=row --log-bin=master-bin 

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
# Start
17336 elenst    20   0  611m 318m 7468 S    0  4.0   0:02.41 mysqld 
# After the 1st round
17336 elenst    20   0  615m 329m 9268 S    0  4.1   0:08.40 mysqld 
# After the 2nd round
17336 elenst    20   0  615m 334m 9316 S    0  4.2   0:14.28 mysqld
# After the 3rd round
17336 elenst    20   0  615m 340m 9316 S    0  4.3   0:20.05 mysqld

bzr version-info:

revision-id: seppo.jaakola@codership.com-20120808224721-89h8nxmfk9uvp708
date: 2012-08-09 01:47:21 +0300
build-date: 2012-08-20 07:00:22 +0400
revno: 3343



 Comments   
Comment by Seppo Jaakola [ 2012-08-20 ]

Galera maintains "certification index" to help in tracking dependencies between transactions. This dependency information makes it possible e.g. to enable parallel applying with row level granularity in slave side.
This test will cause several table and row level entries to be appended in the certification index, and this probably accounts for the observed memory image growth. Galera does periodical pruning for the index, but it will be delayed process and does not show here.
To test for memory leaks, it makes sense to run long term SQL load and monitor that memory image will saturate at certain level. Or alternatively run under valgrind.

Comment by Seppo Jaakola [ 2012-08-20 ]

Looks like false positive, long term test shows no trails of memory leak

Comment by Elena Stepanova [ 2012-08-20 ]

>> Looks like false positive, long term test shows no trails of memory leak

This is because the bug was not about a memory leak, otherwise it would have said so. It was about excessive memory use: a modest operation (I do consider 128K rows is a rather modest number in the real world), which normally does not take anything above what the server already occupies, in Galera context requires additional 250+ Mb, even if it's only temporarily; and that's on top of extra 250 Mb which it takes just to start an empty node.
If you don't consider it a problem, so be it, but I can say right away, based on complaints we'd heard before about any slightly increased footprint, that users are not going to like it unless it's configurable and can be avoided or reasonably and explicitly limited (which it might still be, and I just don't know about it).

Comment by Elena Stepanova [ 2012-08-20 ]

A fancy corollary: probably we won't even be able to run standard MTR test on a server with wsrep.

Example (the original test case is main.multi_update2, it is not applicable as it is with $1=19, because it uses 2M rows while the hard limit is 1M rows. So, we reduce it to $1=18 and $1=17, below are the numbers of the comparative test)

with wsrep
------------------

MTR command-line options:
-mysqld=tmp_table_size=1024 --mysqld=innodb --mysqld=default-storage-engine=innodb --mysqld=innodb_autoinc_lock_mode=2 --mysqld=innodb_locks_unsafe_for_binlog=1 --mysqld=innodb_flush_log_at_trx_commit=0 --mysqld=innodb_doublewrite=0 --mysqld=binlog-format=row --mysqld=log-bin=master-bin --mysqld=wsrep_provider=/home/elenst/galera-23.2.1-src/libgalera_smm.so --mysqld=wsrep_cluster_address=gcomm://?'gmcast.listen_addr=tcp://127.0.0.1:4567' --mysqld=wsrep_max_ws_size=4G --mysqld=-wsrep_max_ws_rows=1M --testcase-timeout=300

$1=17 (500K rows)
Execution time: 558 sec, grows up to 2 Gb

457 elenst 20 0 557m 235m 12m S 34 2.9 0:03.44 mysqld
...
457 elenst 20 0 2035m 1.7g 57m S 127 22.3 8:39.71 mysqld

$1=18 (1M rows)
Execution time: 938 sec, grows up to 4 Gb

866 elenst 20 0 606m 247m 12m S 77 3.1 0:07.67 mysqld
...
866 elenst 20 0 3932m 3.5g 94m S 200 45.2 14:35.75 mysqld

Same build, without wsrep
------------------

MTR command-line options (all the same minus wsrep):
-mysqld=tmp_table_size=1024 --mysqld=innodb --mysqld=default-storage-engine=innodb --mysqld=innodb_autoinc_lock_mode=2 --mysqld=innodb_locks_unsafe_for_binlog=1 --mysqld=innodb_flush_log_at_trx_commit=0 --mysqld=innodb_doublewrite=0 --mysqld=binlog-format=row --mysqld=-log-bin=master-bin --testcase-timeout=300

$1 = 17 (500K rows)
Execution time: 299 sec, stays inside initial 400+ Mb

1093 elenst 20 0 403m 195m 6140 S 4 2.4 0:00.37 mysqld
...
1093 elenst 20 0 283m 85m 10m S 100 1.1 4:36.88 mysqld

$1 = 18 (1M rows)
Execution time: 465 sec, stays inside initial 400+ Mb

981 elenst 20 0 403m 195m 6140 S 3 2.4 0:00.34 mysqld
...
981 elenst 20 0 404m 209m 10m S 147 2.6 7:01.88 mysqld

Generated at Thu Feb 08 06:28:57 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.