[MDEV-466] Galera: Excessive memory use on sizable transactions - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Not a Bug
Affects Version/s: None
Fix Version/s: 5.5.25-galera
Component/s: None
Labels:
None

Description

I execute the following set of statements 3 times in a row:

  CREATE TABLE t1 ( a INT NOT NULL AUTO_INCREMENT PRIMARY KEY, b INT NOT NULL) ;

  INSERT INTO t1 (b) VALUES (1),(2),(3),(4),(5),(6),(7),(8);

  INSERT INTO t1 (b) SELECT t1a.b FROM t1 t1a, t1 t1b, t1 t1c, t1 t1d, t1 t1e, t1 t1f LIMIT 131064;

  SELECT COUNT(*) FROM t1;

  DROP TABLE t1;

wait for innodb to settle (for checkpoint age to become 0), and check memory consumption after each round.

Here are results for a server started in Galera context (as a new node):

Server options:

--innodb --default-storage-engine=innodb --innodb_autoinc_lock_mode=2 --innodb_locks_unsafe_for_binlog=1 --innodb_flush_log_at_trx_commit=0 --innodb_doublewrite=0 --binlog-format=row --log-bin=master-bin --wsrep_provider=/home/elenst/galera-23.2.1-src/libgalera_smm.so --wsrep_cluster_address=gcomm://?'gmcast.listen_addr=tcp://127.0.0.1:4567'

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND

# Start

17380 elenst    20   0  863m 345m 9.8m S    0  4.3   0:00.47 mysqld

# After the 1st round

17380 elenst    20   0 1060m 603m  14m S    0  7.5   0:35.33 mysqld

# After the 2nd round

17380 elenst    20   0 1124m 648m  18m S    0  8.1   1:05.34 mysqld

# After the 3rd round

17380 elenst    20   0 1126m 659m  21m S    1  8.2   1:35.49 mysqld

So, we started and ended at the same point (dropped the table that we previously created and populated), however memory consumption has grown ~260 Mb

Nothing like that happens if the server is run in a standalone mode (with otherwise the same parameters):

Server options:

--innodb --default-storage-engine=innodb --innodb_autoinc_lock_mode=2 --innodb_locks_unsafe_for_binlog=1 --innodb_flush_log_at_trx_commit=0 --innodb_doublewrite=0 --binlog-format=row --log-bin=master-bin

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND

# Start

17336 elenst    20   0  611m 318m 7468 S    0  4.0   0:02.41 mysqld

# After the 1st round

17336 elenst    20   0  615m 329m 9268 S    0  4.1   0:08.40 mysqld

# After the 2nd round

17336 elenst    20   0  615m 334m 9316 S    0  4.2   0:14.28 mysqld

# After the 3rd round

17336 elenst    20   0  615m 340m 9316 S    0  4.3   0:20.05 mysqld

bzr version-info:

revision-id: seppo.jaakola@codership.com-20120808224721-89h8nxmfk9uvp708

date: 2012-08-09 01:47:21 +0300

build-date: 2012-08-20 07:00:22 +0400

revno: 3343

Attachments

Activity

Ascending order - Click to sort in descending order

Seppo Jaakola added a comment - 2012-08-20 11:04

Galera maintains "certification index" to help in tracking dependencies between transactions. This dependency information makes it possible e.g. to enable parallel applying with row level granularity in slave side.
This test will cause several table and row level entries to be appended in the certification index, and this probably accounts for the observed memory image growth. Galera does periodical pruning for the index, but it will be delayed process and does not show here.
To test for memory leaks, it makes sense to run long term SQL load and monitor that memory image will saturate at certain level. Or alternatively run under valgrind.

Seppo Jaakola added a comment - 2012-08-20 11:04 Galera maintains "certification index" to help in tracking dependencies between transactions. This dependency information makes it possible e.g. to enable parallel applying with row level granularity in slave side. This test will cause several table and row level entries to be appended in the certification index, and this probably accounts for the observed memory image growth. Galera does periodical pruning for the index, but it will be delayed process and does not show here. To test for memory leaks, it makes sense to run long term SQL load and monitor that memory image will saturate at certain level. Or alternatively run under valgrind.

Seppo Jaakola added a comment - 2012-08-20 12:11

Looks like false positive, long term test shows no trails of memory leak

Seppo Jaakola added a comment - 2012-08-20 12:11 Looks like false positive, long term test shows no trails of memory leak

Elena Stepanova added a comment - 2012-08-20 13:54

>> Looks like false positive, long term test shows no trails of memory leak

This is because the bug was not about a memory leak, otherwise it would have said so. It was about excessive memory use: a modest operation (I do consider 128K rows is a rather modest number in the real world), which normally does not take anything above what the server already occupies, in Galera context requires additional 250+ Mb, even if it's only temporarily; and that's on top of extra 250 Mb which it takes just to start an empty node.
If you don't consider it a problem, so be it, but I can say right away, based on complaints we'd heard before about any slightly increased footprint, that users are not going to like it unless it's configurable and can be avoided or reasonably and explicitly limited (which it might still be, and I just don't know about it).

Elena Stepanova added a comment - 2012-08-20 13:54 >> Looks like false positive, long term test shows no trails of memory leak This is because the bug was not about a memory leak, otherwise it would have said so. It was about excessive memory use: a modest operation (I do consider 128K rows is a rather modest number in the real world), which normally does not take anything above what the server already occupies, in Galera context requires additional 250+ Mb, even if it's only temporarily; and that's on top of extra 250 Mb which it takes just to start an empty node. If you don't consider it a problem, so be it, but I can say right away, based on complaints we'd heard before about any slightly increased footprint, that users are not going to like it unless it's configurable and can be avoided or reasonably and explicitly limited (which it might still be, and I just don't know about it).

Elena Stepanova added a comment - 2012-08-20 23:40

A fancy corollary: probably we won't even be able to run standard MTR test on a server with wsrep.

Example (the original test case is main.multi_update2, it is not applicable as it is with $1=19, because it uses 2M rows while the hard limit is 1M rows. So, we reduce it to $1=18 and $1=17, below are the numbers of the comparative test)

with wsrep
------------------

MTR command-line options:
-~~mysqld=tmp_table_size=1024 --mysqld=innodb --mysqld=default-storage-engine=innodb --mysqld=innodb_autoinc_lock_mode=2 --mysqld=innodb_locks_unsafe_for_binlog=1 --mysqld=innodb_flush_log_at_trx_commit=0 --mysqld=innodb_doublewrite=0 --mysqld=binlog-format=row --mysqld=log-bin=master-bin --mysqld=wsrep_provider=/home/elenst/galera-23.2.1-src/libgalera_smm.so --mysqld=wsrep_cluster_address=gcomm://?'gmcast.listen_addr=tcp://127.0.0.1:4567' --mysqld=wsrep_max_ws_size=4G --mysqld=~~-wsrep_max_ws_rows=1M --testcase-timeout=300

$1=17 (500K rows)
Execution time: 558 sec, grows up to 2 Gb

457 elenst 20 0 557m 235m 12m S 34 2.9 0:03.44 mysqld
...
457 elenst 20 0 2035m 1.7g 57m S 127 22.3 8:39.71 mysqld

$1=18 (1M rows)
Execution time: 938 sec, grows up to 4 Gb

866 elenst 20 0 606m 247m 12m S 77 3.1 0:07.67 mysqld
...
866 elenst 20 0 3932m 3.5g 94m S 200 45.2 14:35.75 mysqld

Same build, without wsrep
------------------

MTR command-line options (all the same minus wsrep):
-~~mysqld=tmp_table_size=1024 --mysqld=innodb --mysqld=default-storage-engine=innodb --mysqld=innodb_autoinc_lock_mode=2 --mysqld=innodb_locks_unsafe_for_binlog=1 --mysqld=innodb_flush_log_at_trx_commit=0 --mysqld=innodb_doublewrite=0 --mysqld=binlog-format=row --mysqld=~~-log-bin=master-bin --testcase-timeout=300

$1 = 17 (500K rows)
Execution time: 299 sec, stays inside initial 400+ Mb

1093 elenst 20 0 403m 195m 6140 S 4 2.4 0:00.37 mysqld
...
1093 elenst 20 0 283m 85m 10m S 100 1.1 4:36.88 mysqld

$1 = 18 (1M rows)
Execution time: 465 sec, stays inside initial 400+ Mb

981 elenst 20 0 403m 195m 6140 S 3 2.4 0:00.34 mysqld
...
981 elenst 20 0 404m 209m 10m S 147 2.6 7:01.88 mysqld

Elena Stepanova added a comment - 2012-08-20 23:40 A fancy corollary: probably we won't even be able to run standard MTR test on a server with wsrep. Example (the original test case is main.multi_update2, it is not applicable as it is with $1=19, because it uses 2M rows while the hard limit is 1M rows. So, we reduce it to $1=18 and $1=17, below are the numbers of the comparative test) with wsrep ------------------ MTR command-line options: - mysqld= tmp_table_size=1024 --mysqld= innodb --mysqld= default-storage-engine=innodb --mysqld= innodb_autoinc_lock_mode=2 --mysqld= innodb_locks_unsafe_for_binlog=1 --mysqld= innodb_flush_log_at_trx_commit=0 --mysqld= innodb_doublewrite=0 --mysqld= binlog-format=row --mysqld= log-bin=master-bin --mysqld= wsrep_provider=/home/elenst/galera-23.2.1-src/libgalera_smm.so --mysqld= wsrep_cluster_address=gcomm://?'gmcast.listen_addr=tcp://127.0.0.1:4567' --mysqld= wsrep_max_ws_size=4G --mysqld= -wsrep_max_ws_rows=1M --testcase-timeout=300 $1=17 (500K rows) Execution time: 558 sec, grows up to 2 Gb 457 elenst 20 0 557m 235m 12m S 34 2.9 0:03.44 mysqld ... 457 elenst 20 0 2035m 1.7g 57m S 127 22.3 8:39.71 mysqld $1=18 (1M rows) Execution time: 938 sec, grows up to 4 Gb 866 elenst 20 0 606m 247m 12m S 77 3.1 0:07.67 mysqld ... 866 elenst 20 0 3932m 3.5g 94m S 200 45.2 14:35.75 mysqld Same build, without wsrep ------------------ MTR command-line options (all the same minus wsrep): - mysqld= tmp_table_size=1024 --mysqld= innodb --mysqld= default-storage-engine=innodb --mysqld= innodb_autoinc_lock_mode=2 --mysqld= innodb_locks_unsafe_for_binlog=1 --mysqld= innodb_flush_log_at_trx_commit=0 --mysqld= innodb_doublewrite=0 --mysqld= binlog-format=row --mysqld= -log-bin=master-bin --testcase-timeout=300 $1 = 17 (500K rows) Execution time: 299 sec, stays inside initial 400+ Mb 1093 elenst 20 0 403m 195m 6140 S 4 2.4 0:00.37 mysqld ... 1093 elenst 20 0 283m 85m 10m S 100 1.1 4:36.88 mysqld $1 = 18 (1M rows) Execution time: 465 sec, stays inside initial 400+ Mb 981 elenst 20 0 403m 195m 6140 S 3 2.4 0:00.34 mysqld ... 981 elenst 20 0 404m 209m 10m S 147 2.6 7:01.88 mysqld

People

Assignee:: Seppo Jaakola

Reporter:: Elena Stepanova

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 2012-08-20 06:02

Updated:: 2012-08-20 23:40

Resolved:: 2012-08-20 12:11

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server