Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-466

Galera: Excessive memory use on sizable transactions

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Not a Bug
    • None
    • 5.5.25-galera
    • None
    • None

    Description

      I execute the following set of statements 3 times in a row:

        CREATE TABLE t1 ( a INT NOT NULL AUTO_INCREMENT PRIMARY KEY, b INT NOT NULL) ;
        INSERT INTO t1 (b) VALUES (1),(2),(3),(4),(5),(6),(7),(8);
        INSERT INTO t1 (b) SELECT t1a.b FROM t1 t1a, t1 t1b, t1 t1c, t1 t1d, t1 t1e, t1 t1f LIMIT 131064;
        SELECT COUNT(*) FROM t1;
        DROP TABLE t1;

      wait for innodb to settle (for checkpoint age to become 0), and check memory consumption after each round.

      Here are results for a server started in Galera context (as a new node):

      Server options:

      --innodb --default-storage-engine=innodb --innodb_autoinc_lock_mode=2 --innodb_locks_unsafe_for_binlog=1 --innodb_flush_log_at_trx_commit=0 --innodb_doublewrite=0 --binlog-format=row --log-bin=master-bin --wsrep_provider=/home/elenst/galera-23.2.1-src/libgalera_smm.so --wsrep_cluster_address=gcomm://?'gmcast.listen_addr=tcp://127.0.0.1:4567'

        PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
      # Start
      17380 elenst    20   0  863m 345m 9.8m S    0  4.3   0:00.47 mysqld
      # After the 1st round
      17380 elenst    20   0 1060m 603m  14m S    0  7.5   0:35.33 mysqld
      # After the 2nd round
      17380 elenst    20   0 1124m 648m  18m S    0  8.1   1:05.34 mysqld
      # After the 3rd round
      17380 elenst    20   0 1126m 659m  21m S    1  8.2   1:35.49 mysqld

      So, we started and ended at the same point (dropped the table that we previously created and populated), however memory consumption has grown ~260 Mb

      Nothing like that happens if the server is run in a standalone mode (with otherwise the same parameters):

      Server options:

      --innodb --default-storage-engine=innodb --innodb_autoinc_lock_mode=2 --innodb_locks_unsafe_for_binlog=1 --innodb_flush_log_at_trx_commit=0 --innodb_doublewrite=0 --binlog-format=row --log-bin=master-bin 

        PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
      # Start
      17336 elenst    20   0  611m 318m 7468 S    0  4.0   0:02.41 mysqld 
      # After the 1st round
      17336 elenst    20   0  615m 329m 9268 S    0  4.1   0:08.40 mysqld 
      # After the 2nd round
      17336 elenst    20   0  615m 334m 9316 S    0  4.2   0:14.28 mysqld
      # After the 3rd round
      17336 elenst    20   0  615m 340m 9316 S    0  4.3   0:20.05 mysqld

      bzr version-info:

      revision-id: seppo.jaakola@codership.com-20120808224721-89h8nxmfk9uvp708
      date: 2012-08-09 01:47:21 +0300
      build-date: 2012-08-20 07:00:22 +0400
      revno: 3343

      Attachments

        Activity

          seppo Seppo Jaakola added a comment -

          Galera maintains "certification index" to help in tracking dependencies between transactions. This dependency information makes it possible e.g. to enable parallel applying with row level granularity in slave side.
          This test will cause several table and row level entries to be appended in the certification index, and this probably accounts for the observed memory image growth. Galera does periodical pruning for the index, but it will be delayed process and does not show here.
          To test for memory leaks, it makes sense to run long term SQL load and monitor that memory image will saturate at certain level. Or alternatively run under valgrind.

          seppo Seppo Jaakola added a comment - Galera maintains "certification index" to help in tracking dependencies between transactions. This dependency information makes it possible e.g. to enable parallel applying with row level granularity in slave side. This test will cause several table and row level entries to be appended in the certification index, and this probably accounts for the observed memory image growth. Galera does periodical pruning for the index, but it will be delayed process and does not show here. To test for memory leaks, it makes sense to run long term SQL load and monitor that memory image will saturate at certain level. Or alternatively run under valgrind.
          seppo Seppo Jaakola added a comment -

          Looks like false positive, long term test shows no trails of memory leak

          seppo Seppo Jaakola added a comment - Looks like false positive, long term test shows no trails of memory leak

          >> Looks like false positive, long term test shows no trails of memory leak

          This is because the bug was not about a memory leak, otherwise it would have said so. It was about excessive memory use: a modest operation (I do consider 128K rows is a rather modest number in the real world), which normally does not take anything above what the server already occupies, in Galera context requires additional 250+ Mb, even if it's only temporarily; and that's on top of extra 250 Mb which it takes just to start an empty node.
          If you don't consider it a problem, so be it, but I can say right away, based on complaints we'd heard before about any slightly increased footprint, that users are not going to like it unless it's configurable and can be avoided or reasonably and explicitly limited (which it might still be, and I just don't know about it).

          elenst Elena Stepanova added a comment - >> Looks like false positive, long term test shows no trails of memory leak This is because the bug was not about a memory leak, otherwise it would have said so. It was about excessive memory use: a modest operation (I do consider 128K rows is a rather modest number in the real world), which normally does not take anything above what the server already occupies, in Galera context requires additional 250+ Mb, even if it's only temporarily; and that's on top of extra 250 Mb which it takes just to start an empty node. If you don't consider it a problem, so be it, but I can say right away, based on complaints we'd heard before about any slightly increased footprint, that users are not going to like it unless it's configurable and can be avoided or reasonably and explicitly limited (which it might still be, and I just don't know about it).

          A fancy corollary: probably we won't even be able to run standard MTR test on a server with wsrep.

          Example (the original test case is main.multi_update2, it is not applicable as it is with $1=19, because it uses 2M rows while the hard limit is 1M rows. So, we reduce it to $1=18 and $1=17, below are the numbers of the comparative test)

          with wsrep
          ------------------

          MTR command-line options:
          -mysqld=tmp_table_size=1024 --mysqld=innodb --mysqld=default-storage-engine=innodb --mysqld=innodb_autoinc_lock_mode=2 --mysqld=innodb_locks_unsafe_for_binlog=1 --mysqld=innodb_flush_log_at_trx_commit=0 --mysqld=innodb_doublewrite=0 --mysqld=binlog-format=row --mysqld=log-bin=master-bin --mysqld=wsrep_provider=/home/elenst/galera-23.2.1-src/libgalera_smm.so --mysqld=wsrep_cluster_address=gcomm://?'gmcast.listen_addr=tcp://127.0.0.1:4567' --mysqld=wsrep_max_ws_size=4G --mysqld=-wsrep_max_ws_rows=1M --testcase-timeout=300

          $1=17 (500K rows)
          Execution time: 558 sec, grows up to 2 Gb

          457 elenst 20 0 557m 235m 12m S 34 2.9 0:03.44 mysqld
          ...
          457 elenst 20 0 2035m 1.7g 57m S 127 22.3 8:39.71 mysqld

          $1=18 (1M rows)
          Execution time: 938 sec, grows up to 4 Gb

          866 elenst 20 0 606m 247m 12m S 77 3.1 0:07.67 mysqld
          ...
          866 elenst 20 0 3932m 3.5g 94m S 200 45.2 14:35.75 mysqld

          Same build, without wsrep
          ------------------

          MTR command-line options (all the same minus wsrep):
          -mysqld=tmp_table_size=1024 --mysqld=innodb --mysqld=default-storage-engine=innodb --mysqld=innodb_autoinc_lock_mode=2 --mysqld=innodb_locks_unsafe_for_binlog=1 --mysqld=innodb_flush_log_at_trx_commit=0 --mysqld=innodb_doublewrite=0 --mysqld=binlog-format=row --mysqld=-log-bin=master-bin --testcase-timeout=300

          $1 = 17 (500K rows)
          Execution time: 299 sec, stays inside initial 400+ Mb

          1093 elenst 20 0 403m 195m 6140 S 4 2.4 0:00.37 mysqld
          ...
          1093 elenst 20 0 283m 85m 10m S 100 1.1 4:36.88 mysqld

          $1 = 18 (1M rows)
          Execution time: 465 sec, stays inside initial 400+ Mb

          981 elenst 20 0 403m 195m 6140 S 3 2.4 0:00.34 mysqld
          ...
          981 elenst 20 0 404m 209m 10m S 147 2.6 7:01.88 mysqld

          elenst Elena Stepanova added a comment - A fancy corollary: probably we won't even be able to run standard MTR test on a server with wsrep. Example (the original test case is main.multi_update2, it is not applicable as it is with $1=19, because it uses 2M rows while the hard limit is 1M rows. So, we reduce it to $1=18 and $1=17, below are the numbers of the comparative test) with wsrep ------------------ MTR command-line options: - mysqld= tmp_table_size=1024 --mysqld= innodb --mysqld= default-storage-engine=innodb --mysqld= innodb_autoinc_lock_mode=2 --mysqld= innodb_locks_unsafe_for_binlog=1 --mysqld= innodb_flush_log_at_trx_commit=0 --mysqld= innodb_doublewrite=0 --mysqld= binlog-format=row --mysqld= log-bin=master-bin --mysqld= wsrep_provider=/home/elenst/galera-23.2.1-src/libgalera_smm.so --mysqld= wsrep_cluster_address=gcomm://?'gmcast.listen_addr=tcp://127.0.0.1:4567' --mysqld= wsrep_max_ws_size=4G --mysqld= -wsrep_max_ws_rows=1M --testcase-timeout=300 $1=17 (500K rows) Execution time: 558 sec, grows up to 2 Gb 457 elenst 20 0 557m 235m 12m S 34 2.9 0:03.44 mysqld ... 457 elenst 20 0 2035m 1.7g 57m S 127 22.3 8:39.71 mysqld $1=18 (1M rows) Execution time: 938 sec, grows up to 4 Gb 866 elenst 20 0 606m 247m 12m S 77 3.1 0:07.67 mysqld ... 866 elenst 20 0 3932m 3.5g 94m S 200 45.2 14:35.75 mysqld Same build, without wsrep ------------------ MTR command-line options (all the same minus wsrep): - mysqld= tmp_table_size=1024 --mysqld= innodb --mysqld= default-storage-engine=innodb --mysqld= innodb_autoinc_lock_mode=2 --mysqld= innodb_locks_unsafe_for_binlog=1 --mysqld= innodb_flush_log_at_trx_commit=0 --mysqld= innodb_doublewrite=0 --mysqld= binlog-format=row --mysqld= -log-bin=master-bin --testcase-timeout=300 $1 = 17 (500K rows) Execution time: 299 sec, stays inside initial 400+ Mb 1093 elenst 20 0 403m 195m 6140 S 4 2.4 0:00.37 mysqld ... 1093 elenst 20 0 283m 85m 10m S 100 1.1 4:36.88 mysqld $1 = 18 (1M rows) Execution time: 465 sec, stays inside initial 400+ Mb 981 elenst 20 0 403m 195m 6140 S 3 2.4 0:00.34 mysqld ... 981 elenst 20 0 404m 209m 10m S 147 2.6 7:01.88 mysqld

          People

            seppo Seppo Jaakola
            elenst Elena Stepanova
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.