Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-9162

MariaDB Galera Cluster memory leak on async slave node

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 10.0.22-galera
    • Fix Version/s: 10.0.23-galera
    • Component/s: Galera, Replication
    • Labels:
      None
    • Environment:
      cat /etc/debian_version
      7.9
      root@node1:~# uname -a
      Linux node1 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u3 x86_64 GNU/Linux
    • Sprint:
      10.0.23

      Description

      An async slave node clearly leaks memory on some DDL statements coming from async master (also MariaDB 10).
      Steps to reproduce:
      – node4 is async master to node1, node1 and node2 are Galera cluster members. Run this simple query multiple times:

      root@node4:~# mysqlslap --delimiter=";" --number-of-queries=100000 --create-schema=test --query="CREATE TABLE IF NOT EXISTS xxxx (id int)"

      (...)

      – the more times the above query is executed, the bigger memory usage difference between the nodes and the more clear the memory leak is:

      node1 {root} ((none)) > select @@version,@@version_comment;
      +------------------------------------+-------------------------------------------------------+
      | @@version                          | @@version_comment                                     |
      +------------------------------------+-------------------------------------------------------+
      | 10.0.22-MariaDB-1~wheezy-wsrep-log | mariadb.org binary distribution, wsrep_25.11.r21a2415 |
      +------------------------------------+-------------------------------------------------------+
      1 row in set (0.00 sec)
      node1 {root} ((none)) > show status like 'ws%version';
      +------------------------+------------+
      | Variable_name          | Value      |
      +------------------------+------------+
      | wsrep_protocol_version | 7          |
      | wsrep_provider_version | 3.9(rXXXX) |
      +------------------------+------------+
      2 rows in set (0.00 sec)
      node1 {root} ((none)) > \! ps aux|egrep 'RSS|mysqld '|grep -v grep
      USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
      mysql    10140  2.0 66.2 2946844 1365228 ?     Sl   Nov21  10:18 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --wsrep_provider=/usr/lib/galera/libgalera_smm.so --log-error=/var/

      node1 {root} ((none)) > show global status like 'mem%';
      +---------------+-----------+
      | Variable_name | Value     |
      +---------------+-----------+
      | Memory_used   | 274914112 |
      +---------------+-----------+
      1 row in set (0.01 sec)

      – normal cluster node:

      node2 {root} ((none)) > \! ps aux|egrep 'RSS|mysqld '|grep -v grep
      USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
      mysql     6781  0.4 26.9 1066696 270304 pts/1  Sl   Nov19  16:29 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --wsrep_provider=/usr/lib/galera/libgalera_smm.so --log-error=/var/

      node2 {root} ((none)) > show global status like 'mem%';
      +---------------+-----------+
      | Variable_name | Value     |
      +---------------+-----------+
      | Memory_used   | 274792608 |
      +---------------+-----------+
      1 row in set (0.00 sec)

      – node1 shutdown sequence diagnostic output:

      151122  0:50:06 [Note] Slave I/O thread killed while reading event
      151122  0:50:06 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000005', position 34245035
      151122  0:50:06 [Note] WSREP: dtor state: CLOSED
      151122  0:50:06 [Note] WSREP: mon: entered 240000 oooe fraction 0 oool fraction 0
      151122  0:50:07 [Note] WSREP: mon: entered 240000 oooe fraction 0 oool fraction 0
      151122  0:50:07 [Note] WSREP: mon: entered 241908 oooe fraction 0 oool fraction 0
      151122  0:50:07 [Note] WSREP: cert index usage at exit 0
      151122  0:50:07 [Note] WSREP: cert trx map usage at exit 96
      151122  0:50:07 [Note] WSREP: deps set usage at exit 0
      151122  0:50:07 [Note] WSREP: avg deps dist 1
      151122  0:50:07 [Note] WSREP: avg cert interval 0
      151122  0:50:07 [Note] WSREP: cert index size 2
      151122  0:50:07 [Note] WSREP: Service thread queue flushed.
      151122  0:50:07 [Note] WSREP: wsdb trx map usage 240001 conn query map usage 1
      151122  0:50:07 [Note] WSREP: MemPool(LocalTrxHandle): hit ratio: 0.499799, misses: 240097, in use: 240001, in pool: 96
      151122  0:50:08 [Note] WSREP: MemPool(SlaveTrxHandle): hit ratio: 0, misses: 0, in use: 0, in pool: 0
      151122  0:50:08 [Note] WSREP: Shifting CLOSED -> DESTROYED (TO: 1008270)
      151122  0:50:08 [Note] WSREP: Flushing memory map to disk...
      151122  0:50:09 [Note] InnoDB: FTS optimize thread exiting.
      151122  0:50:09 [Note] InnoDB: Starting shutdown...
      151122  0:50:10 [Note] InnoDB: Shutdown completed; log sequence number 555064446
      151122  0:50:10 [Note] /usr/sbin/mysqld: Shutdown complete

      – note the "in use: 240001" structures above...
      – node2 shutdown diagnostics:

      151122  0:52:48 [Note] WSREP: dtor state: CLOSED
      151122  0:52:48 [Note] WSREP: mon: entered 816552 oooe fraction 0 oool fraction 0
      151122  0:52:48 [Note] WSREP: mon: entered 816552 oooe fraction 0 oool fraction 0
      151122  0:52:48 [Note] WSREP: mon: entered 805542 oooe fraction 0 oool fraction 1.2414e-06
      151122  0:52:48 [Note] WSREP: cert index usage at exit 0
      151122  0:52:48 [Note] WSREP: cert trx map usage at exit 0
      151122  0:52:48 [Note] WSREP: deps set usage at exit 0
      151122  0:52:48 [Note] WSREP: avg deps dist 1
      151122  0:52:48 [Note] WSREP: avg cert interval 0
      151122  0:52:48 [Note] WSREP: cert index size 2
      151122  0:52:48 [Note] WSREP: Service thread queue flushed.
      151122  0:52:48 [Note] WSREP: wsdb trx map usage 0 conn query map usage 0
      151122  0:52:48 [Note] WSREP: MemPool(LocalTrxHandle): hit ratio: 0, misses: 0, in use: 0, in pool: 0
      151122  0:52:48 [Note] WSREP: MemPool(SlaveTrxHandle): hit ratio: 0.999226, misses: 632, in use: 0, in pool: 632
      151122  0:52:48 [Note] WSREP: Shifting CLOSED -> DESTROYED (TO: 1008270)

      I cannot reproduce the same leak on PXC 5.6.

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              nirbhay_c Nirbhay Choubey (Inactive)
              Reporter:
              przemek@mysqlmaniac.com Przemek
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: