MariaDB Server / MDEV-17247

Memory leak on parallel replicated slave

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.3.8
    • Fix Version/s: 10.4(EOL)
    • Component/s: Replication
    • Labels: None
    • Environment: Host - CoreOS
      Docker Container - centos:7
      Using RPM from MariaDB Repo

    Description

      [ISSUE]:
      I have been encountering what appears to be an unbounded memory leak on parallel replicated slaves running 10.3.8. Memory usage stays completely stable for weeks at a time. The leak seems to be triggered by a specific workload or type of query, but I have not been able to identify which. It always occurs during heavy load on the master, but not every time the master is under load. When the problem does manifest, memory usage grows very rapidly. It does not occur on the master the slaves are replicating from. Restarting MariaDB is the only way I have found so far to free the memory. I have attached the statuses and variables.
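
      To pin down when the growth described above starts and how fast it proceeds, the resident memory of the mysqld process could be sampled at a fixed interval and compared against load on the master. The following is a minimal sketch, assuming a Linux host where /proc/<pid>/status is readable. The function names, the sampling interval, and the sample count are hypothetical and are not part of the original report.

```python
import re
import time
from typing import List, Optional, Tuple


def parse_vm_rss_kib(status_text: str) -> Optional[int]:
    """Return the VmRSS value (reported in kB) from /proc/<pid>/status text.

    Returns None when the text has no VmRSS line.
    """
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def growth_kib_per_hour(samples: List[Tuple[float, int]]) -> float:
    """Average growth between the first and last (unix_time, rss_kib) samples."""
    if len(samples) < 2:
        return 0.0
    (t0, rss0), (t1, rss1) = samples[0], samples[-1]
    if t1 <= t0:
        return 0.0
    return (rss1 - rss0) * 3600.0 / (t1 - t0)


def sample_rss(pid: int, interval_s: float = 60.0, count: int = 5) -> List[Tuple[float, int]]:
    """Read VmRSS for `pid` `count` times, sleeping `interval_s` between reads.

    Hypothetical helper: interval and count are illustrative defaults only.
    """
    samples: List[Tuple[float, int]] = []
    for i in range(count):
        with open(f"/proc/{pid}/status") as status_file:
            rss = parse_vm_rss_kib(status_file.read())
        if rss is not None:
            samples.append((time.time(), rss))
        if i < count - 1:
            time.sleep(interval_s)
    return samples
```

      Calling sample_rss() with the mysqld PID and passing the result to growth_kib_per_hour() gives a rough growth rate that can be lined up against the master's load. A sustained positive rate only confirms that the process is growing; it does not say which allocation is responsible.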

      [BELOW IS NO LONGER RELEVANT, LEFT FOR GOOGLE SEARCH HITS]:
      In an attempt to debug what is going on I have compiled MariaDB with the BUILD/compile-amd64-valgrind-max script in docker. I have attached the Dockerfile to show the build process. I start valgrind and mariadb with the following command:

      valgrind --tool=massif --massif-out-file=/var/log/mysql/massif.out.%p /usr/local/mysql/bin/mysqld

      However, when I shut down MariaDB with a signal or with the SHUTDOWN command, valgrind errors out and does not write a massif file. I have attached that error log as well. It seems that MariaDB may be built wrong? I'm not sure what I'm doing wrong and am reaching the end of my debugging abilities. I attached a log called compile_install that contains the output from the build. I'm not sure what to look for in it though.

      Another thing that might factor in is that I built the docker image on a host system running "Linux 4.18.7-arch1-1-ARCH" and am attempting to run it on a host running "Linux 4.14.67-coreos".

      Attachments

        1. variables_redacted
          18 kB
        2. valgrind_error
          4 kB
        3. status
          13 kB
        4. snapshot_1538417420
          282 kB
        5. snapshot_1537963221
          487 kB
        6. snapshot_1537811421
          529 kB
        7. Screen Shot 2018-09-26 at 3.35.03 PM.png
          75 kB
        8. global_variables_redacted
          18 kB
        9. global_status
          13 kB
        10. Dockerfile
          2 kB
        11. crash_log
          14 kB
        12. compile_install
          2.01 MB

            People

              Assignee: Andrei Elkin (Elkin)
              Reporter: Luke Jolly (ljolly)
              Votes: 0
              Watchers: 5
