  1. MariaDB Server
  2. MDEV-17247

Memory leak on parallel replicated slave

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.3.8
    • Fix Version/s: 10.3
    • Component/s: Replication
    • Labels:
      None
    • Environment:
      Host - CoreOS
      Docker Container - centos:7
      Using RPM from MariaDB Repo

      Description

      [ISSUE]:
      I have been encountering what appears to be an unbounded memory leak on parallel replicated slaves running 10.3.8. Memory usage stays steady for weeks at a time, then grows very rapidly once the problem manifests. It seems to be triggered by a specific workload or type of query, but I have not been able to identify which. It always occurs during heavy load on the master, but not every time the master is under load. It does not occur on the master the slaves replicate from. Restarting MariaDB is the only way I have found so far to free the memory. I have attached the statuses and variables.
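
      To pin down when the growth starts, it may help to sample the resident memory of the mysqld process and correlate the timestamps with load on the master. The following is only a minimal sketch, assuming a Linux host where /proc/<pid>/status is readable; the function names and the sampling parameters are hypothetical and are not part of MariaDB.

```python
import re
import time
from typing import List, Optional


def parse_vmrss_kib(status_text: str) -> Optional[int]:
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def growth_kib_per_sample(samples: List[int]) -> float:
    """Average change in RSS between consecutive samples, in KiB per sample."""
    if len(samples) < 2:
        return 0.0
    return (samples[-1] - samples[0]) / (len(samples) - 1)


def sample_rss(pid: int, count: int, interval_s: float) -> List[int]:
    """Read VmRSS for the given pid `count` times, sleeping `interval_s` between reads."""
    samples: List[int] = []
    for _ in range(count):
        with open(f"/proc/{pid}/status") as fh:
            rss = parse_vmrss_kib(fh.read())
        if rss is not None:
            samples.append(rss)
        time.sleep(interval_s)
    return samples
```

      For example, growth_kib_per_sample(sample_rss(pid, 60, 10.0)) gives a rough growth rate over ten minutes for the mysqld pid. A sustained positive rate while the master is under load would narrow down the window in which the leak starts.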

      [BELOW IS NO LONGER RELEVANT, LEFT FOR GOOGLE SEARCH HITS]:
      In an attempt to debug what is going on, I have compiled MariaDB with the BUILD/compile-amd64-valgrind-max script in Docker. I have attached the Dockerfile to show the build process. I start valgrind and MariaDB with the following command:

      valgrind --tool=massif --massif-out-file=/var/log/mysql/massif.out.%p /usr/local/mysql/bin/mysqld

      However, when I shut down MariaDB with a signal or with the SHUTDOWN command, valgrind reports an error and does not write a massif file. I have attached that error log as well. It seems that MariaDB may have been built incorrectly? I'm not sure what I'm doing wrong and am reaching the limit of my debugging abilities. I attached a log called compile_install that contains the output from the build. I'm not sure what to look for in it, though.
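
      For reference, once a massif output file is actually written, the snapshot with the largest heap can be located with a short script. This is only a sketch, assuming the plain-text massif.out layout of "snapshot=N" blocks followed by "key=value" fields such as mem_heap_B and mem_heap_extra_B; the function names are hypothetical, and ms_print remains the standard tool for a full report.

```python
from typing import Dict, List

# Numeric per-snapshot fields that are kept; heap tree lines are ignored.
NUMERIC_KEYS = ("time", "mem_heap_B", "mem_heap_extra_B", "mem_stacks_B")


def parse_massif_snapshots(text: str) -> List[Dict[str, int]]:
    """Collect the numeric fields of every snapshot in a massif.out file."""
    snapshots: List[Dict[str, int]] = []
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # separators and heap tree lines carry no key=value pair
        if key == "snapshot":
            current = {"snapshot": int(value)}
            snapshots.append(current)
        elif current is not None and key in NUMERIC_KEYS:
            current[key] = int(value)
    return snapshots


def peak_snapshot(snapshots: List[Dict[str, int]]) -> Dict[str, int]:
    """Return the snapshot with the largest heap usage (useful plus extra bytes)."""
    return max(
        snapshots,
        key=lambda s: s.get("mem_heap_B", 0) + s.get("mem_heap_extra_B", 0),
    )
```

      Reading the file with open(path).read() and passing the text to parse_massif_snapshots gives a list that peak_snapshot can reduce to the single worst snapshot, whose number can then be looked up in the ms_print output.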

      Another thing that might factor in is that I built the Docker image on a host system running "Linux 4.18.7-arch1-1-ARCH" and am attempting to run it on a host running "Linux 4.14.67-coreos".

        Attachments

        1. compile_install
          2.01 MB
        2. crash_log
          14 kB
        3. Dockerfile
          2 kB
        4. global_status
          13 kB
        5. global_variables_redacted
          18 kB
        6. Screen Shot 2018-09-26 at 3.35.03 PM.png
          75 kB
        7. snapshot_1537811421
          529 kB
        8. snapshot_1537963221
          487 kB
        9. snapshot_1538417420
          282 kB
        10. status
          13 kB
        11. valgrind_error
          4 kB
        12. variables_redacted
          18 kB

              People

              • Assignee:
                Elkin Andrei Elkin
                Reporter:
                ljolly Luke Jolly
              • Votes:
                0
                Watchers:
                5
