    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.4.12
    • Fix Version/s: N/A
    • Component/s: N/A
    • Labels:
    • Sprint: MXS-SPRINT-139

      Description

      The customer reported that their MaxScale node ran out of memory (OOM) due to increasing memory usage.
      Here is what the customer tested, along with the attached config and logs.

      To debug the memory usage issue, I've gone through the following steps.
       
      [root@rnqmax401 ~]# date
      Thu Jan 7 08:41:52 PST 2021
      [root@rnqmax401 ~]#
      Using top, I captured the PID of the process that is taking up almost all of the memory.
      top - 07:57:24 up 2 days, 9:33, 1 user, load average: 0.19, 0.14, 0.11
      Tasks: 156 total, 1 running, 155 sleeping, 0 stopped, 0 zombie
      %Cpu(s): 0.9 us, 0.8 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
      KiB Mem : 98.2/16247560 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| ]
      KiB Swap: 54.2/4194300 [|||||||||||||||||||||||||||||||||||||||||||||||||||||| ]
       
      PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
      54343 maxscale 20 0 15.7g 14.0g 376 S 0.7 90.7 14:05.80 maxscale
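
      A single top snapshot shows the current resident size but not how it changes over time. The following is a minimal sketch, not part of the customer's procedure, of how the RSS of a process could be sampled periodically on Linux by reading the VmRSS field of /proc/<pid>/status. The function names, the sampling interval, and the sample count are hypothetical choices made for illustration.

```python
import time
from pathlib import Path


def parse_vmrss_kib(status_text):
    """Return the VmRSS value in KiB from the text of /proc/<pid>/status.

    Returns None if the field is absent (for example, for kernel threads).
    """
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:   14680064 kB"
            return int(line.split()[1])
    return None


def sample_rss(pid, interval_s=10.0, count=6):
    """Collect (unix_timestamp, rss_kib) samples for a PID.

    interval_s and count are illustrative defaults, not recommendations.
    """
    samples = []
    for _ in range(count):
        text = Path(f"/proc/{pid}/status").read_text()
        samples.append((time.time(), parse_vmrss_kib(text)))
        time.sleep(interval_s)
    return samples


# Example (assumes the PID from the top output above is still running):
#     for ts, rss in sample_rss(54343):
#         print(ts, rss)
```

      Logging samples like these from shortly after a restart would make it easier to line up the start of the memory growth with entries in the MaxScale log.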
       
      Check which process is running with PID 54343. It is the MaxScale service started by systemd (parent PID 1).
      [root@rnqmax401 ~]# ps -ef | grep 54343
      root 43013 68600 0 08:01 pts/2 00:00:00 grep --color=auto 54343
      maxscale 54343 1 0 Jan05 ? 00:14:07 /usr/bin/maxscale
      [root@rnqmax401 ~]#
       
      Check the maxadmin port for that MaxScale instance. It is 6111 (admin_port = 8991 is the REST API port).
      [root@rnqmax401 ~]# grep port /etc/maxscale.cnf
      admin_port = 8991
      port = 3111
      port = 3111
      port = 3111
      port = 6111
      port = 3111
      #port=4442
      port = 3111
      port = 3111
      port = 3111
      port = 9994
      # These listeners represent the ports the
      [root@rnqmax401 ~]#
       
      Before I started debugging, I redirected the application connections to a different MaxScale server. As shown below, there were no active connections while I collected these stats, yet the memory allocated to MaxScale was not released back to the OS. This was captured on Thu Jan 7 08:41:52 PST 2021.
      [root@rnqmax401 ~]# maxadmin -pmariadb -P6111 list servers
      Servers.
      -------------------+-----------------+-------+-------------+--------------------
      Server | Address | Port | Connections | Status
      -------------------+-----------------+-------+-------------+--------------------
      server1 | 10.142.108.141 | 3111 | 0 | Master, Synced, Running
      server2 | 10.142.108.142 | 3111 | 0 | Slave, Synced, Running
      server3 | 10.142.108.143 | 3111 | 0 | Slave, Synced, Running
      server1AD | 10.142.108.141 | 3111 | 0 | Master, Synced, Running
      server2AD | 10.142.108.142 | 3111 | 0 | Slave, Synced, Running
      server3AD | 10.142.108.143 | 3111 | 0 | Slave, Synced, Running
      -------------------+-----------------+-------+-------------+--------------------
      [root@rnqmax401 ~]#
       
      MaxScale usage at Tue Jan 5 16:17:49 PST 2021, captured before this debug test:
      -------------------+-----------------+-------+-------------+--------------------
      Server | Address | Port | Connections | Status
      -------------------+-----------------+-------+-------------+--------------------
      server1 | 10.142.108.141 | 3111 | 966 | Master, Synced, Running
      server2 | 10.142.108.142 | 3111 | 966 | Slave, Synced, Running
      server3 | 10.142.108.143 | 3111 | 966 | Slave, Synced, Running
      server1AD | 10.142.108.141 | 3111 | 0 | Master, Synced, Running
      server2AD | 10.142.108.142 | 3111 | 0 | Slave, Synced, Running
      server3AD | 10.142.108.143 | 3111 | 0 | Slave, Synced, Running
      -------------------+-----------------+-------+-------------+--------------------
       
      I restarted MaxScale at 2021-01-05 15:02:06, and memory usage stayed low until Tue Jan 5 16:00:35 PST 2021. Between 16:00 and 16:04, RAM usage rose from 890 MB to 15327 MB.
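
      To put the jump reported above in perspective, the average growth rate can be worked out from the two figures. This is a rough sketch that assumes the window was exactly four minutes; the report gives only minute-level timestamps, so the real rate may differ. The helper name is hypothetical.

```python
def growth_rate_mb_per_min(start_mb, end_mb, minutes):
    """Average memory growth in MB per minute over the given window."""
    return (end_mb - start_mb) / minutes


# Figures from the report: 890 MB at about 16:00, 15327 MB at about 16:04.
rate = growth_rate_mb_per_min(890, 15327, 4)
print(f"{rate:.0f} MB/min")  # prints "3609 MB/min"
```

      At roughly 3.6 GB per minute, the 16 GB host would be close to exhausted within a few minutes, which matches the 98.2% memory usage shown in the top output.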
      

      • The MaxScale log is too large to attach, so please check the support case.

            People

            Assignee:
            markus makela
            Reporter:
            Allen Lee (allen.lee@mariadb.com)
            Votes:
            0
            Watchers:
            2
