[MXS-1977] Maxscale 2.2.6 memory leak Created: 2018-07-13 Updated: 2018-12-18 Resolved: 2018-07-17 |
|
| Status: | Closed |
| Project: | MariaDB MaxScale |
| Component/s: | mariadbbackend, readwritesplit |
| Affects Version/s: | 2.2.6 |
| Fix Version/s: | 2.2.12 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Daniel Snow | Assignee: | markus makela |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Debian GNU/Linux 8 |
||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Description |
|
Good day! After upgrading from 2.1 to 2.2.6, our MaxScale started suffering from extremely high memory consumption, which after some time makes the system unusable. Limiting the memory usage of the process with limits.conf (systemd resource control, https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html) leads to the process aborting with a backtrace and restarting. Here is our MaxScale config:
Here is the backtrace of MaxScale running out of memory: 2018-07-13 20:19:49 alert : (1049629) Fatal: MaxScale 2.2.6 received fatal signal 6. Attempting backtrace. I read the release notes of the later 2.2.x minor releases and haven't found any fixes for such leaks described. I hope someone can assist or help dig into the root cause of this memory leak. Here are the memory usage statistics from one of our not very busy production servers. 11GB of memory for a proxy is very strange. root@wifidb3:~# cat /proc/7481/status |
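For reference, the resident memory figure can be pulled out of /proc/<pid>/status programmatically. This is a minimal Python sketch, assuming the standard Linux procfs layout in which the VmRSS line reports a value in kB. The function names are illustrative and not part of MaxScale.

```python
import re


def parse_vm_rss_kib(status_text):
    """Return the VmRSS value (in kB as reported by procfs), or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB\s*$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vm_rss_kib(pid):
    """Read /proc/<pid>/status for a running process and extract VmRSS."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_vm_rss_kib(handle.read())
```

Calling read_vm_rss_kib with the MaxScale process ID returns the same number that the VmRSS line of the cat output shows.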
| Comments |
| Comment by markus makela [ 2018-07-13 ] | |
|
I'd recommend upgrading to MaxScale 2.2.11 even if no explicit fix for your problem is found. By upgrading to the latest GA version, we'll have a known-good starting point and can verify that no known bugs interfere with the problem resolution process. Once we know whether 2.2.11 suffers from the same problem, we can proceed by attempting to create a minimal, verifiable test case. I'm suspecting that the main culprit in this case is the readwritesplit router or at least that's where the stacktrace points to. By the looks of it, the code in question is the new-in-2.2 prepared statement code for binary protocol prepared statements. Checking the following points would be of immense help:
| |
| Comment by Daniel Snow [ 2018-07-13 ] | |
|
1. Yes. http://php.net/manual/en/pdo.prepare.php
> I'm suspecting that the main culprit in this case is the readwritesplit router or at least that's where the stacktrace points to. By the looks of it, the code in question is the new-in-2.2 prepared statement code for binary protocol prepared statements.
I think so too. I noticed a difference in the default STMT handling compared to MaxScale 2.1 and set the readwritesplit parameter strict_multi_stmt back to "true", but this didn't help. I'll update to 2.2.11 and report whether anything changes in memory consumption. | |
| Comment by Daniel Snow [ 2018-07-14 ] | |
|
I've tried 2.2.11; unfortunately the problem still exists. Take a look at the gradual growth of RSS consumption over a short period of time. 1GB per day is too much and too fast. | |
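A growth figure like the one above can be estimated from two or more RSS samples taken over time. This is a minimal Python sketch, assuming samples are (unix_seconds, rss_kib) pairs ordered by time. The function name and the sample format are hypothetical, and it only compares the first and last sample rather than fitting a trend.

```python
SECONDS_PER_DAY = 86400


def rss_growth_kib_per_day(samples):
    """Average RSS growth in KiB per day between the first and last sample.

    samples: list of (unix_seconds, rss_kib) tuples, ordered by time.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, rss0), (t1, rss1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (rss1 - rss0) * SECONDS_PER_DAY / (t1 - t0)
```

A steadily positive result across several sampling windows points to a leak and not to a one-time allocation at startup.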
| Comment by Daniel Snow [ 2018-07-15 ] | |
|
I rolled back to 2.1.17 and there is no memory leak in this version. Memory consumption stays in the 16MB range. | |
| Comment by markus makela [ 2018-07-15 ] | |
|
A small memory leak was found in the code that processes session commands. A fix to this leak is available on the 2.2-markusjm branch on GitHub. | |
| Comment by Daniel Snow [ 2018-07-15 ] | |
|
Thank you, I'll check. Could you also make sure that strict_multi_stmt=true is not broken in 2.2.x? In the logs, when a STMT goes through MaxScale 2.1.x with strict_multi_stmt=true, the logger says: And in MaxScale 2.2.x the logger says: This seems like a bug too. | |
| Comment by markus makela [ 2018-07-16 ] | |
|
strict_multi_stmt controls what happens after a multi-statement query is executed. For example, the following query has two SQL statements in it:
The whole UPDATE t1 SET a = 1; SELECT * FROM t1 WHERE a = 1; query will be sent in one go, without waiting for the first statement to return a result. Due to a few technical limitations and practical complexity, readwritesplit routes all queries that contain more than one statement to the master. With strict_multi_stmt=false (default), normal operation continues after the statement has been executed on the master. With strict_multi_stmt=true, the router assumes that a multi-statement query changes the session state and routes all subsequent queries only to the master. In this mode the router session is effectively locked to the master server.
The routing of COM_STMT_PREPARE changed between 2.1 and 2.2: it used to be routed to the master and is now routed to all servers. This was done to allow load balancing of prepared statements, so the behavior you observed is expected. Can you clarify what behavior you use strict_multi_stmt for and what you'd expect to see with it?
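The routing rule described above can be illustrated with a toy model. This is a rough Python sketch, not MaxScale code: the class, the naive semicolon-based multi-statement check, and the simplification that plain SELECTs go to a slave are all assumptions made for illustration.

```python
class ToyRouterSession:
    """Toy model of the strict_multi_stmt rule; not the real readwritesplit."""

    def __init__(self, strict_multi_stmt=False):
        self.strict_multi_stmt = strict_multi_stmt
        self.locked_to_master = False

    @staticmethod
    def is_multi_statement(query):
        # Naive check: more than one non-empty piece between semicolons.
        # It ignores semicolons inside string literals or comments.
        parts = [part for part in query.split(";") if part.strip()]
        return len(parts) > 1

    def route(self, query):
        if self.locked_to_master:
            return "master"
        if self.is_multi_statement(query):
            # Multi-statement queries always go to the master; in strict
            # mode the session stays locked to the master afterwards.
            if self.strict_multi_stmt:
                self.locked_to_master = True
            return "master"
        # Simplification: single SELECTs are load balanced to a slave.
        if query.lstrip().upper().startswith("SELECT"):
            return "slave"
        return "master"
```

With strict_multi_stmt=False, a SELECT issued after the multi-statement query goes back to a slave in this model. With strict_multi_stmt=True, the same SELECT is routed to the master for the rest of the session.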