[MDEV-14460] Memory leak with only SELECT statements Created: 2017-11-21 Updated: 2018-04-19 Resolved: 2018-04-12 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - RocksDB |
| Affects Version/s: | 10.2.9 |
| Fix Version/s: | 10.2.15 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Julien Muchembled | Assignee: | Sergei Petrunia |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Debian 8. 16 GB RAM. 8 CPU (Xeon E5-1620 v2). |
||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Description |
|
The same IO-bound scenario was run on 2 identical machines, one with RocksDB and the other with TokuDB. In a Zope application (ERP5), using NEO/MariaDB as storage, we filled the disks with a few data streams. The memory leak happened when reading a single whole data stream. The result is that ERP5/NEO does 2 queries in loop. Here is an example from general log:
More about NEO's architecture is described at https://www.nexedi.com/blog/NXD-Document.Blog.Optimising.MariaDB.Big.Data The data stream is 1.5TB uncompressed, taking about 650MB on-disk. RocksDB got OOM-killed when 1TB was read. TokuDB respected its memory constraints until the end. Each machine has 2 disks of 2TB, each one with a MariaDB DB that holds half of the Zope DB. IOW, we have 2 RocksDB databases of about 2TB. 8GB of RAM for the 2 mysqld (rocksdb_block_cache_size = 2G). |
| Comments |
| Comment by Michael Widenius [ 2017-12-05 ] | |||||||||
|
Can you please make a full test case that we can reproduce? Alternatively is to link MariaDB with tcmalloc and run the test on your machines. This would give us https://mariadb.com/kb/en/library/debugging-a-running-server-on-linux/ | |||||||||
| Comment by Julien Muchembled [ 2017-12-07 ] | |||||||||
|
If the pattern (when filling or reading tables) matters, writing a test case can be a lot of work. For the moment, I prefer the other way. We use jemalloc. Something must be missing on their wiki because I was unable to get any profiling data. So I switched to tcmalloc. [edit: removed, too many mistakes from me] | |||||||||
| Comment by Julien Muchembled [ 2017-12-07 ] | |||||||||
|
Reuploaded. Apparently, I must call pprof with a single .heap file. | |||||||||
| Comment by Julien Muchembled [ 2017-12-07 ] | |||||||||
|
I don't know if it matters but I've just realized that all select are done in a single transaction. | |||||||||
| Comment by Sergei Petrunia [ 2017-12-07 ] | |||||||||
|
Looking at positions where memory is leaked:
So, most of the leaks are outside ha_tokudb.so, it seems?
| |||||||||
| Comment by Julien Muchembled [ 2017-12-07 ] | |||||||||
|
Sorry, for the confusion. I edited my first reply to remove parts that aren't pertinent anymore. Can you please check the svg I reuploaded ? All addresses are resolved. | |||||||||
| Comment by Sergei Petrunia [ 2017-12-11 ] | |||||||||
|
(jmuchemb, thanks for the re-uploaded file. It helped) Discussed this with Herman Lee. Discussion results:
will create a new savepoint for each select. A transaction doing lots of select will end up using a lot of memory. This can be fixed, for example, by not creating a savepoint when the statement is a SELECT statement. We can see this in ha_rocksdb::external_lock() call. That call is done for each table, for statements that do both reads and writes (e.g. INSERT ... SELECT), the first call is the external_lock(F_WRLCK) for the table that we're going to write to. Note: if we haven't allocated a savepoint at statement start, we should not try to remove it at statement end. | |||||||||
| Comment by Sergei Petrunia [ 2018-03-30 ] | |||||||||
|
Started attempt to implement it and encountered this: https://github.com/facebook/mysql-5.6/issues/802 (not related to memory consumption as such but shows that both us and upstream have RollbackToSavepoint() call in the wrong place). | |||||||||
| Comment by Sergei Petrunia [ 2018-04-05 ] | |||||||||
|
The patch for issue #802 here: https://github.com/spetrunia/mysql-5.6/commit/612ec75814f30076f53d294cd0267533c807022f is also a fix for this MDEV. That tree is based on facebook/mysql-5.6, but the patch should apply to MariaDB as well. Next steps: I'll discuss this the upstream and then cherry-pick into MariaDB (so that we get it in MariaDB sooner than the merge from upstream happens) | |||||||||
| Comment by Sergei Petrunia [ 2018-04-12 ] | |||||||||
|
Applied the patch for upstream issue #802. It fixes this issue as well. |