[MDEV-10814] Feature request: Optionally exclude large buffers from core dumps Created: 2016-09-15  Updated: 2022-01-05  Due: 2017-05-05  Resolved: 2018-04-24

Status: Closed
Project: MariaDB Server
Component/s: Query Cache, Storage Engine - InnoDB
Fix Version/s: 10.3.7

Type: Task Priority: Major
Reporter: Hartmut Holzgraefe Assignee: Oleksandr Byelkin
Resolution: Fixed Votes: 4
Labels: contribution, foundation, patch

Issue Links:
Problem/Incident
causes MDEV-18946 munmap of 1 byte during shutdown is E... Closed
Relates
relates to MDEV-17159 Document what is included in core dumps Closed
relates to MDEV-16605 Always include buf_madvise_do_dump in... Closed
relates to MDEV-20684 innodb: use madvise CORE/NOCORE on Fr... Closed
relates to MDEV-21741 [Warning] InnoDB: Failed to set memor... Closed
relates to MDEV-22186 Please add innodb_buffer_pool_in_core... Closed

 Description   

As the size of a core dump file by default roughly equals the size of the process, producing core dumps can become an issue on systems with large memory buffers, especially a large innodb_buffer_pool_size.

There needs to be enough file system space to store such large core dumps, and with multi-gigabyte files it also takes a non-trivial amount of time to write them, which delays process restart quite a bit.

And then there's also a security aspect to it: as the core dump contains the complete innodb buffer pool it contains a substantial amount, or even all, of the actual user data in the database.

At the same time actual buffer contents are rarely needed when doing a post mortem analysis (usually we only need stack frames and a few pieces of local data).

So I'm proposing a server option to exclude certain buffers from core dumps by marking them with MADV_DONTDUMP via the madvise() system call.
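
As a rough illustration of the proposal, the following C sketch allocates an anonymous buffer with mmap() and asks the kernel to leave it out of core dumps. The helper names alloc_nodump_buffer and free_nodump_buffer are hypothetical and are not taken from the MariaDB source. The sketch assumes Linux 3.4 or later, where MADV_DONTDUMP is available.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical helper (not from the MariaDB source): allocate an
   anonymous, page-aligned buffer and ask the kernel to exclude it
   from core dumps. Returns NULL if the mapping cannot be created. */
void *alloc_nodump_buffer(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
#ifdef MADV_DONTDUMP
    /* A failure here only means the buffer stays in the core dump,
       so it is reported but not treated as fatal. */
    if (madvise(p, len, MADV_DONTDUMP) != 0)
        perror("madvise(MADV_DONTDUMP)");
#endif
    return p;
}

/* Release a buffer obtained from alloc_nodump_buffer().
   The caller has to pass the same length that was allocated. */
int free_nodump_buffer(void *p, size_t len)
{
    return munmap(p, len);
}
```

The buffer remains fully usable at runtime; the advice is only consulted when the kernel writes a core dump.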



 Comments   
Comment by Hartmut Holzgraefe [ 2016-09-15 ]

I have had a proof of concept for this lying around for quite a while. It is a bit outdated by now, as I last touched it almost a year ago and then got carried away with other things. The GitHub branch for this is

https://github.com/hholzgra/mariadb-server/tree/hartmut-coredump-exclusions

which also contains some usage and implementation documentation:

https://raw.githubusercontent.com/hholzgra/mariadb-server/c7d32f8265183a7f32b8c4a2f59bf39a54aa7c22/Docs/README-core-dump-exclusion

Comment by Daniel Black [ 2017-03-11 ]

Thanks hholzgra. Rebased your work. Good goal.

Comment by Daniel Black [ 2017-03-14 ]

Other candidates for non-dumping:

Thoughts welcome.

This will also need to take account of dynamic innodb buffer pool in 10.2.

Comment by Marko Mäkelä [ 2017-10-18 ]

The InnoDB change buffer consists of persistent pages that reside in the system tablespace. (It may buffer changes to secondary index leaf pages.) Some change buffer pages can reside in the buffer pool, but I guess we’d just want to omit the whole InnoDB buffer pool from the core dump if we want to omit things. It’d be tricky and probably "too little" to omit just the change buffer pages of the buffer pool.

I agree that the InnoDB redo log buffer (log_sys->buf and recv_sys) are rather useless to include in the core dump; I cannot think of a scenario where they could help me debug anything. A crash at crash recovery should normally be repeatable by rerunning recovery on the same files (the state of the files before recovery was attempted). But I guess omitting the log write or read buffers would not save much. By omitting them you could avoid including confidential data. We probably should omit them if we omit the buffer pool from the core dump.

Comment by Hartmut Holzgraefe [ 2018-02-28 ]

Why is this closed? In my proof-of-concept implementation the feature was configurable, to enable full classic core dumps where needed, and it also had support for excluding the MyISAM key buffer. I see neither of these in

https://github.com/MariaDB/server/commit/b600f30786816e33c1706dd36cdabf21034dc781

Comment by Daniel Black [ 2018-03-02 ]

The fully configurable aspects got rejected in the first review (https://github.com/MariaDB/server/pull/333#issuecomment-295460913).

MyISAM key buffer wasn't done as it didn't track the allocated size needed when you munmap it (resize). (and I didn't care enough about MyISAM)
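
To illustrate the size-tracking point, here is a minimal C sketch in which the allocated length travels with the pointer, so that a resize or release can call munmap() with the correct size and re-apply the advice to the new mapping. The struct nodump_buf and its functions are hypothetical names for illustration only; this is not how the MyISAM key cache is implemented.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical wrapper: keeps the mapping length next to the pointer. */
struct nodump_buf {
    void  *ptr;
    size_t len;
};

/* Map len bytes and exclude them from core dumps. Returns 0 or -1. */
int nodump_buf_init(struct nodump_buf *b, size_t len)
{
    assert(b != NULL && len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
#ifdef MADV_DONTDUMP
    (void) madvise(p, len, MADV_DONTDUMP); /* best effort */
#endif
    b->ptr = p;
    b->len = len;
    return 0;
}

/* Resize by creating a new mapping, copying the overlapping part,
   and unmapping the old region using the tracked length. */
int nodump_buf_resize(struct nodump_buf *b, size_t new_len)
{
    struct nodump_buf nb;
    if (nodump_buf_init(&nb, new_len) != 0)
        return -1;
    memcpy(nb.ptr, b->ptr, b->len < new_len ? b->len : new_len);
    munmap(b->ptr, b->len);
    *b = nb;
    return 0;
}

/* Unmap the region and reset the bookkeeping. */
int nodump_buf_free(struct nodump_buf *b)
{
    int rc = munmap(b->ptr, b->len);
    b->ptr = NULL;
    b->len = 0;
    return rc;
}
```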

Query cache DONT_DUMP in progress: https://github.com/MariaDB/server/pull/366

Comment by Marko Mäkelä [ 2018-03-20 ]

I am reassigning to the reviewer of pull request 366 (excluding the query cache from core dumps). As far as I can tell, there are no more InnoDB data structures that could be reasonably omitted from core dumps.

Comment by Oleksandr Byelkin [ 2018-04-17 ]

I checked the QC part and it looks OK, but I do not know how madvise works, and the failed test on GitHub makes me doubt whether to allow these changes.

Comment by Hartmut Holzgraefe [ 2018-04-17 ]

It should not affect mysqld at runtime at all, it is only evaluated when actually writing a core dump:

From the madvise(2) man page:

MADV_DONTDUMP (since Linux 3.4)
Exclude from a core dump those pages in the range specified by
addr and length. This is useful in applications that have
large areas of memory that are known not to be useful in a
core dump. The effect of MADV_DONTDUMP takes precedence over
the bit mask that is set via the /proc/[pid]/coredump_filter
file (see core(5)).

Comment by Daniel Black [ 2018-04-18 ]

Github / Travis-CI tests have been failing for other reasons (MDEV-15838)

Comment by Marko Mäkelä [ 2018-10-12 ]

kpenza reported in the maria-discuss list that this change may cause error messages to be displayed on startup and shutdown:

Sep 25 10:40:53 srv1 mysqld: 2018-09-25 10:40:53 0 [Warning] InnoDB: Failed to set memory to DODUMP: Invalid argument ptr 0x2aaac5400000 size 2097152

Sep 25 10:41:19 srv1 mysqld: 2018-09-25 10:41:19 0 [Warning] InnoDB: Failed to set memory to DODUMP: Invalid argument ptr 0x2aaac3400000 size 33554432

The reason for this turned out to be a Linux kernel bug, for which danblack contributed a fix for the Linux 4.19 kernel:

mm: madvise(MADV_DODUMP): allow hugetlbfs pages

This was also included in the backport queue for older kernels.

The messages in the MariaDB server error log can be ignored, and they should disappear after upgrading the kernel.
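
The non-fatal handling described above can be sketched in C as follows: the madvise(MADV_DODUMP) call is attempted, and a failure is logged as a warning in a format modelled on the messages quoted above, without aborting. The function name set_dodump is hypothetical and the code is only an approximation of the idea, not the InnoDB implementation.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: re-include a region in core dumps.
   Returns 0 on success, -1 if the kernel rejected the request.
   A failure is only logged; the caller can carry on regardless. */
int set_dodump(void *ptr, size_t size)
{
    assert(ptr != NULL);
#ifdef MADV_DODUMP
    if (madvise(ptr, size, MADV_DODUMP) != 0) {
        fprintf(stderr,
                "[Warning] Failed to set memory to DODUMP: %s ptr %p size %zu\n",
                strerror(errno), ptr, size);
        return -1;
    }
#endif
    return 0;
}
```

On an affected kernel, a hugetlbfs-backed region would take the warning branch with EINVAL ("Invalid argument"), which matches the log lines shown earlier.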

Comment by Daniel Black [ 2018-12-13 ]

Notes for anyone else that comes across this:

The kernel fix has been released in 4.19 and stable kernels 4.18.14, 4.14.76, 4.9.133, 4.4.161, and 3.18.124, and will be in 3.16.62. Red Hat has confirmed it will be in their kernels (probably out already).

Without this fix, the impact is that a core dump may not contain all the information.
