[MDEV-24199] MariaDB Server fails to write a core x out of y times Created: 2020-11-12  Updated: 2023-11-11

Status: Stalled
Project: MariaDB Server
Component/s: Debug
Affects Version/s: 10.5, 10.6
Fix Version/s: 10.5

Type: Bug Priority: Major
Reporter: Roel Van de Paar Assignee: Daniel Black
Resolution: Unresolved Votes: 1
Labels: affects-tests

Issue Links:
Relates
relates to MDEV-24217 Add --invoke-on-crash option to mysql... Open
relates to MDEV-25330 fflush(stderr) call improvement in si... Open

 Description   

Loosely defining this bug, as there is no exact reproduction procedure:

  • We take as a baseline any machine which produces cores correctly most of the time, but fails to produce one in x out of y crashes
  • "x out of y": approximately 1 out of every 7 to 15 crashes, depending on the scenario, produces no core. The ratio is higher on high-load machines.
  • I have clearly observed this issue since I started working on MariaDB. It is not new/recent
  • The problem does not exist in MySQL Server nor in Percona Server
  • Marko has clearly observed the issue in his work as well
  • The issue impacts testing and test reporting stability
  • Issue likely present in all versions

Combining all current thoughts (Marko/Roel):

  • For clarity: the issue happens both without MTR (ref Roel's notes below, CLI based) and with MTR (ref Marko's notes below, MTR based)
  • Marko+Roel: any tuning (like ulimit -c unlimited, all other server tuning, and any core dump config in /etc/sysctl.conf) makes zero difference. There may have been a [very] small (if any) improvement from setting a correct core pattern in /etc/sysctl.conf, but the issue remains.
  • Roel: I have seen the issue happen plenty of times when only the CLI and the crashing SQL were present, with no mysqladmin shutdown nor any KILLs. Execute crashing SQL at the CLI prompt, exit the client quickly, check for a core (with mysqld clearly crashed as per the error log) and no core file is present. Try one or more repeats and the core dump will be there
  • Roel: the issue (i.e. no core file generated) seems to happen more often when exiting the CLI quickly after executing some crashing SQL, which seems odd given that core dumps would be mysqld, not mysql, bound. Perhaps some "client hold/lock/trigger/status update" exists and affects core dump writing
  • Roel: core dump writing either works or doesn't, in this way: if the core is generated, it is generated correctly as a whole; if it is not generated, the file simply doesn't exist. No half-written files exist, which seems to somewhat negate my last point above - unless some "client based trigger" needs to reach mysqld before a given timeout/situation (likely, based on what I have seen) - i.e. it is a "status" which mysqld needs from the client rather than a "lock" which requires a constant client connection
  • Roel: IOW, there seems to be some sort of "delay" before a core is written, as described above. Perhaps best described not as a real delay, but as a "trigger", or "client hold/lock/trigger/status update" as described above.
  • Marko: often seen in combination with MTR aborting execution (without presenting any summary) after too many test failures. (But, that ought to be fixed in 10.2 this week.)
  • Marko: there are also 3 SIGKILLs in MTR that I suspect can ruin not only the core dump writes but also rr record runs (by killing the rr process)
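
Since the notes above report that ulimit and sysctl tuning made no measurable difference, it is still worth confirming what the effective settings actually are before chasing a missing core. A minimal diagnostic sketch (standard Linux paths; nothing here is specific to MariaDB):

```shell
#!/bin/sh
# Report the knobs that gate core dump writing on Linux.

# Per-shell soft/hard limits inherited by processes started from here.
echo "ulimit -c soft: $(ulimit -Sc)"
echo "ulimit -c hard: $(ulimit -Hc)"

# Where the kernel sends cores. A leading '|' means a userspace helper
# (apport, systemd-coredump, ...) receives the dump instead of a file.
echo "core_pattern:   $(cat /proc/sys/kernel/core_pattern)"

# For an already-running server, check its own effective limits:
#   grep 'core file' /proc/$(pidof mariadbd)/limits
```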

Numerous attempts at clarifying the issue further have failed.

Current summary (Marko+Roel): good enough info to now log a bug, but not good enough info to find a fix.

This issue is logged in the hope that others have experiences beyond what is already mentioned above, so that this can be fixed.

Further thoughts

  • Perhaps a script which quickly brings up the server, crashes it with SQL at the CLI, exits immediately from the CLI, and then counts the number of cores written and loops may be able to establish a better x-out-of-y ratio, but it may not help with finding the real cause
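
The counting script suggested above can be sketched roughly as follows. The dummy `sh -c 'kill -SEGV $$'` crash is a stand-in for the real start-server / crashing-SQL / exit-CLI cycle, and N is arbitrary:

```shell
#!/bin/sh
# Crash a throwaway process N times and count how many cores the kernel
# actually writes. Replace the dummy crash with the real server start +
# crashing SQL to measure the mysqld ratio; N and paths are illustrative.

N=20
dir=$(mktemp -d)
cd "$dir" || exit 1
ulimit -c unlimited 2>/dev/null || true

got=0
i=0
while [ "$i" -lt "$N" ]; do
  rm -f core core.*
  sh -c 'kill -SEGV $$' 2>/dev/null   # stand-in for the crashing SQL
  if [ -e core ] || ls core.* >/dev/null 2>&1; then
    got=$((got + 1))
  fi
  i=$((i + 1))
done
echo "cores written: $got / $N"
```

Note that if kernel.core_pattern pipes to a helper such as apport, no file appears in the working directory at all, so the count also reveals helper interference.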


 Comments   
Comment by Marko Mäkelä [ 2020-11-12 ]

For the record, the problematic SIGKILLs are identified by this patch. But, applying this patch will cause many tests to fail. I have not investigated the causes. My main motivation for using this patch has been to be able to get complete rr record traces for something. Apparently mtr is killing not only the mysqld or mariadbd process, but also the rr process, in some of these cases.

diff --git a/client/mysqltest.cc b/client/mysqltest.cc
index 417d3615995..48b8f132eb2 100644
--- a/client/mysqltest.cc
+++ b/client/mysqltest.cc
@@ -5141,7 +5141,7 @@ void do_shutdown_server(struct st_command *command)
     if (timeout)
       (void) my_kill(pid, SIGABRT);
     /* Give server a few seconds to die in all cases */
-    if (!timeout || wait_until_dead(pid, timeout < 5 ? 5 : timeout))
+    if (!timeout || wait_until_dead(pid, timeout < 60 ? 60 : timeout))
     {
       (void) my_kill(pid, SIGKILL);
     }
diff --git a/mysql-test/lib/My/SafeProcess/safe_process.cc b/mysql-test/lib/My/SafeProcess/safe_process.cc
index 4d0d1e2a3a0..abc167a4300 100644
--- a/mysql-test/lib/My/SafeProcess/safe_process.cc
+++ b/mysql-test/lib/My/SafeProcess/safe_process.cc
@@ -144,7 +144,7 @@ static int kill_child(bool was_killed)
   message("Killing child: %d", child_pid);
   // Terminate whole process group
   if (! was_killed)
-    kill(-child_pid, SIGKILL);
+    kill(-child_pid, SIGABRT);
 
   pid_t ret_pid= waitpid(child_pid, &status, 0);
   if (ret_pid == child_pid)
diff --git a/mysql-test/lib/v1/mtr_process.pl b/mysql-test/lib/v1/mtr_process.pl
index fd9f3817699..ee9a370c467 100644
--- a/mysql-test/lib/v1/mtr_process.pl
+++ b/mysql-test/lib/v1/mtr_process.pl
@@ -456,8 +456,8 @@ sub mtr_kill_leftovers () {
         my $retries= 10;                    # 10 seconds
         do
         {
-          mtr_debug("Sending SIGKILL to pids: " . join(' ', @pids));
-          kill(9, @pids);
+          mtr_debug("Sending SIGABRT to pids: " . join(' ', @pids));
+          kill(6, @pids);
           mtr_report("Sleep 1 second waiting for processes to die");
           sleep(1)                      # Wait one second
         } while ( $retries-- and  kill(0, @pids) );

The first hunk may be unnecessary. That code has been present since 10.3, and it will make the shutdown_server command send SIGABRT on the initial timeout, and SIGKILL 5 seconds after that. I did not check whether that 5-second timeout is actually insufficient. I think that we should add a timeout mechanism to each of these SIGKILLs, preceding them with SIGABRT.
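
The SIGABRT-before-SIGKILL escalation described above can be sketched as a standalone shell helper (the 60-second default mirrors the patched `wait_until_dead` timeout; the function name is mine):

```shell
#!/bin/sh
# Ask the process to abort (which can still produce a core), and only
# fall back to SIGKILL once a grace period has expired.
kill_with_grace() {
  pid=$1
  grace=${2:-60}    # seconds; mirrors the patched wait_until_dead timeout
  kill -ABRT "$pid" 2>/dev/null
  while [ "$grace" -gt 0 ] && kill -0 "$pid" 2>/dev/null; do
    sleep 1
    grace=$((grace - 1))
  done
  # Still alive after the grace period: no core is coming anyway.
  if kill -0 "$pid" 2>/dev/null; then
    kill -KILL "$pid"
  fi
}

# Example: a sleeping process terminates promptly on SIGABRT.
sleep 300 &
kill_with_grace $! 5
wait "$pid" 2>/dev/null || true
```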

Comment by Roel Van de Paar [ 2020-11-12 ]

With thanks to wlad, --skip-stack-trace (which negates the default-on --stack-trace) will skip writing a stacktrace to stderr. It may (or may not) provide a workaround to the issue. To be tested.

Comment by Roel Van de Paar [ 2021-04-02 ]

A heavily overloaded server (load average 500+) may cause the core not to be written.
Also, the latest few iterations of 10.6 seem to be worse at writing a core consistently. The issue is repeatable about 50/50 on a high-load machine.

Comment by Roel Van de Paar [ 2021-04-02 ]

In both cases (a core file is written, or a core file is not written), the output in the error log is the same:

Writing a core file...
Working directory at /test/MD020421-mariadb-10.6.0-linux-x86_64-dbg/data
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            unlimited            unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             unlimited            unlimited            processes
Max open files            1048576              1048576              files
Max locked memory         unlimited            unlimited            bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       unlimited            unlimited            signals
Max msgqueue size         unlimited            unlimited            bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Core pattern: |/usr/share/apport/apport %p %s %c %d %P %E

Yet one run vs another will have the core or not (currently on a machine with a high load average (> 500)), and plenty of disk space is available.
Also observed: some testcases seem more or less prone to the issue.
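
The `Core pattern: |/usr/share/apport/apport …` line above means the kernel pipes every dump into apport rather than writing a file itself, so a helper that stalls or gives up under a 500+ load average would produce exactly this symptom. A quick way to inspect the current disposition, with a possible (root-required) way to take the helper out of the equation for a test run — the pattern value is just an example:

```shell
#!/bin/sh
# Show whether a userspace helper currently intercepts core dumps.
pattern=$(cat /proc/sys/kernel/core_pattern)
case "$pattern" in
  \|*) echo "cores are piped to a helper: $pattern" ;;
  *)   echo "cores are written to files:  $pattern" ;;
esac

# To write plain files instead (run as root; %e=exe, %p=pid, %t=time):
#   sysctl -w kernel.core_pattern='core.%e.%p.%t'
```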

Comment by Alexey Bychko (Inactive) [ 2021-04-02 ]

I think a 500+ LA is not normal. The higher the LA, the higher the timeout needed. The timeout is not a configurable parameter, so we only have the compiled-in value, and it can't be infinite.

Comment by Roel Van de Paar [ 2021-04-02 ]

It has thus far not been shown that there actually is such a hardcoded timeout, though it seems there is one. The one by Marko above relates to MTR only.

Comment by Roel Van de Paar [ 2021-04-14 ]

With thanks to danblack, added this to the build script for temporary testing (i.e. disable in-code SIG handling):

  # Temporary patch (Ref discussion DB/RV on 14-04-2021) to see if this fixes the core dumping issues (ref MDEV-24199) 
  sed -i -e 's:\(sigaction(SIG[SABIF]\)://\1:' sql/mysqld.cc
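
For reference, the sed expression comments out each `sigaction(SIG…)` registration in sql/mysqld.cc, so the kernel's default disposition (dump core immediately) applies instead of the in-server signal handler. Its effect on a representative source line:

```shell
#!/bin/sh
# Demonstrate the substitution on one sample sigaction registration.
echo '  sigaction(SIGSEGV, &sa, NULL);' |
  sed -e 's:\(sigaction(SIG[SABIF]\)://\1:'
# prints:   //sigaction(SIGSEGV, &sa, NULL);
```

The `[SABIF]` class catches SIGSEGV, SIGABRT, SIGBUS, SIGILL, SIGINT and SIGFPE registrations, while leaving other sigaction calls untouched.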

Comment by Roel Van de Paar [ 2021-04-17 ]

Implementing this change has, at first look, at least not made things worse, and has possibly improved things.
However, it has caused a little havoc in text filtering strings. Fixed in new_text_string code.
Hopefully not too bad, though some uniqueIDs may have to be refreshed. ramesh fyi as discussed.

Comment by Roel Van de Paar [ 2021-04-27 ]

UniqueID's have proven stable post-patch.
Just saw the issue on 10.6.1 commit 9db14e93acc4ec9023d50686c66dbef7d4d8c15c once (using the testcase from MDEV-25534 against the CLI), which had the build mod described above in place. Enough diskspace was available.

Comment by Roel Van de Paar [ 2021-05-04 ]

Lowered prio now as the situation seems better, if not 99% OK, with the post-patch approach. Not fully resolved, as at least once a missing core was still caught. Also made framework improvements to cater for either patched or non-patched builds. It would be great if, in time, MD could be compared with MS/PS to see if their core dump solutions could be ported, or if ours could be adjusted to match theirs.

Comment by Roel Van de Paar [ 2023-11-08 ]

Though we are patching (using Daniel's patch) for testing, which resolves the issue for testing, the underlying bug still exists. danblack Do you have further thoughts on how the code in sql/mysqld.cc could be improved to ensure cores are written correctly more often (i.e. for users / without applying this patch)?

Comment by Marko Mäkelä [ 2023-11-09 ]

I think that something similar to the second hunk of danblack’s patch was implemented in this commit in August 2022.

Comment by Roel Van de Paar [ 2023-11-11 ]

Thank you marko! Any final thoughts danblack?

Generated at Thu Feb 08 09:28:11 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.